deepflow 0.1.78 → 0.1.80
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +14 -3
- package/bin/install.js +3 -2
- package/package.json +4 -1
- package/src/commands/df/auto-cycle.md +33 -19
- package/src/commands/df/execute.md +166 -473
- package/src/commands/df/plan.md +113 -163
- package/src/commands/df/verify.md +433 -3
- package/src/skills/browse-fetch/SKILL.md +258 -0
- package/src/skills/browse-verify/SKILL.md +264 -0
- package/templates/config-template.yaml +14 -0
- package/src/skills/context-hub/SKILL.md +0 -87
@@ -8,93 +8,44 @@ You are a coordinator. Spawn agents, run ratchet checks, update PLAN.md. Never i
 
 **ONLY:** Read PLAN.md, read specs/doing-*.md, spawn background agents, run ratchet health checks after each agent completes, update PLAN.md, write `.deepflow/decisions.md` in the main tree
 
-
+## Core Loop (Notification-Driven)
 
-
-
+Each task = one background agent. Completion notifications drive the loop.
+
+**NEVER use TaskOutput** — returns full transcripts (100KB+) that explode context.
 
-## Usage
 ```
-
-
-
-
-
+1. Spawn ALL wave agents with run_in_background=true in ONE message
+2. STOP. End your turn. Do NOT poll or monitor.
+3. On EACH notification:
+   a. Run ratchet check (section 5.5)
+   b. Passed → TaskUpdate(status: "completed"), update PLAN.md [x] + commit hash
+   c. Failed → git revert HEAD --no-edit, TaskUpdate(status: "pending")
+   d. Report ONE line: "✓ T1: ratchet passed (abc123)" or "✗ T1: ratchet failed, reverted"
+   e. NOT all done → end turn, wait | ALL done → next wave or finish
+4. Between waves: check context %. If ≥50% → checkpoint and exit.
+5. Repeat until: all done, all blocked, or context ≥50%.
 ```
 
-##
-- Skill: `atomic-commits` — Clean commit protocol
-- Skill: `context-hub` — Fetch external API docs before coding (when task involves external libraries)
+## Context Threshold
 
-
-| Agent | subagent_type | Purpose |
-|-------|---------------|---------|
-| Implementation | `general-purpose` | Task implementation |
-| Debugger | `reasoner` | Debugging failures |
-
-**Model routing from frontmatter:**
-The model for each agent is determined by the `model:` field in the command/agent/skill frontmatter being invoked. The orchestrator reads the relevant frontmatter to determine which model to pass to `Task()`. If no `model:` field is present in the frontmatter, default to `sonnet`.
-
-## Context-Aware Execution
-
-Statusline writes to `.deepflow/context.json`: `{"percentage": 45}`
+Statusline writes `.deepflow/context.json`: `{"percentage": 45}`
 
 | Context % | Action |
 |-----------|--------|
 | < 50% | Full parallelism (up to 5 agents) |
 | ≥ 50% | Wait for running agents, checkpoint, exit |
 
-
-
-Each task = one background agent. Use agent completion notifications as the feedback loop.
-
-**NEVER use TaskOutput** — returns full agent transcripts (100KB+) that explode context.
-
-### Notification-Driven Execution
-
-```
-1. Spawn ALL wave agents with run_in_background=true in ONE message
-2. STOP. End your turn. Do NOT run Bash monitors or poll for results.
-3. Wait for "Agent X completed" notifications (they arrive automatically)
-4. On EACH notification:
-   a. Run ratchet check (health checks on the worktree)
-   b. Report: "✓ T1: ratchet passed (abc123)" or "✗ T1: ratchet failed, reverted"
-   c. Update PLAN.md for that task
-   d. Check: all wave agents done?
-      - No → end turn, wait for next notification
-      - Yes → proceed to next wave or write final summary
-```
-
-After spawning, your turn ENDS. Per notification: run ratchet, output ONE line, update PLAN.md. Write full summary only after ALL wave agents complete.
-
-## Checkpoint & Resume
-
-**File:** `.deepflow/checkpoint.json` — stored in WORKTREE directory, not main.
-
-**Schema:**
-```json
-{
-  "completed_tasks": ["T1", "T2"],
-  "current_wave": 2,
-  "worktree_path": ".deepflow/worktrees/upload",
-  "worktree_branch": "df/upload"
-}
-```
-
-**On checkpoint:** Complete wave → update PLAN.md → save to worktree → exit.
-**Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks.
+---
 
 ## Behavior
 
 ### 1. CHECK CHECKPOINT
 
 ```
---continue → Load checkpoint
-→
-
-→ If missing: Error "Worktree deleted. Use --fresh"
-→ If exists: Use it, skip worktree creation
-→ Resume execution with completed tasks
+--continue → Load .deepflow/checkpoint.json from worktree
+  → Verify worktree exists on disk (else error: "Use --fresh")
+  → Skip completed tasks, resume execution
 --fresh → Delete checkpoint, start fresh
 checkpoint exists → Prompt: "Resume? (y/n)"
 else → Start fresh
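Note (illustrative, not part of the published package): the context gate above keys off `.deepflow/context.json`. A minimal POSIX shell sketch of that check, assuming the file contains the single numeric `percentage` field shown in the diff:

```shell
# Simulate the statusline output, then apply the >=50% gate.
mkdir -p .deepflow
printf '{"percentage": 45}\n' > .deepflow/context.json  # as the statusline would write

# Crude extraction: the file holds exactly one numeric field.
PCT=$(tr -dc '0-9' < .deepflow/context.json)
if [ "${PCT:-0}" -ge 50 ]; then
  echo "checkpoint and exit"
else
  echo "full parallelism"
fi
```

With the sample value 45 this takes the "full parallelism" branch; a real orchestrator would use a JSON parser rather than `tr`.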
@@ -102,88 +53,29 @@ else → Start fresh
 
 ### 1.5. CREATE WORKTREE
 
-
-
-
-# Check main is clean (ignore untracked)
-git diff --quiet HEAD || Error: "Main has uncommitted changes. Commit or stash first."
-
-# Generate paths
-SPEC_NAME=$(basename spec/doing-*.md .md | sed 's/doing-//')
-BRANCH_NAME="df/${SPEC_NAME}"
-WORKTREE_PATH=".deepflow/worktrees/${SPEC_NAME}"
-
-# Create worktree (or reuse existing)
-if [ -d "${WORKTREE_PATH}" ]; then
-  echo "Reusing existing worktree"
-else
-  git worktree add -b "${BRANCH_NAME}" "${WORKTREE_PATH}"
-fi
-```
-
-**Existing worktree:** Reuse it (same spec = same worktree).
-
-**--fresh flag:** Deletes existing worktree and creates new one.
+Require clean HEAD (`git diff --quiet`). Derive SPEC_NAME from `specs/doing-*.md`.
+Create worktree: `.deepflow/worktrees/{spec}` on branch `df/{spec}`.
+Reuse if exists. `--fresh` deletes first.
 
 ### 1.6. RATCHET SNAPSHOT
 
-
+Snapshot pre-existing test files in worktree — only these count for ratchet (agent-created tests excluded):
 
 ```bash
 cd ${WORKTREE_PATH}
-
-# Snapshot pre-existing test files (only these count for ratchet)
 git ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' \
   > .deepflow/auto-snapshot.txt
-
-echo "Ratchet snapshot: $(wc -l < .deepflow/auto-snapshot.txt) pre-existing test files"
 ```
 
-**Only pre-existing test files are used for ratchet evaluation.** New test files created by agents during implementation don't influence the pass/fail decision. This prevents agents from gaming the ratchet by writing tests that pass trivially.
-
 ### 1.7. NO-TESTS BOOTSTRAP
 
-
-
-```bash
-TEST_COUNT=$(wc -l < .deepflow/auto-snapshot.txt | tr -d ' ')
-
-if [ "${TEST_COUNT}" = "0" ]; then
-  echo "Bootstrap needed: no pre-existing test files found."
-  BOOTSTRAP_NEEDED=true
-else
-  BOOTSTRAP_NEEDED=false
-fi
-```
-
-**If `BOOTSTRAP_NEEDED=true`:**
-
-1. **Inject a bootstrap task** as the FIRST action before any regular PLAN.md task is executed:
-   - Bootstrap task description: "Write tests for files in edit_scope"
-   - Read `edit_scope` from `specs/doing-*.md` to know which files need tests
-   - Spawn ONE dedicated bootstrap agent using the Bootstrap Task prompt (section 6)
+If snapshot has zero test files:
 
-
-
-
-   - The bootstrap agent's ONLY job is writing tests — no implementation changes
+1. Spawn ONE bootstrap agent (section 6 Bootstrap Task) to write tests for `edit_scope` files
+2. On ratchet pass: re-snapshot, report `"bootstrap: completed"`, end cycle (no PLAN.md tasks this cycle)
+3. On ratchet fail: revert, halt with "Bootstrap failed — manual intervention required"
 
-
-   - Run ratchet health checks (build must pass; test suite must not error out)
-   - If ratchet passes: re-take the ratchet snapshot so subsequent tasks use the new tests as baseline:
-     ```bash
-     cd ${WORKTREE_PATH}
-     git ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' \
-       > .deepflow/auto-snapshot.txt
-     echo "Post-bootstrap snapshot: $(wc -l < .deepflow/auto-snapshot.txt) test files"
-     ```
-   - If ratchet fails: revert bootstrap commit, log error, halt and report "Bootstrap failed — manual intervention required"
-
-4. **Signal to caller:** After bootstrap completes successfully, report `"bootstrap: completed"` in the cycle summary. This cycle's sole output is the test bootstrap — no regular PLAN.md task is executed this cycle.
-
-5. **Subsequent cycles:** The updated `.deepflow/auto-snapshot.txt` now contains the bootstrapped test files. All subsequent ratchet checks use these as the baseline.
-
-**If `BOOTSTRAP_NEEDED=false`:** Proceed normally to section 2.
+Subsequent cycles use bootstrapped tests as ratchet baseline.
 
 ### 2. LOAD PLAN
 
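Note (illustrative, with made-up file paths): the ratchet-snapshot grep retained in section 1.6 above can be sanity-checked offline; only test-file paths should match the pattern:

```shell
# Feed hypothetical paths through the snapshot pattern and count survivors.
MATCHED=$(printf '%s\n' \
  'src/upload.ts' \
  'src/upload.test.ts' \
  'tests/helpers.py' \
  'src/util_test.go' |
  grep -cE '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/')
# src/upload.ts is the only non-match in this sample.
echo "$MATCHED test files"
```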
@@ -194,7 +86,7 @@ If missing: "No PLAN.md found. Run /df:plan first."
 
 ### 2.5. REGISTER NATIVE TASKS
 
-For each `[ ]` task in PLAN.md: `TaskCreate(subject: "{task_id}: {description}", activeForm: "{gerund}", description: full block)`. Store task_id → native ID mapping.
+For each `[ ]` task in PLAN.md: `TaskCreate(subject: "{task_id}: {description}", activeForm: "{gerund}", description: full block)`. Store task_id → native ID mapping. Set dependencies via `TaskUpdate(addBlockedBy: [...])`. On `--continue`: only register remaining `[ ]` items.
 
 ### 3. CHECK FOR UNPLANNED SPECS
 
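Note (illustrative, with invented metrics): the probe-selection rule in section 5.7 below, disqualify on regressions, then rank by coverage delta, files changed, and completion order, is a plain filter-and-sort:

```shell
# Hypothetical probe records: probe_id,regressions,coverage_delta,files_changed,completion_order
WINNER=$(printf '%s\n' \
  'SPIKE_A,0,3,4,2' \
  'SPIKE_B,0,5,7,1' \
  'SPIKE_C,1,9,2,3' |
  awk -F, '$2 == 0' |              # hard gate: any regression disqualifies
  sort -t, -k3,3nr -k4,4n -k5,5n | # coverage desc, files asc, first-done asc
  head -n 1 | cut -d, -f1)
echo "winner: $WINNER"
```

In this sample SPIKE_C is disqualified by its regression and SPIKE_B beats SPIKE_A on coverage delta; the real orchestrator presumably derives these numbers from the ratchet run rather than a CSV.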
@@ -202,237 +94,84 @@ Warn if `specs/*.md` (excluding doing-/done-) exist. Non-blocking.
 
 ### 4. IDENTIFY READY TASKS
 
-
-
-```
-Ready = TaskList results where:
-- status: "pending"
-- blockedBy: empty (auto-unblocked by native dependency system)
-```
+Ready = TaskList where status: "pending" AND blockedBy: empty.
 
 ### 5. SPAWN AGENTS
 
 Context ≥50%: checkpoint and exit.
 
-
-
-
-```
-This activates the UI spinner showing the task's activeForm (e.g. "Creating upload endpoint").
+Before spawning: `TaskUpdate(taskId: native_id, status: "in_progress")` — activates UI spinner.
+
+**NEVER use `isolation: "worktree"` on Task calls.** Deepflow manages a shared worktree so wave 2 sees wave 1 commits.
 
-**
+**Spawn ALL ready tasks in ONE message** — EXCEPT file conflicts (see below).
 
-**
+**File conflict enforcement (1 file = 1 writer):**
+Before spawning, check `Files:` lists of all ready tasks. If two+ ready tasks share a file:
+1. Sort conflicting tasks by task number (T1 < T2 < T3)
+2. Spawn only the lowest-numbered task from each conflict group
+3. Remaining tasks stay `pending` — they become ready once the spawned task completes
+4. Log: `"⏳ T{N} deferred — file conflict with T{M} on (unknown)"`
 
-
+**≥2 [SPIKE] tasks for same problem:** Follow Parallel Spike Probes (section 5.7).
 
 ### 5.5. RATCHET CHECK
 
-After each agent completes
+After each agent completes, run health checks in the worktree.
 
-**
+**Auto-detect commands:**
 
 | File | Build | Test | Typecheck | Lint |
 |------|-------|------|-----------|------|
-| `package.json` | `npm run build`
-| `pyproject.toml` | — | `pytest` | `mypy .`
-| `Cargo.toml` | `cargo build` | `cargo test` | — | `cargo clippy`
+| `package.json` | `npm run build` | `npm test` | `npx tsc --noEmit` | `npm run lint` |
+| `pyproject.toml` | — | `pytest` | `mypy .` | `ruff check .` |
+| `Cargo.toml` | `cargo build` | `cargo test` | — | `cargo clippy` |
 | `go.mod` | `go build ./...` | `go test ./...` | — | `go vet ./...` |
 
-
-```bash
-cd ${WORKTREE_PATH}
+Run Build → Test → Typecheck → Lint (stop on first failure).
 
-
-# Build → Test → Typecheck → Lint (stop on first failure)
-```
+**Edit scope validation** (if spec declares `edit_scope`): check `git diff HEAD~1 --name-only` against allowed globs. Violations → `git revert HEAD --no-edit`, report "Edit scope violation: {files}".
 
-**
-
-
-CHANGED=$(git diff HEAD~1 --name-only)
-
-# Load edit_scope from spec (files/globs)
-EDIT_SCOPE=$(grep 'edit_scope:' specs/doing-*.md | sed 's/edit_scope://' | tr ',' '\n' | xargs)
-
-# Check each changed file against allowed scope
-for file in ${CHANGED}; do
-  ALLOWED=false
-  for pattern in ${EDIT_SCOPE}; do
-    # Match file against glob pattern
-    [[ "${file}" == ${pattern} ]] && ALLOWED=true
-  done
-  ${ALLOWED} || VIOLATIONS+=("${file}")
-done
-```
-
-- Violations found → revert: `git revert HEAD --no-edit`, report "✗ Edit scope violation: {files}"
-- No violations → continue to health checks
+**Impact completeness check** (if task has Impact block in PLAN.md):
+Compare `git diff HEAD~1 --name-only` against Impact callers/duplicates list.
+File listed but not modified → **advisory warning**: "Impact gap: {file} listed as {caller|duplicate} but not modified — verify manually". Not auto-revert (callers sometimes don't need changes), but flags the risk.
 
-**
-- All checks pass AND no scope violations → task succeeds, commit stands
-- Any check fails → regression detected → revert: `git revert HEAD --no-edit`
+**Evaluate:** All pass + no violations → commit stands. Any failure → `git revert HEAD --no-edit`.
 
-
-
-**For spike tasks:** Same ratchet. If the spike's code passes pre-existing health checks, the spike passes. No LLM judges another LLM's work.
+Ratchet uses ONLY pre-existing test files from `.deepflow/auto-snapshot.txt`.
 
 ### 5.7. PARALLEL SPIKE PROBES
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-echo "Created probe worktree: ${PROBE_PATH} (branch: ${PROBE_BRANCH})"
-```
-
-#### Step 3: Spawn all probes in parallel
-
-Mark every spike task as `in_progress`, then spawn one agent per probe **in a single message** using the Spike Task prompt (section 6), with the probe's worktree path as its working directory.
-
-```
-TaskUpdate(taskId: native_id_SPIKE_A, status: "in_progress")
-TaskUpdate(taskId: native_id_SPIKE_B, status: "in_progress")
-[spawn agent for SPIKE_A → PROBE_PATH_A]
-[spawn agent for SPIKE_B → PROBE_PATH_B]
-... (all in ONE message)
-```
-
-End your turn. Do NOT poll or monitor. Wait for completion notifications.
-
-#### Step 4: Ratchet each probe (on completion notifications)
-
-When a probe agent's notification arrives, run the standard ratchet (section 5.5) against its dedicated probe worktree:
-
-```bash
-cd ${PROBE_PATH}
-
-# Identical health-check commands as standard tasks
-# Build → Test → Typecheck → Lint (stop on first failure)
-```
-
-Record per-probe metrics:
-
-```yaml
-probe_id: SPIKE_A
-worktree: .deepflow/worktrees/{spec}/probe-SPIKE_A
-branch: df/{spec}/probe-SPIKE_A
-ratchet_passed: true/false
-regressions: 0  # failing pre-existing tests
-coverage_delta: +3  # new lines covered (positive = better)
-files_changed: 4  # number of files touched
-commit: abc1234
-```
-
-Wait until **all** probe notifications have arrived before proceeding to selection.
-
-#### Step 5: Machine-select winner
-
-No LLM evaluates another LLM's work. Apply the following ordered criteria to all probes that **passed** the ratchet:
-
-```
-1. Fewer regressions (lower is better — hard gate: any regression disqualifies)
-2. Better coverage (higher delta is better)
-3. Fewer files changed (lower is better — smaller blast radius)
-
-Tie-break: first probe to complete (chronological)
-```
-
-If **no** probe passes the ratchet, all are failed probes. Log insights (step 7) and reset the spike tasks to `pending` for retry with debugger guidance.
-
-#### Step 6: Preserve ALL probe worktrees
-
-Do NOT delete losing probe worktrees. They are preserved for manual inspection and cross-cycle learning:
-
-```bash
-# Winning probe: leave as-is, will be used as implementation base (step 8)
-# Losing probes: leave worktrees intact, mark branches with -failed suffix for clarity
-git branch -m "df/{spec}/probe-SPIKE_B" "df/{spec}/probe-SPIKE_B-failed"
-```
-
-Record all probe paths in `.deepflow/checkpoint.json` under `"spike_probes"` so future `--continue` runs know they exist.
-
-#### Step 7: Log failed probe insights
-
-For every probe that failed the ratchet (or lost selection), write two entries to `.deepflow/auto-memory.yaml` in the **main** tree.
-
-**Entry 1 — `spike_insights` (detailed probe record):**
-
-```yaml
-spike_insights:
-  - date: "YYYY-MM-DD"
-    spec: "{spec_name}"
-    spike_id: "SPIKE_B"
-    hypothesis: "{hypothesis text from PLAN.md}"
-    outcome: "failed"  # or "passed-but-lost"
-    failure_reason: "{first failed check and error summary}"
-    ratchet_metrics:
-      regressions: 2
-      coverage_delta: -1
-      files_changed: 7
-    worktree: ".deepflow/worktrees/{spec}/probe-SPIKE_B-failed"
-    branch: "df/{spec}/probe-SPIKE_B-failed"
-    edge_cases: []  # orchestrator may populate after manual review
-```
-
-**Entry 2 — `probe_learnings` (cross-cycle memory, read by `/df:auto-cycle` on each cycle start):**
-
-```yaml
-probe_learnings:
-  - spike: "SPIKE_B"
-    probe: "{probe branch suffix, e.g. probe-SPIKE_B}"
-    insight: "{one-sentence summary of what the probe revealed, derived from failure_reason}"
-```
-
-If the file does not exist, create it. Initialize both `spike_insights:` and `probe_learnings:` as empty lists before appending. Preserve all existing keys when merging.
-
-#### Step 8: Promote winning probe
-
-Cherry-pick the winner's commit into the shared spec worktree so downstream implementation tasks see the winning approach:
-
-```bash
-cd ${WORKTREE_PATH}  # shared worktree (not the probe sub-worktree)
-git cherry-pick ${WINNER_COMMIT}
-```
-
-Then mark the winning spike task as `completed` and auto-unblock its dependents:
-
-```
-TaskUpdate(taskId: native_id_SPIKE_WINNER, status: "completed")
-TaskUpdate(taskId: native_id_SPIKE_LOSERS, status: "pending")  # keep visible for audit
-```
-
-Update PLAN.md:
-- Winning spike → `[x]` with commit hash and `[PROBE_WINNER]` tag
-- Losing spikes → `[~]` (skipped) with `[PROBE_FAILED: see auto-memory.yaml]` note
-
-Resume the standard execution loop (section 9) — implementation tasks blocked by the spike group are now unblocked.
+Trigger: ≥2 [SPIKE] tasks with same "Blocked by:" target or identical hypothesis.
+
+1. **Baseline:** Record `BASELINE=$(git rev-parse HEAD)` in shared worktree
+2. **Sub-worktrees:** Per spike: `git worktree add -b df/{spec}--probe-{SPIKE_ID} .deepflow/worktrees/{spec}/probe-{SPIKE_ID} ${BASELINE}`
+3. **Spawn:** All probes in ONE message, each targeting its probe worktree. End turn.
+4. **Ratchet:** Per notification, run standard ratchet (5.5) in probe worktree. Record: ratchet_passed, regressions, coverage_delta, files_changed, commit
+5. **Select winner** (after ALL complete, no LLM judge):
+   - Disqualify any with regressions
+   - Rank: fewer regressions > higher coverage_delta > fewer files_changed > first to complete
+   - No passes → reset all to pending for retry with debugger
+6. **Preserve all worktrees.** Losers: rename branch + `-failed` suffix. Record in checkpoint.json under `"spike_probes"`
+7. **Log failed probes** to `.deepflow/auto-memory.yaml` (main tree):
+   ```yaml
+   spike_insights:
+     - date: "YYYY-MM-DD"
+       spec: "{spec_name}"
+       spike_id: "SPIKE_B"
+       hypothesis: "{from PLAN.md}"
+       outcome: "failed"  # or "passed-but-lost"
+       failure_reason: "{first failed check + error summary}"
+       ratchet_metrics: {regressions: N, coverage_delta: N, files_changed: N}
+       worktree: ".deepflow/worktrees/{spec}/probe-SPIKE_B-failed"
+       branch: "df/{spec}--probe-SPIKE_B-failed"
+   probe_learnings:  # read by /df:auto-cycle each start
+     - spike: "SPIKE_B"
+       probe: "probe-SPIKE_B"
+       insight: "{one-sentence summary from failure_reason}"
+   ```
+   Create file if missing. Preserve existing keys when merging.
+8. **Promote winner:** Cherry-pick into shared worktree. Winner → `[x] [PROBE_WINNER]`, losers → `[~] [PROBE_FAILED]`. Resume standard loop.
 
 ---
 
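Note (illustrative, hypothetical globs and file names): 0.1.80 compresses the edit-scope check to prose, but the underlying logic is shell glob matching over `git diff HEAD~1 --name-only` output, roughly:

```shell
# Hypothetical values; deepflow reads these from the spec and from git diff.
EDIT_SCOPE='src/upload/* tests/*'
CHANGED='src/upload/handler.ts tests/upload.test.ts docs/README.md'

VIOLATIONS=''
for file in $CHANGED; do
  allowed=no
  for pattern in $EDIT_SCOPE; do
    # `case` performs shell pattern matching, so globs like src/upload/* work as-is
    case "$file" in $pattern) allowed=yes ;; esac
  done
  [ "$allowed" = yes ] || VIOLATIONS="$VIOLATIONS $file"
done
echo "violations:$VIOLATIONS"
```

Here only `docs/README.md` falls outside the declared scope, which is the case the orchestrator answers with `git revert HEAD --no-edit`.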
@@ -444,143 +183,127 @@ Working directory: {worktree_absolute_path}
|
|
|
444
183
|
All file operations MUST use this absolute path as base. Do NOT write files to the main project directory.
|
|
445
184
|
Commit format: {commit_type}({spec}): {description}
|
|
446
185
|
|
|
447
|
-
STOP after committing. Do NOT merge branches, rename spec files, remove worktrees, or run git checkout on main.
|
|
186
|
+
STOP after committing. Do NOT merge branches, rename spec files, remove worktrees, or run git checkout on main.
|
|
448
187
|
```
|
|
449
188
|
|
|
450
|
-
**Standard Task
|
|
189
|
+
**Standard Task:**
|
|
451
190
|
```
|
|
452
191
|
{task_id}: {description from PLAN.md}
|
|
453
|
-
Files: {target files}
|
|
454
|
-
|
|
192
|
+
Files: {target files} Spec: {spec_name}
|
|
193
|
+
{Impact block from PLAN.md — include verbatim if present}
|
|
194
|
+
|
|
195
|
+
{Prior failure context — include ONLY if task was previously reverted. Read from .deepflow/auto-memory.yaml revert_history for this task_id:}
|
|
196
|
+
Previous attempts (DO NOT repeat these approaches):
|
|
197
|
+
- Cycle {N}: reverted — "{reason from revert_history}"
|
|
198
|
+
- Cycle {N}: reverted — "{reason from revert_history}"
|
|
199
|
+
{Omit this entire block if task has no revert history.}
|
|
200
|
+
|
|
201
|
+
CRITICAL: If Impact lists duplicates or callers, you MUST verify each one is consistent with your changes.
|
|
202
|
+
- [active] duplicates → consolidate into single source of truth (e.g., local generateYAML → use shared buildConfigData)
|
|
203
|
+
- [dead] duplicates → DELETE the dead code entirely. Dead code pollutes context and causes drift.
|
|
455
204
|
|
|
456
205
|
Steps:
|
|
457
|
-
1.
|
|
458
|
-
|
|
459
|
-
|
|
460
|
-
|
|
461
|
-
3. Commit as feat({spec}): {description}
|
|
206
|
+
1. External APIs/SDKs → chub search "<library>" --json → chub get <id> --lang <lang> (skip if chub unavailable or internal code only)
|
|
207
|
+
2. Read ALL files in Impact before implementing — understand the full picture
|
|
208
|
+
3. Implement the task, updating all impacted files
|
|
209
|
+
4. Commit as feat({spec}): {description}
|
|
462
210
|
|
|
463
|
-
Your ONLY job is to write code and commit.
|
|
211
|
+
Your ONLY job is to write code and commit. Orchestrator runs health checks after.
|
|
464
212
|
```
|
|
465
213
|
|
|
466
|
-
**Bootstrap Task
|
|
214
|
+
**Bootstrap Task:**
|
|
467
215
|
```
|
|
468
216
|
BOOTSTRAP: Write tests for files in edit_scope
|
|
469
|
-
Files: {edit_scope files
|
|
470
|
-
Spec: {spec_name}
|
|
217
|
+
Files: {edit_scope files} Spec: {spec_name}
|
|
471
218
|
|
|
472
|
-
|
|
473
|
-
|
|
474
|
-
2. Do NOT change implementation files — tests only
|
|
475
|
-
3. Commit as test({spec}): bootstrap tests for edit_scope
|
|
476
|
-
|
|
477
|
-
Your ONLY job is to write tests and commit. The orchestrator will run health checks after you finish.
|
|
219
|
+
Write tests covering listed files. Do NOT change implementation files.
|
|
220
|
+
Commit as test({spec}): bootstrap tests for edit_scope
|
|
478
221
|
```
|
|
479
222
|
|
|
480
|
-
**Spike Task
|
|
223
|
+
**Spike Task:**
|
|
481
224
|
```
|
|
482
225
|
{task_id} [SPIKE]: {hypothesis}
|
|
483
|
-
Files: {target files}
|
|
484
|
-
Spec: {spec_name}
|
|
226
|
+
Files: {target files} Spec: {spec_name}
|
|
485
227
|
|
|
486
|
-
|
|
487
|
-
|
|
488
|
-
|
|
228
|
+
{Prior failure context — include ONLY if this spike was previously reverted. Read from .deepflow/auto-memory.yaml revert_history + spike_insights for this task_id:}
|
|
229
|
+
Previous attempts (DO NOT repeat these approaches):
|
|
230
|
+
- Cycle {N}: reverted — "{reason}"
|
|
231
|
+
{Omit this entire block if no revert history.}
|
|
489
232
|
|
|
490
|
-
|
|
233
|
+
Implement minimal spike to validate hypothesis.
|
|
234
|
+
Commit as spike({spec}): {description}
|
|
491
235
|
```
|
|
492
236
|
|
|
493
|
-
###
|
|
237
|
+
### 8. COMPLETE SPECS
|
|
494
238
|
|
|
495
|
-
When
|
|
239
|
+
When all tasks done for a `doing-*` spec:
|
|
240
|
+
1. Run `/df:verify doing-{name}` via the Skill tool (`skill: "df:verify", args: "doing-{name}"`)
|
|
241
|
+
- Verify runs quality gates (L0-L4), merges worktree branch to main, cleans up worktree, renames spec `doing-*` → `done-*`, and extracts decisions
|
|
242
|
+
- If verify fails (adds fix tasks): stop here — `/df:execute --continue` will pick up the fix tasks
|
|
243
|
+
- If verify passes: proceed to step 2
|
|
244
|
+
2. Remove spec's ENTIRE section from PLAN.md (header, tasks, summaries, fix tasks, separators)
|
|
245
|
+
3. Recalculate Summary table at top of PLAN.md
|
|
496
246
|
|
|
497
|
-
|
|
247
|
+
---
|
|
498
248
|
|
|
499
|
-
|
|
249
|
+
## Usage
|
|
500
250
|
|
|
501
|
-
|
|
251
|
+
```
|
|
252
|
+
/df:execute # Execute all ready tasks
|
|
253
|
+
/df:execute T1 T2 # Specific tasks only
|
|
254
|
+
/df:execute --continue # Resume from checkpoint
|
|
255
|
+
/df:execute --fresh # Ignore checkpoint
|
|
256
|
+
/df:execute --dry-run # Show plan only
|
|
257
|
+
```
|
|
502
258
|
|
|
503
|
-
|
|
259
|
+
## Skills & Agents
|
|
504
260
|
|
|
505
|
-
|
|
506
|
-
|
|
507
|
-
2. Rename: `doing-upload.md` → `done-upload.md`
|
|
508
|
-
3. Extract decisions from done-* spec: Read the `done-{name}.md` file. Model-extract architectural decisions — look for explicit choices (→ `[APPROACH]`), unvalidated assumptions (→ `[ASSUMPTION]`), and "for now" decisions (→ `[PROVISIONAL]`). Append as a new section to **main tree** `.deepflow/decisions.md`:
|
|
509
|
-
```
|
|
510
|
-
### {YYYY-MM-DD} — {spec-name}
|
|
511
|
-
- [TAG] decision text — rationale
|
|
512
|
-
```
|
|
513
|
-
After successful append, delete `specs/done-{name}.md`. If write fails, preserve the file.
|
|
514
|
-
4. Remove the spec's ENTIRE section from PLAN.md:
|
|
515
|
-
- The `### doing-{spec}` header
|
|
516
|
-
- All task entries (`- [x] **T{n}**: ...` and their sub-items)
|
|
517
|
-
- Any `## Execution Summary` block for that spec
|
|
518
|
-
- Any `### Fix Tasks` sub-section for that spec
|
|
519
|
-
- Separators (`---`) between removed sections
|
|
520
|
-
5. Recalculate the Summary table at the top of PLAN.md (update counts for completed/pending)
|
|
+- Skill: `atomic-commits` — Clean commit protocol
+- Skill: `browse-fetch` — Fetch live web pages and external API docs via browser before coding
 
-
+| Agent | subagent_type | Purpose |
+|-------|---------------|---------|
+| Implementation | `general-purpose` | Task implementation |
+| Debugger | `reasoner` | Debugging failures |
 
-
+**Model routing:** Use `model:` from command/agent/skill frontmatter. Default: `sonnet`.
 
-**
-
-
-
-
-
-6. If NOT all wave agents done → end turn, wait
-7. If ALL wave agents done → use TaskList to find newly unblocked tasks, check context, spawn next wave or finish
+**Checkpoint schema:** `.deepflow/checkpoint.json` in worktree:
+```json
+{"completed_tasks": ["T1","T2"], "current_wave": 2, "worktree_path": ".deepflow/worktrees/upload", "worktree_branch": "df/upload"}
+```
+
+---
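A `--continue` resume step might read that checkpoint like this (a sketch under the schema above; the `load_checkpoint` helper and its fall-back-to-fresh-start behavior are assumptions):

```python
import json
from pathlib import Path
from typing import Optional

REQUIRED = ("completed_tasks", "current_wave", "worktree_path", "worktree_branch")

def load_checkpoint(path: str = ".deepflow/checkpoint.json") -> Optional[dict]:
    """Return the parsed checkpoint, or None when absent or malformed."""
    p = Path(path)
    if not p.exists():
        return None  # no checkpoint: fresh start
    try:
        data = json.loads(p.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None  # unreadable checkpoint: treat as fresh start
    if not all(key in data for key in REQUIRED):
        return None  # missing fields: do not trust a partial checkpoint
    return data
```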
 
-
+## Failure Handling
 
-
+When task fails ratchet and is reverted:
+- `TaskUpdate(taskId: native_id, status: "pending")` — dependents remain blocked
+- Repeated failure → spawn `Task(subagent_type="reasoner", prompt="Debug failure: {ratchet output}")`
+- Leave worktree intact, keep checkpoint.json
+- Output: worktree path/branch, `cd {path}` to investigate, `--continue` to resume, `--fresh` to discard
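The revert-and-reset path in those bullets can be sketched as a small helper (hypothetical; the injected `task_update` and `runner` callables stand in for the TaskUpdate tool and `subprocess.run`):

```python
import subprocess
from typing import Callable

def handle_ratchet_failure(
    task_id: str,
    cwd: str,
    task_update: Callable[[str, str], None],
    runner=subprocess.run,
) -> str:
    """Revert the failing commit, reset the task to pending, report one line."""
    # Undo the agent's commit; the worktree and checkpoint.json stay intact
    # so the failure can be investigated later.
    runner(["git", "revert", "HEAD", "--no-edit"], cwd=cwd, check=True)
    # Dependents remain blocked because the task is pending, not completed.
    task_update(task_id, "pending")
    return f"✗ {task_id}: ratchet failed, reverted"
```

Injecting the runner keeps the git call out of unit tests while preserving the exact `git revert HEAD --no-edit` invocation.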
 
 ## Rules
 
 | Rule | Detail |
 |------|--------|
-| Zero test files → bootstrap first |
+| Zero test files → bootstrap first | Bootstrap is cycle's sole task when snapshot empty |
 | 1 task = 1 agent = 1 commit | `atomic-commits` skill |
 | 1 file = 1 writer | Sequential if conflict |
 | Agent writes code, orchestrator measures | Ratchet is the judge |
 | No LLM evaluates LLM work | Health checks only |
-| ≥2 spikes
-| All probe worktrees preserved |
-| Machine-selected winner |
-|
-| Winner cherry-picked to shared worktree | Downstream tasks see winning approach via shared worktree |
-| External APIs → chub first | Agents fetch curated docs before implementing external API calls; skip if chub unavailable |
+| ≥2 spikes same problem → parallel probes | Never run competing spikes sequentially |
+| All probe worktrees preserved | Losers renamed `-failed`; never deleted |
+| Machine-selected winner | Regressions > coverage > files changed; no LLM judge |
+| External APIs → chub first | Skip if unavailable |
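The machine-selection rule (fewest regressions, then highest coverage, then fewest files changed) amounts to a lexicographic sort key. A sketch, with the probe metric field names assumed:

```python
def pick_winner(probes: list[dict]) -> dict:
    """Select the winning probe mechanically; no LLM judge.

    Order of precedence: fewest regressions, then highest coverage,
    then fewest files changed.
    """
    return min(
        probes,
        key=lambda p: (p["regressions"], -p["coverage"], p["files_changed"]),
    )
```

Negating coverage inside the tuple lets a single `min` express "lowest, highest, lowest" without a custom comparator.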
 
 ## Example
 
-### No-Tests Bootstrap
-
-```
-/df:execute (context: 8%)
-
-Loading PLAN.md... T1 ready, T2/T3 blocked by T1
-Ratchet snapshot: 0 pre-existing test files
-Bootstrap needed: no pre-existing test files found.
-
-Spawning bootstrap agent for edit_scope...
-[Bootstrap agent completed]
-Running ratchet: build ✓ | tests ✓ (12 new tests pass)
-✓ Bootstrap: ratchet passed (boo1234)
-Re-taking ratchet snapshot: 3 test files
-
-bootstrap: completed — cycle's sole task was test bootstrap
-Next: Run /df:auto-cycle again to execute T1
-```
-
-### Standard Execution
-
 ```
 /df:execute (context: 12%)
 
 Loading PLAN.md... T1 ready, T2/T3 blocked by T1
 Ratchet snapshot: 24 pre-existing test files
-Registering native tasks: TaskCreate T1/T2/T3, TaskUpdate(T2 blockedBy T1), TaskUpdate(T3 blockedBy T1)
 
 Wave 1: TaskUpdate(T1, in_progress)
 [Agent "T1" completed]
@@ -589,43 +312,13 @@ Wave 1: TaskUpdate(T1, in_progress)
 TaskUpdate(T1, completed) → auto-unblocks T2, T3
 
 Wave 2: TaskUpdate(T2/T3, in_progress)
-[Agent "T2" completed]
-
-
-
-
-✓
-
-
-
-Next: Run /df:verify to verify specs and merge to main
-```
-
-### Ratchet Failure (Regression Detected)
-
-```
-/df:execute (context: 10%)
-
-Wave 1: TaskUpdate(T1, in_progress)
-[Agent "T1" completed]
-Running ratchet: build ✓ | tests ✗ (2 failed of 24)
-✗ T1: ratchet failed, reverted
-TaskUpdate(T1, pending)
-
-Spawning debugger for T1...
-[Debugger completed]
-Re-running T1 with fix guidance...
-
-[Agent "T1 retry" completed]
-Running ratchet: build ✓ | tests ✓ (24 passed) | typecheck ✓
-✓ T1: ratchet passed (abc1234)
-```
-
-### With Checkpoint
-
-```
-Wave 1 complete (context: 52%)
-Checkpoint saved.
-
-Next: Run /df:execute --continue to resume execution
+[Agent "T2" completed] ✓ T2: ratchet passed (def5678)
+[Agent "T3" completed] ✓ T3: ratchet passed (ghi9012)
+
+Context: 35% — All tasks done for doing-upload.
+Running /df:verify doing-upload...
+✓ L0 | ✓ L1 (3/3 files) | ⚠ L2 (no coverage tool) | ✓ L4 (24 tests)
+✓ Merged df/upload to main
+✓ Spec complete: doing-upload → done-upload
+Complete: 3/3
 ```