deepflow 0.1.86 → 0.1.88
- package/bin/install.js +32 -2
- package/hooks/df-spec-lint.js +78 -4
- package/hooks/df-statusline.js +77 -5
- package/hooks/df-tool-usage-spike.js +41 -0
- package/hooks/df-tool-usage.js +86 -0
- package/package.json +1 -1
- package/src/commands/df/auto-cycle.md +75 -558
- package/src/commands/df/auto.md +9 -48
- package/src/commands/df/consolidate.md +14 -38
- package/src/commands/df/debate.md +27 -156
- package/src/commands/df/discover.md +43 -149
- package/src/commands/df/execute.md +148 -585
- package/src/commands/df/note.md +37 -176
- package/src/commands/df/plan.md +80 -210
- package/src/commands/df/report.md +27 -184
- package/src/commands/df/resume.md +18 -101
- package/src/commands/df/spec.md +49 -145
- package/src/commands/df/update.md +3 -1
- package/src/commands/df/verify.md +59 -606
- package/src/skills/browse-fetch/SKILL.md +32 -257
- package/src/skills/browse-verify/SKILL.md +40 -174
- package/src/skills/code-completeness/SKILL.md +2 -9
- package/src/skills/gap-discovery/SKILL.md +19 -86
- package/templates/spec-template.md +12 -1
|
@@ -5,627 +5,144 @@ description: Execute one task from PLAN.md with ratchet health checks and state
|
|
|
5
5
|
|
|
6
6
|
# /df:auto-cycle — Single Cycle of Auto Mode
|
|
7
7
|
|
|
8
|
-
|
|
9
|
-
Execute one task from PLAN.md. Designed to be called by `/loop 1m /df:auto-cycle` — each invocation gets fresh context.
|
|
8
|
+
Execute one task from PLAN.md. Called by `/loop 1m /df:auto-cycle` — each invocation gets fresh context.
|
|
10
9
|
|
|
11
10
|
**NEVER:** use EnterPlanMode, use ExitPlanMode
|
|
12
11
|
|
|
13
|
-
---
|
|
14
|
-
|
|
15
|
-
## Usage
|
|
16
|
-
```
|
|
17
|
-
/df:auto-cycle # Pick next undone task and execute it (or verify if all done)
|
|
18
|
-
```
|
|
19
|
-
|
|
20
12
|
## Behavior
|
|
21
13
|
|
|
22
14
|
### 1. LOAD STATE
|
|
23
15
|
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
Load: .deepflow/auto-memory.yaml (optional — cross-cycle state, ignore if missing)
|
|
28
|
-
```
|
|
16
|
+
Shell injection (use output directly):
|
|
17
|
+
- `` !`cat PLAN.md 2>/dev/null || echo 'NOT_FOUND'` `` — required, error if missing
|
|
18
|
+
- `` !`cat .deepflow/auto-memory.yaml 2>/dev/null || echo 'NOT_FOUND'` `` — optional cross-cycle state
|
|
29
19
|
|
|
30
|
-
|
|
31
|
-
- `` !`cat PLAN.md 2>/dev/null || echo 'NOT_FOUND'` ``
|
|
32
|
-
- `` !`cat .deepflow/auto-memory.yaml 2>/dev/null || echo 'NOT_FOUND'` ``
|
|
33
|
-
|
|
34
|
-
**auto-memory.yaml full schema:**
|
|
35
|
-
|
|
36
|
-
```yaml
|
|
37
|
-
task_results:
|
|
38
|
-
T1: { status: success, commit: abc1234, cycle: 3 }
|
|
39
|
-
T2: { status: reverted, reason: "tests failed: 2 of 24", cycle: 4 }
|
|
40
|
-
revert_history:
|
|
41
|
-
- { task: T2, cycle: 4, reason: "tests failed" }
|
|
42
|
-
- { task: T2, cycle: 5, reason: "build error" }
|
|
43
|
-
consecutive_reverts: # written by circuit breaker (step 3.5)
|
|
44
|
-
T1: 0
|
|
45
|
-
T2: 2
|
|
46
|
-
probe_learnings:
|
|
47
|
-
- { spike: T1, probe: "streaming", insight: "discovered hidden dependency on fs.watch" }
|
|
48
|
-
optimize_state: # present only when an optimize task is active or was completed
|
|
49
|
-
task_id: "T{n}"
|
|
50
|
-
metric_command: "{shell command}"
|
|
51
|
-
target: {number}
|
|
52
|
-
direction: "higher|lower"
|
|
53
|
-
baseline: null # float; set on first measure
|
|
54
|
-
current_best: null # best metric value seen
|
|
55
|
-
best_commit: null # short commit hash of best value
|
|
56
|
-
cycles_run: 0
|
|
57
|
-
cycles_without_improvement: 0
|
|
58
|
-
consecutive_reverts: 0 # optimize-specific revert counter (separate from global)
|
|
59
|
-
probe_scale: 0 # 0=no probes yet, 2/4/6
|
|
60
|
-
max_cycles: {number}
|
|
61
|
-
history: [] # [{cycle, value, delta_pct, kept: bool, commit}]
|
|
62
|
-
failed_hypotheses: [] # ["{description}"] — written to experiments/ on completion
|
|
63
|
-
```
|
|
64
|
-
|
|
65
|
-
Each section is optional. Missing keys are treated as empty. The file is created on first write if absent.
|
|
20
|
+
**auto-memory.yaml schema:** see `/df:execute`. Each section optional, missing keys = empty. Created on first write if absent.
|
|
66
21
|
|
|
67
22
|
### 2. PICK NEXT TASK
|
|
68
23
|
|
|
69
|
-
**Optimize-active override:**
|
|
24
|
+
**Optimize-active override:** Check `optimize_state.task_id` in auto-memory.yaml first. If present and task still `[ ]` in PLAN.md, resume it (skip normal scan). If task is `[x]`, clear `optimize_state.task_id` and fall through.
|
|
70
25
|
|
|
71
|
-
|
|
72
|
-
If optimize_state.task_id exists in auto-memory.yaml:
|
|
73
|
-
→ Look up that task_id in PLAN.md
|
|
74
|
-
→ If the task is still [ ] → select it (override normal scan)
|
|
75
|
-
→ If the task is [x] → clear optimize_state.task_id and fall through to normal scan
|
|
76
|
-
```
|
|
26
|
+
**Normal scan:** First `[ ]` task in PLAN.md where all `Blocked by:` deps are `[x]`.
|
|
77
27
|
|
|
78
|
-
|
|
28
|
+
**No `[ ]` tasks:** Skip to step 5 (completion check).
|
|
79
29
|
|
|
80
|
-
|
|
81
|
-
For each [ ] task in PLAN.md (top to bottom):
|
|
82
|
-
→ Parse "Blocked by:" line (if present)
|
|
83
|
-
→ Check each listed dependency in PLAN.md
|
|
84
|
-
→ If ALL listed blockers are [x] (or no blockers) → this task is READY
|
|
85
|
-
→ Select first READY task
|
|
86
|
-
```
|
|
87
|
-
|
|
88
|
-
**No tasks remaining (`[ ]` not found):** → skip to step 5 (completion check).
|
|
89
|
-
|
|
90
|
-
**All remaining tasks blocked:** → Error with blocker info:
|
|
91
|
-
```
|
|
92
|
-
Error: All remaining tasks are blocked.
|
|
93
|
-
[ ] T3 — blocked by: T2 (incomplete)
|
|
94
|
-
[ ] T4 — blocked by: T2 (incomplete)
|
|
95
|
-
|
|
96
|
-
Run /df:execute to investigate or resolve blockers manually.
|
|
97
|
-
```
|
|
30
|
+
**All remaining blocked:** Error with blocker details, suggest `/df:execute` for manual resolution.
|
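The selection rules added in step 2 (optimize override, normal scan, nothing pending, everything blocked) can be sketched as one function. This is a minimal illustration, not deepflow's implementation: the `Task` record, `AllBlockedError`, and `pick_next_task` are hypothetical names, and the sketch assumes PLAN.md has already been parsed, since the file's line format is not shown in this diff.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    # Hypothetical parsed form of one PLAN.md entry (format assumed, not from the source).
    id: str
    done: bool                                   # True for [x], False for [ ]
    blocked_by: List[str] = field(default_factory=list)


class AllBlockedError(Exception):
    """Raised when pending tasks exist but none of them is ready."""


def pick_next_task(tasks: List[Task], memory: dict) -> Optional[Task]:
    """Return the task to run this cycle, or None when no [ ] task remains."""
    by_id = {t.id: t for t in tasks}

    # Optimize-active override: resume the active optimize task if still pending.
    state = memory.get("optimize_state") or {}
    active = by_id.get(state.get("task_id"))
    if active is not None:
        if not active.done:
            return active
        state.pop("task_id", None)               # task is [x]: clear and fall through

    pending = [t for t in tasks if not t.done]
    if not pending:
        return None                              # caller skips to step 5

    # Normal scan: first pending task whose blockers are all [x].
    for t in pending:
        if all(b in by_id and by_id[b].done for b in t.blocked_by):
            return t

    details = "; ".join(f"{t.id} blocked by {', '.join(t.blocked_by)}" for t in pending)
    raise AllBlockedError(f"All remaining tasks are blocked: {details}")
```

A blocker that does not appear in the plan is treated as unmet here; the command text does not say how that case is handled, so that choice is an assumption.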
|
98
31
|
|
|
99
32
|
### 3. EXECUTE
|
|
100
33
|
|
|
101
|
-
Run
|
|
102
|
-
|
|
103
|
-
```
|
|
104
|
-
Skill: "df:execute"
|
|
105
|
-
Args: "{task_id}" (e.g., "T3")
|
|
106
|
-
```
|
|
107
|
-
|
|
108
|
-
This handles worktree creation, agent spawning, ratchet health checks, and commit.
|
|
34
|
+
Run via Skill tool: `skill: "df:execute", args: "{task_id}"`. Handles worktree, agent spawning, ratchet, commit.
|
|
109
35
|
|
|
110
|
-
**Bootstrap handling:**
|
|
111
|
-
|
|
112
|
-
- Do NOT
|
|
113
|
-
-
|
|
114
|
-
- Exit normally — the NEXT cycle will pick up the first regular task (now protected by the bootstrapped tests)
|
|
115
|
-
- Do NOT attempt to execute a regular task in the same cycle as a bootstrap
|
|
36
|
+
**Bootstrap handling:** If execute returns `"bootstrap: completed"` (zero pre-existing tests, baseline written):
|
|
37
|
+
- Record as task `BOOTSTRAP`, status `passed`
|
|
38
|
+
- Do NOT run a regular task in the same cycle
|
|
39
|
+
- Next cycle picks up the first regular task
|
|
116
40
|
|
|
117
41
|
### 3.5. WRITE STATE
|
|
118
42
|
|
|
119
|
-
After
|
|
120
|
-
|
|
121
|
-
**On success (ratchet passed — non-optimize task):**
|
|
43
|
+
After execute returns, update `.deepflow/auto-memory.yaml` (read-merge-write, preserve all keys):
|
|
122
44
|
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
task_results:
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
**On revert (ratchet failed — non-optimize task):**
|
|
130
|
-
|
|
131
|
-
```yaml
|
|
132
|
-
# Set task_results[task_id] = reverted entry
|
|
133
|
-
task_results:
|
|
134
|
-
{task_id}: { status: reverted, reason: "{ratchet failure summary}", cycle: {cycle_number} }
|
|
135
|
-
|
|
136
|
-
# Append to revert_history
|
|
137
|
-
revert_history:
|
|
138
|
-
- { task: {task_id}, cycle: {cycle_number}, reason: "{ratchet failure summary}" }
|
|
139
|
-
```
|
|
140
|
-
|
|
141
|
-
**On optimize cycle result** (task has `Optimize:` block — execute.md section 5.9 handles the inner cycle; auto-cycle only updates the outer state here):
|
|
142
|
-
|
|
143
|
-
After each optimize cycle reported by `/df:execute`:
|
|
144
|
-
|
|
145
|
-
```yaml
|
|
146
|
-
# Merge updated optimize_state written by execute into auto-memory.yaml
|
|
147
|
-
# execute already persists optimize_state after each cycle (5.9.5) — confirm it was written
|
|
148
|
-
# Increment cycles_run tracked at auto-cycle level for report summary
|
|
149
|
-
optimize_state:
|
|
150
|
-
cycles_run: {N} # echoed from execute's optimize_state
|
|
151
|
-
current_best: {value}
|
|
152
|
-
history: [...] # full history from execute's optimize_state
|
|
153
|
-
```
|
|
154
|
-
|
|
155
|
-
Read the current file first (create if missing), merge the new values, and write back. Preserve all existing keys.
|
|
45
|
+
| Outcome | Write |
|
|
46
|
+
|---------|-------|
|
|
47
|
+
| Success (non-optimize) | `task_results[id]: {status: success, commit: {hash}, cycle: {N}}` |
|
|
48
|
+
| Revert (non-optimize) | `task_results[id]: {status: reverted, reason: "{msg}", cycle: {N}}` + append to `revert_history` |
|
|
49
|
+
| Optimize cycle | Merge updated `optimize_state` from execute (confirm `cycles_run`, `current_best`, `history`) |
|
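The read-merge-write rule for the two non-optimize rows of the table above can be sketched on plain dictionaries. The function name `record_result` and its parameters are hypothetical; loading and saving the YAML file is left out because the diff does not show how deepflow does it.

```python
import copy
from typing import Optional


def record_result(memory: dict, task_id: str, cycle: int,
                  commit: Optional[str] = None,
                  reason: Optional[str] = None) -> dict:
    """Return a copy of `memory` with one non-optimize outcome merged in.

    Pass `commit` for a success or `reason` for a revert. Every other key
    already present in `memory` is preserved.
    """
    if (commit is None) == (reason is None):
        raise ValueError("pass exactly one of commit or reason")

    merged = copy.deepcopy(memory)               # do not mutate the caller's copy
    results = merged.get("task_results") or {}   # missing or null key counts as empty
    merged["task_results"] = results

    if commit is not None:
        results[task_id] = {"status": "success", "commit": commit, "cycle": cycle}
    else:
        results[task_id] = {"status": "reverted", "reason": reason, "cycle": cycle}
        history = merged.get("revert_history") or []
        history.append({"task": task_id, "cycle": cycle, "reason": reason})
        merged["revert_history"] = history
    return merged
```

Returning a new dictionary rather than editing in place keeps the previously loaded state available for comparison if the write fails; that is a design choice of this sketch only.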
|
156
50
|
|
|
157
51
|
### 3.6. CIRCUIT BREAKER
|
|
158
52
|
|
|
159
|
-
|
|
160
|
-
|
|
161
|
-
**What counts as a failure (increments counter):**
|
|
162
|
-
|
|
163
|
-
```
|
|
164
|
-
- L0 ✗ (build failed)
|
|
165
|
-
- L1 ✗ (files missing)
|
|
166
|
-
- L2 ✗ (coverage dropped)
|
|
167
|
-
- L4 ✗ (tests failed)
|
|
168
|
-
- L5 ✗ (browser assertions failed — both attempts)
|
|
169
|
-
- L5 ✗ (flaky) (browser assertions failed on both attempts, different assertions)
|
|
170
|
-
|
|
171
|
-
What does NOT count as a failure:
|
|
172
|
-
- L5 — (no frontend): skipped, not a revert trigger
|
|
173
|
-
- L5 ⚠ (passed on retry): treated as pass, resets counter
|
|
174
|
-
```
|
|
175
|
-
|
|
176
|
-
**On revert (ratchet failed — any of L0 ✗, L1 ✗, L2 ✗, L4 ✗, L5 ✗, or L5 ✗ flaky — non-optimize task):**
|
|
177
|
-
|
|
178
|
-
```
|
|
179
|
-
1. Read .deepflow/auto-memory.yaml (create if missing)
|
|
180
|
-
2. Increment consecutive_reverts[task_id] by 1
|
|
181
|
-
3. Write updated value back to .deepflow/auto-memory.yaml
|
|
182
|
-
4. Read circuit_breaker_threshold from .deepflow/config.yaml (default: 3 if key absent)
|
|
183
|
-
5. If consecutive_reverts[task_id] >= threshold:
|
|
184
|
-
→ Do NOT start /loop again
|
|
185
|
-
→ Report: "Circuit breaker tripped: T{n} failed {N} consecutive times. Reason: {last ratchet failure}"
|
|
186
|
-
→ Halt (exit without scheduling next cycle)
|
|
187
|
-
Else:
|
|
188
|
-
→ Continue to step 4 (UPDATE REPORT) as normal
|
|
189
|
-
```
|
|
190
|
-
|
|
191
|
-
**On success (ratchet passed — including L5 — no frontend or L5 ⚠ pass-on-retry — non-optimize task):**
|
|
192
|
-
|
|
193
|
-
```
|
|
194
|
-
1. Reset consecutive_reverts[task_id] to 0 in .deepflow/auto-memory.yaml
|
|
195
|
-
```
|
|
196
|
-
|
|
197
|
-
**Optimize stop conditions** (task has `Optimize:` block — checked after every optimize cycle result from execute):
|
|
198
|
-
|
|
199
|
-
Execute (section 5.9.3) handles the inner-cycle circuit breaker inside the optimize loop. At the auto-cycle level, watch for these terminal outcomes reported by `/df:execute`:
|
|
200
|
-
|
|
201
|
-
```
|
|
202
|
-
1. "target reached: {value}"
|
|
203
|
-
→ Mark task [x] (execute already did this — confirm)
|
|
204
|
-
→ Write optimize completion (step 3.7)
|
|
205
|
-
→ Report: "Optimize complete: target reached — {value} (target: {target})"
|
|
206
|
-
→ Continue to step 4
|
|
207
|
-
|
|
208
|
-
2. "max cycles reached, best: {current_best}"
|
|
209
|
-
→ Mark task [x] (execute already did this — confirm)
|
|
210
|
-
→ Write optimize completion (step 3.7)
|
|
211
|
-
→ Report: "Optimize complete: max cycles reached — best: {current_best} (target: {target})"
|
|
212
|
-
→ Continue to step 4
|
|
213
|
-
|
|
214
|
-
3. "circuit breaker: 3 consecutive reverts"
|
|
215
|
-
→ Task stays [ ] — do NOT mark [x]
|
|
216
|
-
→ Write optimize failure to experiments/ (step 3.7)
|
|
217
|
-
→ Clear optimize_state.task_id (task stays [ ] for manual intervention)
|
|
218
|
-
→ Report: "Circuit breaker tripped (optimize): T{n} halted after 3 consecutive reverts. Resolve manually."
|
|
219
|
-
→ Halt (exit without scheduling next cycle)
|
|
220
|
-
```
|
|
53
|
+
**Failure = any L0-L5 verification failure** (build, files, coverage, tests, browser assertions). Does NOT count: L5 skip (no frontend), L5 pass-on-retry.
|
|
221
54
|
|
|
222
|
-
**
|
|
55
|
+
**On revert (non-optimize):**
|
|
56
|
+
1. Increment `consecutive_reverts[task_id]` in auto-memory.yaml
|
|
57
|
+
2. Read `circuit_breaker_threshold` from `.deepflow/config.yaml` (default: 3)
|
|
58
|
+
3. If `consecutive_reverts[task_id] >= threshold`: halt loop, report "Circuit breaker tripped: T{n} failed {N} times. Reason: {msg}"
|
|
59
|
+
4. Else: continue to step 4
|
|
223
60
|
|
|
224
|
-
|
|
225
|
-
consecutive_reverts:
|
|
226
|
-
T1: 0
|
|
227
|
-
T3: 2
|
|
228
|
-
```
|
|
61
|
+
**On success (non-optimize):** Reset `consecutive_reverts[task_id]` to 0.
|
|
229
62
|
|
|
230
|
-
**
|
|
63
|
+
**Optimize stop conditions** (from execute terminal outcomes):
|
|
231
64
|
|
|
232
|
-
|
|
233
|
-
|
|
234
|
-
|
|
65
|
+
| Outcome | Action |
|
|
66
|
+
|---------|--------|
|
|
67
|
+
| `"target reached: {value}"` | Confirm task [x], write optimize completion (3.7), report, continue |
|
|
68
|
+
| `"max cycles reached, best: {value}"` | Confirm task [x], write optimize completion (3.7), report, continue |
|
|
69
|
+
| `"circuit breaker: 3 consecutive reverts"` | Task stays [ ], write failure to experiments (3.7), preserve optimize_state, halt loop |
|
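The non-optimize circuit breaker described in step 3.6 reduces to a counter with a threshold. The sketch below assumes `memory` and `config` are the already-loaded contents of auto-memory.yaml and config.yaml; `apply_circuit_breaker` is a hypothetical name, and the optimize-specific stop conditions in the table are not covered.

```python
from typing import Optional

DEFAULT_THRESHOLD = 3  # default stated in the text when the config key is absent


def apply_circuit_breaker(memory: dict, config: dict, task_id: str,
                          reverted: bool, reason: str = "") -> Optional[str]:
    """Update consecutive_reverts in place; return a halt message or None."""
    counts = memory.get("consecutive_reverts") or {}
    memory["consecutive_reverts"] = counts

    if not reverted:
        counts[task_id] = 0                      # success resets the counter
        return None

    counts[task_id] = counts.get(task_id, 0) + 1
    threshold = config.get("circuit_breaker_threshold", DEFAULT_THRESHOLD)
    if counts[task_id] >= threshold:
        return (f"Circuit breaker tripped: {task_id} failed "
                f"{counts[task_id]} times. Reason: {reason}")
    return None                                  # below threshold: continue to step 4
```

A caller would stop scheduling further cycles whenever the return value is not `None`, and otherwise move on to the report update.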
|
235
70
|
|
|
236
71
|
### 3.7. OPTIMIZE COMPLETION
|
|
237
72
|
|
|
238
|
-
When an optimize task reaches a terminal stop condition (target reached, max cycles, or circuit breaker):
|
|
239
|
-
|
|
240
73
|
**On target reached or max cycles (task [x]):**
|
|
74
|
+
1. Write each `failed_hypotheses` entry to `.deepflow/experiments/{spec}--optimize-{task_id}--{slug}--failed.md`
|
|
75
|
+
2. Write summary to `.deepflow/experiments/{spec}--optimize-{task_id}--summary--{status}.md` with metric/target/direction/baseline/best/cycles/history table
|
|
76
|
+
3. Clear `optimize_state` from auto-memory.yaml
|
|
241
77
|
|
|
242
|
-
|
|
243
|
-
1. Read optimize_state.failed_hypotheses from .deepflow/auto-memory.yaml
|
|
244
|
-
2. For each failed hypothesis, write to .deepflow/experiments/:
|
|
245
|
-
File: {spec}--optimize-{task_id}--{slug}--failed.md
|
|
246
|
-
Content:
|
|
247
|
-
# Failed Hypothesis: {description}
|
|
248
|
-
Task: {task_id} Spec: {spec_name} Cycle: {cycle_N}
|
|
249
|
-
Metric before: {value_before} Metric after: {value_after}
|
|
250
|
-
Reason: {why it was reverted}
|
|
251
|
-
3. Write a summary experiment file for the optimize run:
|
|
252
|
-
File: {spec}--optimize-{task_id}--summary--{status}.md
|
|
253
|
-
Content:
|
|
254
|
-
# Optimize Summary: {task_id}
|
|
255
|
-
Metric: {metric_command} Target: {target} Direction: {direction}
|
|
256
|
-
Baseline: {baseline} Best achieved: {current_best} Final: {final_value}
|
|
257
|
-
Cycles run: {cycles_run} Status: {reached|max_cycles}
|
|
258
|
-
History (all cycles):
|
|
259
|
-
| Cycle | Value | Delta | Kept | Commit |
|
|
260
|
-
...
|
|
261
|
-
4. Clear optimize_state from .deepflow/auto-memory.yaml (set to null or remove key)
|
|
262
|
-
```
|
|
263
|
-
|
|
264
|
-
**On circuit breaker halt:**
|
|
265
|
-
|
|
266
|
-
```
|
|
267
|
-
1. Write failed_hypotheses to .deepflow/experiments/ (same as above)
|
|
268
|
-
2. Write summary experiment file with status: circuit_breaker
|
|
269
|
-
3. Preserve optimize_state in auto-memory.yaml (do NOT clear — enables human diagnosis)
|
|
270
|
-
Add note: "halted: circuit_breaker — requires manual intervention"
|
|
271
|
-
```
|
|
78
|
+
**On circuit breaker halt:** Same experiment writes but with status `circuit_breaker`. Preserve `optimize_state` in auto-memory.yaml (add `halted: circuit_breaker` note).
|
|
272
79
|
|
|
273
80
|
### 4. UPDATE REPORT
|
|
274
81
|
|
|
275
|
-
Write
|
|
276
|
-
|
|
277
|
-
#### 4.1 File structure
|
|
278
|
-
|
|
279
|
-
The report uses four sections. On the **first cycle** (file does not exist), create the full skeleton. On **subsequent cycles**, update the existing file in-place:
|
|
280
|
-
|
|
281
|
-
```markdown
|
|
282
|
-
# Auto Mode Report — {spec_name}
|
|
283
|
-
|
|
284
|
-
_Last updated: {YYYY-MM-DDTHH:MM:SSZ}_
|
|
285
|
-
|
|
286
|
-
## Summary
|
|
287
|
-
|
|
288
|
-
| Metric | Value |
|
|
289
|
-
|--------|-------|
|
|
290
|
-
| Total cycles run | {N} |
|
|
291
|
-
| Tasks committed | {N} |
|
|
292
|
-
| Tasks reverted | {N} |
|
|
293
|
-
| Optimize cycles run | {N} | ← present only when optimize tasks exist in PLAN.md
|
|
294
|
-
| Optimize best value | {value} / {target} | ← present only when optimize tasks exist
|
|
295
|
-
|
|
296
|
-
## Cycle Log
|
|
297
|
-
|
|
298
|
-
| Cycle | Task | Status | Commit / Revert | Delta | Metric Delta | Reason | Timestamp |
|
|
299
|
-
|-------|------|--------|-----------------|-------|--------------|--------|-----------|
|
|
300
|
-
| 1 | T1 | passed | abc1234 | tests: 24→24, build: ok | — | — | 2025-01-15T10:00:00Z |
|
|
301
|
-
| 2 | T2 | failed | reverted | tests: 24→22 (−2) | — | tests failed: 2 of 24 | 2025-01-15T10:05:00Z |
|
|
302
|
-
| 3 | T3 | optimize | def789 | tests: 24→24, build: ok | 72.3→74.1 (+2.5%) | — | 2025-01-15T10:10:00Z |
|
|
303
|
-
|
|
304
|
-
## Probe Results
|
|
305
|
-
|
|
306
|
-
_(empty until a probe/spike task runs)_
|
|
307
|
-
|
|
308
|
-
| Probe | Metric | Winner | Loser | Notes |
|
|
309
|
-
|-------|--------|--------|-------|-------|
|
|
310
|
-
|
|
311
|
-
## Optimize Runs
|
|
312
|
-
|
|
313
|
-
_(empty until an optimize task completes)_
|
|
314
|
-
|
|
315
|
-
| Task | Metric | Baseline | Best | Target | Cycles | Status |
|
|
316
|
-
|------|--------|----------|------|--------|--------|--------|
|
|
82
|
+
Write to `.deepflow/auto-report.md` — append each cycle, never overwrite. First cycle creates skeleton, subsequent cycles update in-place.
|
|
317
83
|
|
|
318
|
-
|
|
84
|
+
**File sections:** Summary table, Cycle Log, Probe Results, Optimize Runs, Secondary Metric Warnings, Health Score, Reverted Tasks.
|
|
319
85
|
|
|
320
|
-
|
|
86
|
+
#### Per-cycle update rules
|
|
321
87
|
|
|
322
|
-
|
|
|
323
|
-
|
|
88
|
+
| Section | When | Action |
|
|
89
|
+
|---------|------|--------|
|
|
90
|
+
| Cycle Log | Every cycle | Append row: `cycle | task_id | status | commit/reverted | delta | metric_delta | reason | timestamp` |
|
|
91
|
+
| Summary | Every cycle | Recalculate from Cycle Log: total cycles, committed, reverted, optimize cycles/best (if applicable) |
|
|
92
|
+
| Last updated | Every cycle | Overwrite timestamp |
|
|
93
|
+
| Probe Results | Probe/spike task | Append row from `probe_learnings` in auto-memory.yaml |
|
|
94
|
+
| Optimize Runs | Optimize terminal event | Append row: task/metric/baseline/best/target/cycles/status |
|
|
95
|
+
| Secondary Metric Warnings | >5% regression | Append row (severity: WARNING, advisory only — no auto-revert) |
|
|
96
|
+
| Health Score | Every cycle | Replace with latest: tests passed, build status, ratchet green/red, optimize status |
|
|
97
|
+
| Reverted Tasks | On revert | Append row from `revert_history` |
|
|
324
98
|
|
|
325
|
-
|
|
99
|
+
**Status values:** `passed`, `failed` (reverted), `skipped` (already done), `optimize` (inner cycle).
|
|
326
100
|
|
|
327
|
-
|
|
328
|
-
|-------|--------|
|
|
329
|
-
| Tests passed | {N} / {total} |
|
|
330
|
-
| Build status | passing / failing |
|
|
331
|
-
| Ratchet | green / red |
|
|
332
|
-
| Optimize status | in_progress / reached / max_cycles / circuit_breaker / — | ← present only when optimize tasks exist
|
|
333
|
-
|
|
334
|
-
## Reverted Tasks
|
|
335
|
-
|
|
336
|
-
_(tasks that were reverted with their failure reasons)_
|
|
337
|
-
|
|
338
|
-
| Task | Cycle | Reason |
|
|
339
|
-
|------|-------|--------|
|
|
340
|
-
```
|
|
341
|
-
|
|
342
|
-
#### 4.2 Per-cycle update rules
|
|
343
|
-
|
|
344
|
-
**Cycle Log — append one row:**
|
|
345
|
-
|
|
346
|
-
```
|
|
347
|
-
| {cycle_number} | {task_id} | {status} | {commit_hash or "reverted"} | {delta} | {metric_delta} | {reason or "—"} | {YYYY-MM-DDTHH:MM:SSZ} |
|
|
348
|
-
```
|
|
349
|
-
|
|
350
|
-
- `cycle_number`: total number of cycles executed so far (count existing data rows in the Cycle Log + 1)
|
|
351
|
-
- `task_id`: task ID from PLAN.md, or `BOOTSTRAP` for bootstrap cycles
|
|
352
|
-
- `status`: `passed` (ratchet passed), `failed` (ratchet failed, reverted), `skipped` (task was already done), or `optimize` (optimize cycle — one inner cycle of an Optimize task)
|
|
353
|
-
- `commit_hash`: short hash from the commit, or `reverted` if ratchet failed
|
|
354
|
-
- `delta`: ratchet metric change from this cycle. Format: `tests: {before}→{after}, build: ok/fail`. Include coverage delta if available (e.g., `cov: 80%→82% (+2%)`). On revert, show the regression that triggered it (e.g., `tests: 24→22 (−2)`)
|
|
355
|
-
- `metric_delta`: for optimize cycles, show `{old}→{new} ({+/-pct}%)`. For non-optimize cycles, use `—`.
|
|
356
|
-
- `reason`: failure reason from ratchet output (e.g., `"tests failed: 2 of 24"`), or `—` if passed
|
|
357
|
-
|
|
358
|
-
**Summary table — recalculate from Cycle Log rows:**
|
|
359
|
-
|
|
360
|
-
- `Total cycles run`: count of all data rows in the Cycle Log
|
|
361
|
-
- `Tasks committed`: count of rows where Status = `passed`
|
|
362
|
-
- `Tasks reverted`: count of rows where Status = `failed`
|
|
363
|
-
- `Optimize cycles run`: count of rows where Status = `optimize` (omit row if no optimize tasks in PLAN.md)
|
|
364
|
-
- `Optimize best value`: `{current_best} / {target}` from `optimize_state` in auto-memory.yaml (omit row if no optimize tasks)
|
|
365
|
-
|
|
366
|
-
**Last updated timestamp:** always overwrite the `_Last updated:` line with the current timestamp.
|
|
367
|
-
|
|
368
|
-
**Optimize Runs table — update on optimize terminal events:**
|
|
369
|
-
|
|
370
|
-
When an optimize stop condition is reached (target reached, max cycles, circuit breaker), append or update the row for the optimize task:
|
|
371
|
-
|
|
372
|
-
```
|
|
373
|
-
| {task_id} | {metric_command} | {baseline} | {current_best} | {target} | {cycles_run} | {reached|max_cycles|circuit_breaker} |
|
|
374
|
-
```
|
|
375
|
-
|
|
376
|
-
If the task is still in progress, do not add a row yet (it will be added when the terminal event fires).
|
|
377
|
-
|
|
378
|
-
**Secondary Metric Warnings table — append on regression >5%:**
|
|
379
|
-
|
|
380
|
-
After each optimize cycle, `/df:execute` section 5.9.2 step j measures secondary metrics. If a regression exceeds the threshold, auto-cycle reads the warning from execute's output and appends to the table:
|
|
381
|
-
|
|
382
|
-
```
|
|
383
|
-
| {cycle_number} | {task_id} | {secondary_metric_command} | {before} | {after} | {+/-pct}% | WARNING |
|
|
384
|
-
```
|
|
101
|
+
**Delta format:** `tests: {before}→{after}, build: ok/fail`. Include coverage if available. On revert, show regression.
|
|
385
102
|
|
|
386
|
-
|
|
387
|
-
|
|
388
|
-
#### 4.3 Probe results (when applicable)
|
|
389
|
-
|
|
390
|
-
If the executed task was a probe/spike (task description contains "probe" or "spike"), append a row to the Probe Results table:
|
|
391
|
-
|
|
392
|
-
```
|
|
393
|
-
| {probe_name} | {metric description} | {winner approach} | {loser approach} | {key insight from probe_learnings in auto-memory.yaml} |
|
|
394
|
-
```
|
|
395
|
-
|
|
396
|
-
Read `probe_learnings` from `.deepflow/auto-memory.yaml` for the insight text.
|
|
397
|
-
|
|
398
|
-
If no probe has run yet, leave the `_(empty until a probe/spike task runs)_` placeholder in place.
|
|
399
|
-
|
|
400
|
-
#### 4.4 Health score (after every cycle)
|
|
401
|
-
|
|
402
|
-
Read the ratchet output from the last `/df:execute` result and populate:
|
|
403
|
-
|
|
404
|
-
- `Tests passed`: e.g., `22 / 24` (from ratchet summary line)
|
|
405
|
-
- `Build status`: `passing` if exit code 0, `failing` if build error
|
|
406
|
-
- `Ratchet`: `green` if ratchet passed, `red` if ratchet failed
|
|
407
|
-
- `Optimize status`: read from `optimize_state` in auto-memory.yaml:
|
|
408
|
-
- `in_progress` if `optimize_state.task_id` present and task still `[ ]`
|
|
409
|
-
- `reached` if stop condition was "target reached"
|
|
410
|
-
- `max_cycles` if stop condition was "max cycles"
|
|
411
|
-
- `circuit_breaker` if halted by circuit breaker
|
|
412
|
-
- `—` if no optimize task is active or was ever run
|
|
413
|
-
- Omit this row entirely if PLAN.md contains no `[OPTIMIZE]` tasks
|
|
414
|
-
|
|
415
|
-
Replace the entire Health Score section content with the latest values each cycle.
|
|
416
|
-
|
|
417
|
-
#### 4.5 Reverted tasks section
|
|
418
|
-
|
|
419
|
-
After every revert, append a row to the Reverted Tasks table:
|
|
420
|
-
|
|
421
|
-
```
|
|
422
|
-
| {task_id} | {cycle_number} | {failure reason} |
|
|
423
|
-
```
|
|
424
|
-
|
|
425
|
-
Read from `revert_history` in `.deepflow/auto-memory.yaml` to ensure no entry is missed. If no tasks have been reverted, leave the `_(tasks that were reverted...)_` placeholder in place.
|
|
103
|
+
**Optimize status in Health Score:** `in_progress` | `reached` | `max_cycles` | `circuit_breaker` | `—` (omit row if no optimize tasks in PLAN.md).
|
|
426
104
|
|
|
427
105
|
### 5. CHECK COMPLETION
|
|
428
106
|
|
|
429
|
-
|
|
430
|
-
```
|
|
431
|
-
done_count = number of [x] tasks
|
|
432
|
-
pending_count = number of [ ] tasks
|
|
433
|
-
```
|
|
434
|
-
|
|
435
|
-
**Note:** Per-spec verification and merge to main happens automatically in `/df:execute` (step 8) when all tasks for a spec complete. No separate verify call is needed here.
|
|
436
|
-
|
|
437
|
-
**If no `[ ]` tasks remain (pending_count == 0):**
|
|
438
|
-
```
|
|
439
|
-
→ Report: "All specs verified and merged. Workflow complete."
|
|
440
|
-
→ Exit
|
|
441
|
-
```
|
|
107
|
+
Count `[x]` and `[ ]` tasks in PLAN.md. Per-spec verify+merge happens in `/df:execute` step 8 automatically.
|
|
442
108
|
|
|
443
|
-
**
|
|
444
|
-
|
|
445
|
-
→ Report: "Cycle complete. {pending_count} tasks remaining."
|
|
446
|
-
→ Exit — next /loop invocation will pick up
|
|
447
|
-
```
|
|
109
|
+
- **No `[ ]` remaining:** "All specs verified and merged. Workflow complete." → exit
|
|
110
|
+
- **Tasks remain:** "Cycle complete. {N} tasks remaining." → exit (next /loop invocation picks up)
|
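The completion check in step 5 amounts to counting checkboxes. The sketch below assumes tasks appear in PLAN.md as Markdown checkbox list items such as `- [ ] T3: ...`; that line format and the name `completion_message` are assumptions, since the diff does not show a PLAN.md sample.

```python
import re

# Assumption: one task per Markdown checkbox list item, e.g. "- [x] T1: ...".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]", re.MULTILINE)


def completion_message(plan_text: str) -> str:
    """Return the end-of-cycle report line for the given PLAN.md text."""
    marks = CHECKBOX.findall(plan_text)          # " " for pending, "x"/"X" for done
    pending = sum(1 for m in marks if m == " ")
    if pending == 0:
        return "All specs verified and merged. Workflow complete."
    return f"Cycle complete. {pending} tasks remaining."
```

An empty plan yields the "workflow complete" message, which matches the idempotent behaviour listed in the Rules table.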
|
448
111
|
|
|
449
112
|
## Rules
|
|
450
113
|
|
|
451
114
|
| Rule | Detail |
|
|
452
115
|
|------|--------|
|
|
453
116
|
| One task per cycle | Fresh context each invocation — no multi-task batching |
|
|
454
|
-
| Bootstrap
|
|
455
|
-
| Idempotent | Safe to call with no work
|
|
456
|
-
| Never modifies PLAN.md
|
|
457
|
-
|
|
|
458
|
-
|
|
|
459
|
-
|
|
|
460
|
-
|
|
|
461
|
-
|
|
|
462
|
-
|
|
|
463
|
-
| Optimize
|
|
464
|
-
| Secondary metric regression is advisory only | >5% regression generates WARNING in `auto-report.md` Secondary Metric Warnings table — never triggers auto-revert |
|
|
465
|
-
| Optimize completion writes experiments | Failed hypotheses and run summary are written to `.deepflow/experiments/` when a terminal stop condition fires |
|
|
117
|
+
| Bootstrap = sole task | No regular task runs in a bootstrap cycle |
|
|
118
|
+
| Idempotent | Safe to call with no work — reports "0 tasks remaining" |
|
|
119
|
+
| Never modifies PLAN.md | `/df:execute` handles PLAN.md updates |
|
|
120
|
+
| Auto-memory after every cycle | `task_results`, `revert_history`, `consecutive_reverts` always written |
|
|
121
|
+
| Circuit breaker halts loop | Default 3 consecutive reverts (configurable: `circuit_breaker_threshold` in config.yaml) |
|
|
122
|
+
| One optimize at a time | Defers other optimize tasks until active one terminates |
|
|
123
|
+
| Optimize resumes across contexts | `optimize_state.task_id` overrides normal scan |
|
|
124
|
+
| Optimize CB preserves state | On halt: task stays [ ], optimize_state kept for diagnosis |
|
|
125
|
+
| Secondary metric regression advisory | >5% = WARNING in report, never auto-revert |
|
|
126
|
+
| Optimize completion writes experiments | Failed hypotheses + summary to `.deepflow/experiments/` |
|
|
466
127
|
|
|
467
128
|
## Example
|
|
468
129
|
|
|
469
|
-
###
|
|
470
|
-
|
|
471
|
-
```
|
|
472
|
-
/df:auto-cycle
|
|
473
|
-
|
|
474
|
-
Loading PLAN.md... 3 tasks total, 0 done, 3 pending
|
|
475
|
-
Next ready task: T1 (no blockers)
|
|
476
|
-
|
|
477
|
-
Running: /df:execute T1
|
|
478
|
-
Ratchet snapshot: 0 pre-existing test files
|
|
479
|
-
Bootstrap needed — writing tests for edit_scope first
|
|
480
|
-
✓ Bootstrap: ratchet passed (boo1234)
|
|
481
|
-
bootstrap: completed
|
|
482
|
-
|
|
483
|
-
Updated .deepflow/auto-report.md:
|
|
484
|
-
Summary: cycles=1, committed=1, reverted=0
|
|
485
|
-
Cycle Log row: | 1 | BOOTSTRAP | passed | boo1234 | — | 2025-01-15T10:00:00Z |
|
|
486
|
-
Health: tests 10/10, build passing, ratchet green
|
|
487
|
-
|
|
488
|
-
Cycle complete. 3 tasks remaining.
|
|
489
|
-
```
|
|
490
|
-
|
|
491
|
-
### Normal Cycle (task executed)
|
|
492
|
-
|
|
130
|
+
### Normal Cycle
|
|
493
131
|
```
|
|
494
132
|
/df:auto-cycle
|
|
495
|
-
|
|
496
|
-
|
|
497
|
-
|
|
498
|
-
|
|
499
|
-
Running: /df:execute T2
|
|
500
|
-
✓ T2: ratchet passed (abc1234)
|
|
501
|
-
|
|
502
|
-
Updated .deepflow/auto-report.md:
|
|
503
|
-
Summary: cycles=2, committed=2, reverted=0
|
|
504
|
-
Cycle Log row: | 2 | T2 | passed | abc1234 | — | 2025-01-15T10:05:00Z |
|
|
505
|
-
Health: tests 22/22, build passing, ratchet green
|
|
506
|
-
|
|
133
|
+
Loading PLAN.md... 3 tasks, 1 done, 2 pending
|
|
134
|
+
Next: T2 (T1 satisfied)
|
|
135
|
+
Running: /df:execute T2 → ✓ ratchet passed (abc1234)
|
|
136
|
+
Updated auto-report.md: cycles=2, committed=2
|
|
507
137
|
Cycle complete. 1 tasks remaining.
|
|
508
138
|
```
|
|
509
139
|
|
|
510
|
-
### All Tasks Done (workflow complete)
|
|
511
|
-
|
|
512
|
-
```
|
|
513
|
-
/df:auto-cycle
|
|
514
|
-
|
|
515
|
-
Loading PLAN.md... 0 tasks total, 0 done, 0 pending
|
|
516
|
-
|
|
517
|
-
All specs verified and merged. Workflow complete.
|
|
518
|
-
```
|
|
519
|
-
|
|
520
|
-
### No Work Remaining (idempotent)
|
|
521
|
-
|
|
522
|
-
```
|
|
523
|
-
/df:auto-cycle
|
|
524
|
-
|
|
525
|
-
Loading PLAN.md... 0 tasks total, 0 done, 0 pending
|
|
526
|
-
|
|
527
|
-
All specs verified and merged. Workflow complete.
|
|
528
|
-
```
|
|
529
|
-
|
|
530
140
|
### Circuit Breaker Tripped
|
|
531
|
-
|
|
532
141
|
```
|
|
533
142
|
/df:auto-cycle
|
|
534
|
-
|
|
535
|
-
|
|
536
|
-
|
|
537
|
-
|
|
538
|
-
Running: /df:execute T3
|
|
539
|
-
✗ T3: ratchet failed — "2 tests regressed"
|
|
540
|
-
Reverted changes.
|
|
541
|
-
|
|
143
|
+
Loading PLAN.md... 3 tasks, 1 done, 2 pending
|
|
144
|
+
Next: T3
|
|
145
|
+
Running: /df:execute T3 → ✗ ratchet failed — "2 tests regressed"
|
|
542
146
|
Circuit breaker: consecutive_reverts[T3] = 3 (threshold: 3)
|
|
543
|
-
|
|
544
|
-
|
|
545
|
-
Loop halted. Resolve T3 manually, then run /df:auto-cycle to resume.
|
|
546
|
-
```
|
|
547
|
-
|
|
548
|
-
### Optimize Cycle (in progress — task resumes from optimize_state)
|
|
549
|
-
|
|
550
|
-
```
|
|
551
|
-
/df:auto-cycle
|
|
552
|
-
|
|
553
|
-
Loading PLAN.md... 4 tasks total, 2 done, 2 pending
|
|
554
|
-
Loading auto-memory.yaml... optimize_state.task_id = T3
|
|
555
|
-
|
|
556
|
-
Optimize-active override: T3 still [ ] — resuming optimize task
|
|
557
|
-
optimize_state: cycles_run=4, current_best=74.1, target=85.0, direction=higher
|
|
558
|
-
|
|
559
|
-
Running: /df:execute T3
|
|
560
|
-
⟳ T3 cycle 5: 74.1 → 75.8 (+2.3%) — kept [best: 75.8, target: 85.0]
|
|
561
|
-
|
|
562
|
-
Updated .deepflow/auto-memory.yaml:
|
|
563
|
-
optimize_state.cycles_run = 5
|
|
564
|
-
optimize_state.current_best = 75.8
|
|
565
|
-
|
|
566
|
-
Updated .deepflow/auto-report.md:
|
|
567
|
-
Summary: cycles=5, committed=2, reverted=0, optimize_cycles=5, optimize_best=75.8/85.0
|
|
568
|
-
Cycle Log row: | 5 | T3 | optimize | abc1234 | tests: 24→24, build: ok | 74.1→75.8 (+2.3%) | — | 2025-01-15T10:15:00Z |
|
|
569
|
-
Health: tests 24/24, build passing, ratchet green, optimize in_progress
|
|
570
|
-
|
|
571
|
-
Cycle complete. 2 tasks remaining.
|
|
572
|
-
```
|
|
573
|
-
|
|
574
|
-
### Optimize Complete (target reached)
|
|
575
|
-
|
|
576
|
-
```
|
|
577
|
-
/df:auto-cycle
|
|
578
|
-
|
|
579
|
-
Loading PLAN.md... 4 tasks total, 2 done, 2 pending
|
|
580
|
-
Loading auto-memory.yaml... optimize_state.task_id = T3
|
|
581
|
-
|
|
582
|
-
Optimize-active override: T3 still [ ] — resuming optimize task
|
|
583
|
-
optimize_state: cycles_run=12, current_best=84.9, target=85.0, direction=higher
|
|
584
|
-
|
|
585
|
-
Running: /df:execute T3
|
|
586
|
-
⟳ T3 cycle 13: 84.9 → 85.3 (+0.5%) — kept [best: 85.3, target: 85.0]
|
|
587
|
-
Target reached: 85.3 >= 85.0 — marking T3 [x]
|
|
588
|
-
|
|
589
|
-
Optimize completion:
|
|
590
|
-
Writing 3 failed hypotheses to .deepflow/experiments/
|
|
591
|
-
Writing summary: specs--optimize-T3--summary--reached.md
|
|
592
|
-
Clearing optimize_state from auto-memory.yaml
|
|
593
|
-
|
|
594
|
-
Updated .deepflow/auto-report.md:
|
|
595
|
-
Summary: cycles=13, committed=3, reverted=0, optimize_cycles=13, optimize_best=85.3/85.0
|
|
596
|
-
Cycle Log row: | 13 | T3 | optimize | def456 | tests: 24→24, build: ok | 84.9→85.3 (+0.5%) | — | 2025-01-15T10:45:00Z |
|
|
597
|
-
Optimize Runs row: | T3 | coverage_cmd | 72.3 | 85.3 | 85.0 | 13 | reached |
|
|
598
|
-
Health: tests 24/24, build passing, ratchet green, optimize reached
|
|
599
|
-
|
|
600
|
-
Cycle complete. 1 tasks remaining.
|
|
601
|
-
```
|
|
602
|
-
|
|
603
|
-
### Optimize Secondary Metric Warning
|
|
604
|
-
|
|
605
|
-
```
|
|
606
|
-
/df:auto-cycle
|
|
607
|
-
|
|
608
|
-
Running: /df:execute T3
|
|
609
|
-
⟳ T3 cycle 8: 80.1 → 81.4 (+1.6%) — kept [best: 81.4, target: 85.0]
|
|
610
|
-
WARNING: secondary metric 'lint_errors' regressed: 2 → 5 (+150%) — exceeds 5% threshold
|
|
611
|
-
|
|
612
|
-
Updated .deepflow/auto-report.md:
|
|
613
|
-
Secondary Metric Warnings row: | 8 | T3 | lint_errors | 2 | 5 | +150% | WARNING |
|
|
614
|
-
(No auto-revert — human decision required)
|
|
615
|
-
|
|
616
|
-
Cycle complete. 2 tasks remaining.
|
|
617
|
-
```
|
|
618
|
-
|
|
619
|
-
### All Tasks Blocked
|
|
620
|
-
|
|
621
|
-
```
|
|
622
|
-
/df:auto-cycle
|
|
623
|
-
|
|
624
|
-
Loading PLAN.md... 3 tasks total, 1 done, 2 pending
|
|
625
|
-
|
|
626
|
-
Error: All remaining tasks are blocked.
|
|
627
|
-
[ ] T3 — blocked by: T2 (incomplete)
|
|
628
|
-
[ ] T4 — blocked by: T2 (incomplete)
|
|
629
|
-
|
|
630
|
-
Run /df:execute to investigate or resolve blockers manually.
|
|
147
|
+
Loop halted. Resolve T3 manually, then resume.
|
|
631
148
|
```
|