deepflow 0.1.71 → 0.1.73

@@ -0,0 +1,384 @@
1
+ # /df:auto-cycle — Single Cycle of Auto Mode
2
+
3
+ ## Purpose
4
+ Execute one task from PLAN.md. Designed to be called by `/loop 1m /df:auto-cycle` — each invocation gets fresh context.
5
+
6
+ **NEVER:** use EnterPlanMode, use ExitPlanMode
7
+
8
+ ---
9
+
10
+ ## Usage
11
+ ```
12
+ /df:auto-cycle # Pick next undone task and execute it (or verify if all done)
13
+ ```
14
+
15
+ ## Behavior
16
+
17
+ ### 1. LOAD STATE
18
+
19
+ ```
20
+ Load: PLAN.md (required)
21
+ → If missing: Error "No PLAN.md. Run /df:plan first."
22
+ Load: .deepflow/auto-memory.yaml (optional — cross-cycle state, ignore if missing)
23
+ ```
24
+
25
+ **auto-memory.yaml full schema:**
26
+
27
+ ```yaml
28
+ task_results:
29
+   T1: { status: success, commit: abc1234, cycle: 3 }
30
+   T2: { status: reverted, reason: "tests failed: 2 of 24", cycle: 4 }
31
+ revert_history:
32
+   - { task: T2, cycle: 4, reason: "tests failed" }
33
+   - { task: T2, cycle: 5, reason: "build error" }
34
+ consecutive_reverts: # written by circuit breaker (step 3.6)
35
+   T1: 0
36
+   T2: 2
37
+ probe_learnings:
38
+   - { spike: T1, probe: "streaming", insight: "discovered hidden dependency on fs.watch" }
39
+ ```
40
+
41
+ Each section is optional. Missing keys are treated as empty. The file is created on first write if absent.
42
+
43
+ ### 2. PICK NEXT TASK
44
+
45
+ Scan PLAN.md for the first `[ ]` task where all "Blocked by:" dependencies are `[x]`:
46
+
47
+ ```
48
+ For each [ ] task in PLAN.md (top to bottom):
49
+ → Parse "Blocked by:" line (if present)
50
+ → Check each listed dependency in PLAN.md
51
+ → If ALL listed blockers are [x] (or no blockers) → this task is READY
52
+ → Select first READY task
53
+ ```
54
+
55
+ **No tasks remaining (`[ ]` not found):** → skip to step 5 (completion check).
56
+
57
+ **All remaining tasks blocked:** → Error with blocker info:
58
+ ```
59
+ Error: All remaining tasks are blocked.
60
+ [ ] T3 — blocked by: T2 (incomplete)
61
+ [ ] T4 — blocked by: T2 (incomplete)
62
+
63
+ Run /df:execute to investigate or resolve blockers manually.
64
+ ```
65
+
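The selection rule in step 2 can be sketched in Python. This is only an illustration: the source does not show the PLAN.md line format, so the task-line and "Blocked by:" shapes matched below, and the helper names `parse_plan` and `pick_next`, are assumptions.

```python
import re
from typing import Optional

# Assumed PLAN.md shapes (hypothetical, not confirmed by the source):
#   - [ ] **T3**: description
#     - Blocked by: T1, T2
TASK_RE = re.compile(r"\[( |x)\]\s+\**(T\d+)\b")
BLOCKED_RE = re.compile(r"Blocked by:\s*(.*)", re.IGNORECASE)


def parse_plan(text: str) -> list:
    """Return tasks in file order as dicts with id, done, and blockers."""
    tasks = []
    for line in text.splitlines():
        task = TASK_RE.search(line)
        if task:
            tasks.append(
                {"id": task.group(2), "done": task.group(1) == "x", "blockers": []}
            )
            continue
        blocked = BLOCKED_RE.search(line)
        if blocked and tasks:
            # Attach the blocker list to the most recent task line.
            tasks[-1]["blockers"] = re.findall(r"T\d+", blocked.group(1))
    return tasks


def pick_next(tasks: list) -> Optional[str]:
    """First pending task whose blockers are all done, or None if none is ready."""
    done = {t["id"] for t in tasks if t["done"]}
    for t in tasks:
        if not t["done"] and all(b in done for b in t["blockers"]):
            return t["id"]
    return None
```

A `None` result covers two cases that the caller has to tell apart: if no task is pending, skip to the completion check; if tasks are pending, report them as blocked.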
66
+ ### 3. EXECUTE
67
+
68
+ Run the selected task using the Skill tool:
69
+
70
+ ```
71
+ Skill: "df:execute"
72
+ Args: "{task_id}" (e.g., "T3")
73
+ ```
74
+
75
+ This handles worktree creation, agent spawning, ratchet health checks, and commit.
76
+
77
+ **Bootstrap handling:** `/df:execute` may report `"bootstrap: completed"` instead of a regular task result. This means the ratchet snapshot was empty (zero test files) and the cycle was used to write baseline tests. When this happens:
78
+
79
+ - Do NOT treat it as a task failure or skip
80
+ - Record the bootstrap in the report (step 4) using task ID `BOOTSTRAP` and status `passed`
81
+ - Exit normally — the NEXT cycle will pick up the first regular task (now protected by the bootstrapped tests)
82
+ - Do NOT attempt to execute a regular task in the same cycle as a bootstrap
83
+
84
+ ### 3.5. WRITE STATE
85
+
86
+ After `/df:execute` returns, record the task result in `.deepflow/auto-memory.yaml`:
87
+
88
+ **On success (ratchet passed):**
89
+
90
+ ```yaml
91
+ # Set task_results[task_id] = success entry
92
+ task_results:
93
+   {task_id}: { status: success, commit: {short_hash}, cycle: {cycle_number} }
94
+ ```
95
+
96
+ **On revert (ratchet failed):**
97
+
98
+ ```yaml
99
+ # Set task_results[task_id] = reverted entry
100
+ task_results:
101
+   {task_id}: { status: reverted, reason: "{ratchet failure summary}", cycle: {cycle_number} }
102
+
103
+ # Append to revert_history
104
+ revert_history:
105
+ - { task: {task_id}, cycle: {cycle_number}, reason: "{ratchet failure summary}" }
106
+ ```
107
+
108
+ Read the current file first (create if missing), merge the new values, and write back. Preserve all existing keys.
109
+
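The read, merge, write-back rule can be sketched on the already-parsed YAML data. This is a minimal sketch, assuming the file has been loaded into a plain dict by whatever YAML reader is in use; `record_result` is a hypothetical helper name, not part of deepflow.

```python
import copy
from typing import Optional


def record_result(
    memory: Optional[dict],
    task_id: str,
    cycle: int,
    commit: Optional[str] = None,
    reason: Optional[str] = None,
) -> dict:
    """Merge one /df:execute outcome into auto-memory data.

    Pass `commit` for a success or `reason` for a revert. Keys that are not
    touched here (for example probe_learnings) are preserved unchanged.
    """
    updated = copy.deepcopy(memory) if memory else {}
    # Missing or empty sections are treated as empty, as the schema notes say.
    results = updated.get("task_results") or {}
    updated["task_results"] = results
    if commit is not None:
        results[task_id] = {"status": "success", "commit": commit, "cycle": cycle}
    else:
        results[task_id] = {"status": "reverted", "reason": reason, "cycle": cycle}
        history = updated.get("revert_history") or []
        history.append({"task": task_id, "cycle": cycle, "reason": reason})
        updated["revert_history"] = history
    return updated
```

Returning a copy rather than mutating the input keeps the previously loaded state available for comparison until the new state has been written back.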
110
+ ### 3.6. CIRCUIT BREAKER
111
+
112
+ After `/df:execute` returns, check whether the task was reverted (ratchet failed):
113
+
114
+ **On revert (ratchet failed):**
115
+
116
+ ```
117
+ 1. Read .deepflow/auto-memory.yaml (create if missing)
118
+ 2. Increment consecutive_reverts[task_id] by 1
119
+ 3. Write updated value back to .deepflow/auto-memory.yaml
120
+ 4. Read circuit_breaker_threshold from .deepflow/config.yaml (default: 3 if key absent)
121
+ 5. If consecutive_reverts[task_id] >= threshold:
122
+ → Do NOT start /loop again
123
+ → Report: "Circuit breaker tripped: T{n} failed {N} consecutive times. Reason: {last ratchet failure}"
124
+ → Halt (exit without scheduling next cycle)
125
+ Else:
126
+ → Continue to step 4 (UPDATE REPORT) as normal
127
+ ```
128
+
129
+ **On success (ratchet passed):**
130
+
131
+ ```
132
+ 1. Reset consecutive_reverts[task_id] to 0 in .deepflow/auto-memory.yaml
133
+ ```
134
+
135
+ **auto-memory.yaml schema for the circuit breaker:**
136
+
137
+ ```yaml
138
+ consecutive_reverts:
139
+   T1: 0
140
+   T3: 2
141
+ ```
142
+
143
+ **config.yaml key:**
144
+
145
+ ```yaml
146
+ circuit_breaker_threshold: 3 # halt after this many consecutive reverts on the same task
147
+ ```
148
+
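The counter logic of step 3.6 can be sketched as follows. It assumes `auto-memory.yaml` and `config.yaml` have already been parsed into dicts; `apply_circuit_breaker` is a hypothetical helper name used only for illustration.

```python
DEFAULT_THRESHOLD = 3  # default stated in step 3.6 when the config key is absent


def apply_circuit_breaker(memory: dict, config: dict, task_id: str, reverted: bool) -> bool:
    """Update consecutive_reverts in place; return True when the loop should halt."""
    counts = memory.get("consecutive_reverts") or {}
    memory["consecutive_reverts"] = counts
    if not reverted:
        # A success resets the streak for this task.
        counts[task_id] = 0
        return False
    counts[task_id] = counts.get(task_id, 0) + 1
    threshold = config.get("circuit_breaker_threshold", DEFAULT_THRESHOLD)
    return counts[task_id] >= threshold
```

With `consecutive_reverts: { T3: 2 }` and no config override, one more revert of T3 brings the count to 3 and the function returns `True`, which corresponds to the "Circuit breaker tripped" example later in this file.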
149
+ ### 4. UPDATE REPORT
150
+
151
+ Write a comprehensive report to `.deepflow/auto-report.md` after every cycle. The file persists across cycles and is never recreated once it exists: each cycle appends its row to the per-cycle log table and updates the running summary counts.
152
+
153
+ #### 4.1 File structure
154
+
155
+ The report uses five sections. On the **first cycle** (file does not exist), create the full skeleton. On **subsequent cycles**, update the existing file in-place:
156
+
157
+ ```markdown
158
+ # Auto Mode Report — {spec_name}
159
+
160
+ _Last updated: {YYYY-MM-DDTHH:MM:SSZ}_
161
+
162
+ ## Summary
163
+
164
+ | Metric | Value |
165
+ |--------|-------|
166
+ | Total cycles run | {N} |
167
+ | Tasks committed | {N} |
168
+ | Tasks reverted | {N} |
169
+
170
+ ## Cycle Log
171
+
172
+ | Cycle | Task | Status | Commit / Revert | Reason | Timestamp |
173
+ |-------|------|--------|-----------------|--------|-----------|
174
+ | 1 | T1 | passed | abc1234 | — | 2025-01-15T10:00:00Z |
175
+ | 2 | T2 | failed | reverted | tests failed: 2 of 24 | 2025-01-15T10:05:00Z |
176
+
177
+ ## Probe Results
178
+
179
+ _(empty until a probe/spike task runs)_
180
+
181
+ | Probe | Metric | Winner | Loser | Notes |
182
+ |-------|--------|--------|-------|-------|
183
+
184
+ ## Health Score
185
+
186
+ | Check | Status |
187
+ |-------|--------|
188
+ | Tests passed | {N} / {total} |
189
+ | Build status | passing / failing |
190
+ | Ratchet | green / red |
191
+
192
+ ## Reverted Tasks
193
+
194
+ _(tasks that were reverted with their failure reasons)_
195
+
196
+ | Task | Cycle | Reason |
197
+ |------|-------|--------|
198
+ ```
199
+
200
+ #### 4.2 Per-cycle update rules
201
+
202
+ **Cycle Log — append one row:**
203
+
204
+ ```
205
+ | {cycle_number} | {task_id} | {status} | {commit_hash or "reverted"} | {reason or "—"} | {YYYY-MM-DDTHH:MM:SSZ} |
206
+ ```
207
+
208
+ - `cycle_number`: total number of cycles executed so far (count existing data rows in the Cycle Log + 1)
209
+ - `task_id`: task ID from PLAN.md, or `BOOTSTRAP` for bootstrap cycles
210
+ - `status`: `passed` (ratchet passed), `failed` (ratchet failed, reverted), or `skipped` (task was already done)
211
+ - `commit_hash`: short hash from the commit, or `reverted` if ratchet failed
212
+ - `reason`: failure reason from ratchet output (e.g., `"tests failed: 2 of 24"`), or `—` if passed
213
+
214
+ **Summary table — recalculate from Cycle Log rows:**
215
+
216
+ - `Total cycles run`: count of all data rows in the Cycle Log
217
+ - `Tasks committed`: count of rows where Status = `passed`
218
+ - `Tasks reverted`: count of rows where Status = `failed`
219
+
220
+ **Last updated timestamp:** always overwrite the `_Last updated:` line with the current timestamp.
221
+
222
+ #### 4.3 Probe results (when applicable)
223
+
224
+ If the executed task was a probe/spike (task description contains "probe" or "spike"), append a row to the Probe Results table:
225
+
226
+ ```
227
+ | {probe_name} | {metric description} | {winner approach} | {loser approach} | {key insight from probe_learnings in auto-memory.yaml} |
228
+ ```
229
+
230
+ Read `probe_learnings` from `.deepflow/auto-memory.yaml` for the insight text.
231
+
232
+ If no probe has run yet, leave the `_(empty until a probe/spike task runs)_` placeholder in place.
233
+
234
+ #### 4.4 Health score (after every cycle)
235
+
236
+ Read the ratchet output from the last `/df:execute` result and populate:
237
+
238
+ - `Tests passed`: e.g., `22 / 24` (from ratchet summary line)
239
+ - `Build status`: `passing` if exit code 0, `failing` if build error
240
+ - `Ratchet`: `green` if ratchet passed, `red` if ratchet failed
241
+
242
+ Replace the entire Health Score section content with the latest values each cycle.
243
+
244
+ #### 4.5 Reverted tasks section
245
+
246
+ After every revert, append a row to the Reverted Tasks table:
247
+
248
+ ```
249
+ | {task_id} | {cycle_number} | {failure reason} |
250
+ ```
251
+
252
+ Read from `revert_history` in `.deepflow/auto-memory.yaml` to ensure no entry is missed. If no tasks have been reverted, leave the `_(tasks that were reverted...)_` placeholder in place.
253
+
254
+ ### 5. CHECK COMPLETION
255
+
256
+ **Count tasks in PLAN.md:**
257
+ ```
258
+ done_count = number of [x] tasks
259
+ pending_count = number of [ ] tasks
260
+ ```
261
+
262
+ **If ALL tasks are `[x]` (pending_count == 0):**
263
+ ```
264
+ → Run /df:verify via Skill tool (skill: "df:verify", no args)
265
+ → Report: "All tasks complete. Verification triggered."
266
+ ```
267
+
268
+ **If tasks remain (pending_count > 0):**
269
+ ```
270
+ → Report: "Cycle complete. {pending_count} tasks remaining."
271
+ → Exit — next /loop invocation will pick up
272
+ ```
273
+
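The completion check reduces to counting checkboxes. A minimal sketch, assuming tasks are Markdown list items with `[x]` or `[ ]` markers (the exact PLAN.md layout is not shown in the source, and `completion_status` is a hypothetical name):

```python
import re

# Assumed shape: a list item that starts with a checkbox, e.g. "- [x] T1 ...".
CHECKBOX_RE = re.compile(r"^\s*[-*]\s+\[( |x)\]", re.MULTILINE)


def completion_status(plan_text: str) -> tuple:
    """Return (done_count, pending_count, report message) for PLAN.md text."""
    marks = CHECKBOX_RE.findall(plan_text)
    done_count = marks.count("x")
    pending_count = marks.count(" ")
    if pending_count == 0:
        message = "All tasks complete. Verification triggered."
    else:
        message = f"Cycle complete. {pending_count} tasks remaining."
    return done_count, pending_count, message
```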
274
+ ## Rules
275
+
276
+ | Rule | Detail |
277
+ |------|--------|
278
+ | One task per cycle | Fresh context each invocation — no multi-task batching |
279
+ | Bootstrap counts as the cycle's sole task | When `/df:execute` returns `bootstrap: completed`, no regular task runs that cycle |
280
+ | Idempotent | Safe to call with no work remaining — just reports "0 tasks remaining" |
281
+ | Never modifies PLAN.md directly | `/df:execute` handles PLAN.md updates and commits |
282
+ | Zero coordination overhead | Read plan → pick task → execute → update report → exit |
283
+ | Auto-memory updated after every cycle | `task_results`, `revert_history`, and `consecutive_reverts` in `.deepflow/auto-memory.yaml` are written after each EXECUTE result |
284
+ | Cross-cycle state read at cycle start | LOAD STATE reads the full `auto-memory.yaml` schema; prior task outcomes and probe learnings are available to the cycle |
285
+ | Circuit breaker halts the loop | After N consecutive reverts on the same task (default 3, configurable via `circuit_breaker_threshold` in `.deepflow/config.yaml`), the loop is stopped and the reason is reported |
286
+
287
+ ## Example
288
+
289
+ ### Bootstrap Cycle (no pre-existing tests)
290
+
291
+ ```
292
+ /df:auto-cycle
293
+
294
+ Loading PLAN.md... 3 tasks total, 0 done, 3 pending
295
+ Next ready task: T1 (no blockers)
296
+
297
+ Running: /df:execute T1
298
+ Ratchet snapshot: 0 pre-existing test files
299
+ Bootstrap needed — writing tests for edit_scope first
300
+ ✓ Bootstrap: ratchet passed (boo1234)
301
+ bootstrap: completed
302
+
303
+ Updated .deepflow/auto-report.md:
304
+ Summary: cycles=1, committed=1, reverted=0
305
+ Cycle Log row: | 1 | BOOTSTRAP | passed | boo1234 | — | 2025-01-15T10:00:00Z |
306
+ Health: tests 10/10, build passing, ratchet green
307
+
308
+ Cycle complete. 3 tasks remaining.
309
+ ```
310
+
311
+ ### Normal Cycle (task executed)
312
+
313
+ ```
314
+ /df:auto-cycle
315
+
316
+ Loading PLAN.md... 3 tasks total, 1 done, 2 pending
317
+ Next ready task: T2 (T1 dependency satisfied)
318
+
319
+ Running: /df:execute T2
320
+ ✓ T2: ratchet passed (abc1234)
321
+
322
+ Updated .deepflow/auto-report.md:
323
+ Summary: cycles=2, committed=2, reverted=0
324
+ Cycle Log row: | 2 | T2 | passed | abc1234 | — | 2025-01-15T10:05:00Z |
325
+ Health: tests 22/22, build passing, ratchet green
326
+
327
+ Cycle complete. 1 tasks remaining.
328
+ ```
329
+
330
+ ### All Tasks Done (verify triggered)
331
+
332
+ ```
333
+ /df:auto-cycle
334
+
335
+ Loading PLAN.md... 3 tasks total, 3 done, 0 pending
336
+
337
+ All tasks complete. Verification triggered.
338
+ Running: /df:verify
339
+ ✓ L0 | ✓ L1 | ⚠ L2 (no coverage tool) | ✓ L4
340
+ ✓ Merged df/upload to main
341
+ ```
342
+
343
+ ### No Work Remaining (idempotent)
344
+
345
+ ```
346
+ /df:auto-cycle
347
+
348
+ Loading PLAN.md... 3 tasks total, 3 done, 0 pending
349
+ Verification already complete (no doing-* specs found).
350
+
351
+ Nothing to do. Cycle complete. 0 tasks remaining.
352
+ ```
353
+
354
+ ### Circuit Breaker Tripped
355
+
356
+ ```
357
+ /df:auto-cycle
358
+
359
+ Loading PLAN.md... 3 tasks total, 1 done, 2 pending
360
+ Next ready task: T3 (no blockers)
361
+
362
+ Running: /df:execute T3
363
+ ✗ T3: ratchet failed — "2 tests regressed"
364
+ Reverted changes.
365
+
366
+ Circuit breaker: consecutive_reverts[T3] = 3 (threshold: 3)
367
+ Circuit breaker tripped: T3 failed 3 consecutive times. Reason: 2 tests regressed
368
+
369
+ Loop halted. Resolve T3 manually, then run /df:auto-cycle to resume.
370
+ ```
371
+
372
+ ### All Tasks Blocked
373
+
374
+ ```
375
+ /df:auto-cycle
376
+
377
+ Loading PLAN.md... 3 tasks total, 1 done, 2 pending
378
+
379
+ Error: All remaining tasks are blocked.
380
+ [ ] T3 — blocked by: T2 (incomplete)
381
+ [ ] T4 — blocked by: T2 (incomplete)
382
+
383
+ Run /df:execute to investigate or resolve blockers manually.
384
+ ```
@@ -1,16 +1,79 @@
1
- # /df:auto — Autonomous Mode
1
+ # /df:auto — Autonomous Mode Setup
2
2
 
3
- Run the full autonomous cycle via agent teams. Auto-promotes unprefixed specs to `doing-*`, then processes all `doing-*` specs through every phase: discover, pre-check, hypothesize, spike, implement, select, verify, PR, report.
3
+ Set up and launch fully autonomous execution. Runs `/df:plan` if no PLAN.md exists, takes a ratchet snapshot, then starts `/loop 1m /df:auto-cycle`.
4
+
5
+ **NEVER:** use EnterPlanMode, use ExitPlanMode
4
6
 
5
7
  ## Usage
6
8
  ```
7
- /df:auto # process all specs
9
+ /df:auto # Set up and start autonomous loop
8
10
  ```
9
11
 
10
12
  ## Behavior
11
13
 
12
- Load and execute the lead agent at `.claude/agents/deepflow-auto.md`.
14
+ ### 1. RUN PLAN IF NEEDED
15
+
16
+ ```
17
+ If PLAN.md does not exist:
18
+ → Run /df:plan via Skill tool (skill: "df:plan", no args)
19
+ → Wait for plan to complete before continuing
20
+ If PLAN.md exists:
21
+ → Skip planning, proceed to step 2
22
+ ```
23
+
24
+ ### 2. RATCHET SNAPSHOT
25
+
26
+ Before starting the loop, snapshot pre-existing test files so the ratchet has a stable baseline:
27
+
28
+ ```bash
29
+ # Snapshot pre-existing test files (only these count for ratchet)
30
+ git ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' \
31
+ > .deepflow/auto-snapshot.txt
32
+
33
+ echo "Ratchet snapshot: $(wc -l < .deepflow/auto-snapshot.txt) pre-existing test files"
34
+ ```
35
+
36
+ **Only pre-existing test files are used for ratchet evaluation.** New test files created by agents during implementation do not influence pass/fail decisions. This prevents agents from gaming the ratchet by writing tests that pass trivially.
37
+
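To see which paths the snapshot pattern selects, the same alternation can be exercised in Python. This is a sketch under the assumption that Python's `re` treats this particular pattern the same way `grep -E` does; the sample paths are made up for illustration.

```python
import re

# Same alternation as the grep -E pattern above, applied to repo-relative paths.
TEST_FILE_RE = re.compile(
    r"\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/"
)


def snapshot(paths: list) -> list:
    """Keep only the paths that the ratchet snapshot pattern would select."""
    return [p for p in paths if TEST_FILE_RE.search(p)]


if __name__ == "__main__":
    sample = [
        "src/foo.test.ts",      # matched: .test.<ext> suffix
        "test_root.py",         # matched: test_ prefix at the repository root
        "pkg/utils_test.go",    # matched: _test.<ext> suffix
        "tests/unit/a.py",      # matched: top-level tests/ directory
        "src/__tests__/a.js",   # matched: __tests__/ anywhere in the path
        "pkg/test_utils.py",    # not matched: test_ prefix is anchored to the root
        "src/tests/a.py",       # not matched: tests/ is anchored to the root
        "src/main.py",          # not matched: not a test file
    ]
    for path in snapshot(sample):
        print(path)
```

Because `^test_` and `^tests/` are anchored to the start of the path, nested files such as `pkg/test_utils.py` or `src/tests/a.py` fall outside the baseline as the pattern is written, which is worth checking against a project's layout before relying on the ratchet.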
38
+ ### 3. START LOOP
39
+
40
+ Launch the autonomous cycle loop:
41
+
42
+ ```
43
+ /loop 1m /df:auto-cycle
44
+ ```
13
45
 
14
- Run the full autonomous cycle now. Auto-promote unprefixed specs to `doing-*`, then process all `doing-*` specs through every phase. Do not ask questions; act autonomously.
46
+ This starts `/df:auto-cycle` on a 1-minute recurring interval. Each invocation runs with fresh context: no coordination overhead, zero LLM tokens on loop management.
15
47
 
16
- Output progress as each phase completes. Generate `.deepflow/auto-report.md` at the end.
48
+ ## Rules
49
+
50
+ | Rule | Detail |
51
+ |------|--------|
52
+ | Plan once | Only runs `/df:plan` if PLAN.md is absent |
53
+ | Snapshot before loop | Ratchet baseline is set before any agents run |
54
+ | No lead agent | No custom orchestrator — `/loop` is a native Claude Code feature |
55
+ | Zero loop overhead | Loop coordination uses zero LLM tokens |
56
+ | Cycle logic lives in `/df:auto-cycle` | This command is setup only |
57
+
58
+ ## Example
59
+
60
+ ```
61
+ /df:auto
62
+
63
+ No PLAN.md found — running /df:plan...
64
+ ✓ Plan generated — 1 spec, 5 tasks.
65
+
66
+ Ratchet snapshot: 12 pre-existing test files
67
+
68
+ Starting loop: /loop 1m /df:auto-cycle
69
+ ```
70
+
71
+ ```
72
+ /df:auto
73
+
74
+ PLAN.md exists — skipping plan.
75
+
76
+ Ratchet snapshot: 12 pre-existing test files
77
+
78
+ Starting loop: /loop 1m /df:auto-cycle
79
+ ```