deepflow 0.1.89 → 0.1.90

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "deepflow",
- "version": "0.1.89",
+ "version": "0.1.90",
 "description": "Doing reveals what thinking can't predict — spec-driven iterative development for Claude Code",
 "keywords": [
 "claude",
@@ -3,146 +3,4 @@ name: df:auto-cycle
 description: Execute one task from PLAN.md with ratchet health checks and state tracking for autonomous mode
 ---
 
- # /df:auto-cycle — Single Cycle of Auto Mode
-
- Execute one task from PLAN.md. Called by `/loop 1m /df:auto-cycle` — each invocation gets fresh context.
-
- **NEVER:** use EnterPlanMode, use ExitPlanMode
-
- ## Behavior
-
- ### 1. LOAD STATE
-
- Shell injection (use output directly):
- `` !`cat PLAN.md 2>/dev/null || echo 'NOT_FOUND'` `` — required, error if missing
- `` !`cat .deepflow/auto-memory.yaml 2>/dev/null || echo 'NOT_FOUND'` `` — optional cross-cycle state
-
- **auto-memory.yaml schema:** see `/df:execute`. Each section optional, missing keys = empty. Created on first write if absent.
-
- ### 2. PICK NEXT TASK
-
- **Optimize-active override:** Check `optimize_state.task_id` in auto-memory.yaml first. If present and task still `[ ]` in PLAN.md, resume it (skip normal scan). If task is `[x]`, clear `optimize_state.task_id` and fall through.
-
- **Normal scan:** First `[ ]` task in PLAN.md where all `Blocked by:` deps are `[x]`.
-
- **No `[ ]` tasks:** Skip to step 5 (completion check).
-
- **All remaining blocked:** Error with blocker details, suggest `/df:execute` for manual resolution.
-
- ### 3. EXECUTE
-
- Run via Skill tool: `skill: "df:execute", args: "{task_id}"`. Handles worktree, agent spawning, ratchet, commit.
-
- **Bootstrap handling:** If execute returns `"bootstrap: completed"` (zero pre-existing tests, baseline written):
- - Record as task `BOOTSTRAP`, status `passed`
- - Do NOT run a regular task in the same cycle
- - Next cycle picks up the first regular task
-
- ### 3.5. WRITE STATE
-
- After execute returns, update `.deepflow/auto-memory.yaml` (read-merge-write, preserve all keys):
-
- | Outcome | Write |
- |---------|-------|
- | Success (non-optimize) | `task_results[id]: {status: success, commit: {hash}, cycle: {N}}` |
- | Revert (non-optimize) | `task_results[id]: {status: reverted, reason: "{msg}", cycle: {N}}` + append to `revert_history` |
- | Optimize cycle | Merge updated `optimize_state` from execute (confirm `cycles_run`, `current_best`, `history`) |
-
- ### 3.6. CIRCUIT BREAKER
-
- **Failure = any L0-L5 verification failure** (build, files, coverage, tests, browser assertions). Does NOT count: L5 skip (no frontend), L5 pass-on-retry.
-
- **On revert (non-optimize):**
- 1. Increment `consecutive_reverts[task_id]` in auto-memory.yaml
- 2. Read `circuit_breaker_threshold` from `.deepflow/config.yaml` (default: 3)
- 3. If `consecutive_reverts[task_id] >= threshold`: halt loop, report "Circuit breaker tripped: T{n} failed {N} times. Reason: {msg}"
- 4. Else: continue to step 4
-
- **On success (non-optimize):** Reset `consecutive_reverts[task_id]` to 0.
-
- **Optimize stop conditions** (from execute terminal outcomes):
-
- | Outcome | Action |
- |---------|--------|
- | `"target reached: {value}"` | Confirm task [x], write optimize completion (3.7), report, continue |
- | `"max cycles reached, best: {value}"` | Confirm task [x], write optimize completion (3.7), report, continue |
- | `"circuit breaker: 3 consecutive reverts"` | Task stays [ ], write failure to experiments (3.7), preserve optimize_state, halt loop |
-
- ### 3.7. OPTIMIZE COMPLETION
-
- **On target reached or max cycles (task [x]):**
- 1. Write each `failed_hypotheses` entry to `.deepflow/experiments/{spec}--optimize-{task_id}--{slug}--failed.md`
- 2. Write summary to `.deepflow/experiments/{spec}--optimize-{task_id}--summary--{status}.md` with metric/target/direction/baseline/best/cycles/history table
- 3. Clear `optimize_state` from auto-memory.yaml
-
- **On circuit breaker halt:** Same experiment writes but with status `circuit_breaker`. Preserve `optimize_state` in auto-memory.yaml (add `halted: circuit_breaker` note).
-
- ### 4. UPDATE REPORT
-
- Write to `.deepflow/auto-report.md` — append each cycle, never overwrite. First cycle creates skeleton, subsequent cycles update in-place.
-
- **File sections:** Summary table, Cycle Log, Probe Results, Optimize Runs, Secondary Metric Warnings, Health Score, Reverted Tasks.
-
- #### Per-cycle update rules
-
- | Section | When | Action |
- |---------|------|--------|
- | Cycle Log | Every cycle | Append row: `cycle | task_id | status | commit/reverted | delta | metric_delta | reason | timestamp` |
- | Summary | Every cycle | Recalculate from Cycle Log: total cycles, committed, reverted, optimize cycles/best (if applicable) |
- | Last updated | Every cycle | Overwrite timestamp |
- | Probe Results | Probe/spike task | Append row from `probe_learnings` in auto-memory.yaml |
- | Optimize Runs | Optimize terminal event | Append row: task/metric/baseline/best/target/cycles/status |
- | Secondary Metric Warnings | >5% regression | Append row (severity: WARNING, advisory only — no auto-revert) |
- | Health Score | Every cycle | Replace with latest: tests passed, build status, ratchet green/red, optimize status |
- | Reverted Tasks | On revert | Append row from `revert_history` |
-
- **Status values:** `passed`, `failed` (reverted), `skipped` (already done), `optimize` (inner cycle).
-
- **Delta format:** `tests: {before}→{after}, build: ok/fail`. Include coverage if available. On revert, show regression.
-
- **Optimize status in Health Score:** `in_progress` | `reached` | `max_cycles` | `circuit_breaker` | `—` (omit row if no optimize tasks in PLAN.md).
-
- ### 5. CHECK COMPLETION
-
- Count `[x]` and `[ ]` tasks in PLAN.md. Per-spec verify+merge happens in `/df:execute` step 8 automatically.
-
- - **No `[ ]` remaining:** "All specs verified and merged. Workflow complete." → exit
- - **Tasks remain:** "Cycle complete. {N} tasks remaining." → exit (next /loop invocation picks up)
-
- ## Rules
-
- | Rule | Detail |
- |------|--------|
- | One task per cycle | Fresh context each invocation — no multi-task batching |
- | Bootstrap = sole task | No regular task runs in a bootstrap cycle |
- | Idempotent | Safe to call with no work — reports "0 tasks remaining" |
- | Never modifies PLAN.md | `/df:execute` handles PLAN.md updates |
- | Auto-memory after every cycle | `task_results`, `revert_history`, `consecutive_reverts` always written |
- | Circuit breaker halts loop | Default 3 consecutive reverts (configurable: `circuit_breaker_threshold` in config.yaml) |
- | One optimize at a time | Defers other optimize tasks until active one terminates |
- | Optimize resumes across contexts | `optimize_state.task_id` overrides normal scan |
- | Optimize CB preserves state | On halt: task stays [ ], optimize_state kept for diagnosis |
- | Secondary metric regression advisory | >5% = WARNING in report, never auto-revert |
- | Optimize completion writes experiments | Failed hypotheses + summary to `.deepflow/experiments/` |
-
- ## Example
-
- ### Normal Cycle
- ```
- /df:auto-cycle
- Loading PLAN.md... 3 tasks, 1 done, 2 pending
- Next: T2 (T1 satisfied)
- Running: /df:execute T2 → ✓ ratchet passed (abc1234)
- Updated auto-report.md: cycles=2, committed=2
- Cycle complete. 1 tasks remaining.
- ```
-
- ### Circuit Breaker Tripped
- ```
- /df:auto-cycle
- Loading PLAN.md... 3 tasks, 1 done, 2 pending
- Next: T3
- Running: /df:execute T3 → ✗ ratchet failed — "2 tests regressed"
- Circuit breaker: consecutive_reverts[T3] = 3 (threshold: 3)
- Loop halted. Resolve T3 manually, then resume.
- ```
+ Use the Skill tool to invoke the `auto-cycle` skill, passing through any arguments.
@@ -42,4 +42,4 @@ Each invocation gets fresh context — zero LLM tokens on loop management.
 | Plan once | Only runs `/df:plan` if PLAN.md absent |
 | Snapshot before loop | Ratchet baseline set before any agents run |
 | No lead agent | `/loop` is native Claude Code — no custom orchestrator |
- | Cycle logic in `/df:auto-cycle` | This command is setup only |
+ | Cycle logic in `src/skills/auto-cycle/SKILL.md` | This command is setup only; `/df:auto-cycle` is a shim that delegates to the skill |
@@ -185,7 +185,7 @@ Trigger: ≥2 [SPIKE] tasks with same blocker or identical hypothesis.
 4. Per notification: ratchet (§5.5). Record: ratchet_passed, regressions, coverage_delta, files_changed, commit.
 5. **Winner selection** (no LLM judge): disqualify regressions. Standard: fewer regressions > coverage > fewer files > first complete. Optimize: best metric delta > fewer regressions > fewer files. No passes → reset pending for debugger.
 6. Preserve all worktrees. Losers: branch + `-failed`. Record in checkpoint.json.
- 7. Log all outcomes to `.deepflow/auto-memory.yaml` under `spike_insights`+`probe_learnings` (schema in auto-cycle.md). Both winners and losers.
+ 7. Log all outcomes to `.deepflow/auto-memory.yaml` under `spike_insights`+`probe_learnings` (schema in src/skills/auto-cycle/SKILL.md). Both winners and losers.
 8. Cherry-pick winner into shared worktree. Winner → `[x] [PROBE_WINNER]`, losers → `[~] [PROBE_FAILED]`.
 
 #### 5.7.1. PROBE DIVERSITY (Optimize Probes)
@@ -361,23 +361,31 @@ Before merge, spawn an independent Opus QA agent that sees ONLY the spec and exp
 
 3. On notification:
 a. Run ratchet check (§5.5) — all integration tests must pass.
- b. **Tests pass** → commit stands. Proceed to step 8.2 (merge).
- c. **Tests fail** → **merge is blocked**. Do NOT retry. Report:
- `"✗ Final integration tests failed for {spec} — merge blocked, requires human review"`
- Leave worktree intact. Set all spec tasks back to `TaskUpdate(status: "pending")`.
- Write failure details to `.deepflow/results/final-test-{spec}.yaml`:
+ b. **Tests pass** → commit stands. Proceed to step 8.2 (full L0-L5 verify + merge).
+ c. **Tests fail** → **merge is blocked**. Do NOT retry. Run diagnostic verify:
+ ```
+ skill: "df:verify", args: "--diagnostic doing-{name}"
+ ```
+ Capture the L0-L4 results from verify output (pass/fail/warn per level). Write to `.deepflow/results/final-test-{spec}.yaml`:
 ```yaml
 spec: {spec}
 status: blocked
 reason: "Final integration tests failed"
 output: |
 {truncated test output — last 30 lines}
+ diagnostics:
+ L0: {pass|fail}
+ L1: {pass|fail}
+ L2: {pass|warn|fail}
+ L4: {pass|fail}
 ```
- STOP. Do not proceed to merge.
+ Leave worktree intact. Set all spec tasks back to `TaskUpdate(status: "pending")`.
+ Report: `"✗ Final tests failed for {spec} — diagnostic verify: L0 {✓|✗} | L1 {✓|✗} | L2 {✓|⚠|✗} | L4 {✓|✗} — merge blocked"`
+ STOP. Do not proceed to merge. Diagnostic verify is informational only — no fix agents, no retries.
 
 **8.2. Merge and cleanup:**
 1. `skill: "df:verify", args: "doing-{name}"` — runs L0-L4 gates, merges, cleans worktree, renames doing→done, extracts decisions. Fail (fix tasks added) → stop; `--continue` picks them up.
- 2. Remove spec's ENTIRE section from PLAN.md. Recalculate Summary table.
+ 2. PLAN.md section cleanup handled by verify (step 6).
 
 ---
 
@@ -428,5 +436,5 @@ Reverted task: `TaskUpdate(status: "pending")`, dependents stay blocked. Repeate
 | Plateau → probes | 3 cycles <1% triggers probes |
 | Circuit breaker = 3 reverts | Halts, needs human |
 | Wave test after ratchet | Opus writes tests; 3 attempts then revert |
- | Final test before merge | Opus black-box integration tests; failure blocks merge, no retry |
+ | Final test before merge | Opus black-box integration tests; pass → full L0-L5 verify + merge; failure → diagnostic L0-L4 verify, results in final-test-{spec}.yaml, merge blocked |
 | Probe diversity | ≥1 contrarian + ≥1 naive |
@@ -12,14 +12,36 @@ context: fork
 
 ## Usage
 ```
- /df:verify # Verify doing-* specs with all tasks completed
- /df:verify doing-upload # Verify specific spec
- /df:verify --re-verify # Re-verify done-* specs (already merged)
+ /df:verify # Verify doing-* specs with all tasks completed
+ /df:verify doing-upload # Verify specific spec
+ /df:verify --re-verify # Re-verify done-* specs (already merged)
+ /df:verify --diagnostic doing-upload # L0-L4 only; write results to diagnostics yaml; no merge/fix/rename
 ```
 
 ## Spec File States
 `specs/feature.md` → unplanned (skip) | `doing-*.md` → default target | `done-*.md` → `--re-verify` only
 
+ ## Diagnostic Mode (`--diagnostic`)
+
+ When invoked with `--diagnostic`:
+
+ - Run **L0-L4 only** (skip L5 entirely, even if frontend detected).
+ - Write results to `.deepflow/results/final-test-{spec}.yaml` under a `diagnostics:` key:
+ ```yaml
+ diagnostics:
+ spec: doing-upload
+ timestamp: 2024-01-15T10:30:00Z
+ L0: pass # or fail
+ L1: pass # or fail
+ L2: pass # or warn (no tool)
+ L4: fail # or pass
+ summary: "L0 ✓ | L1 ✓ | L2 ⚠ | L3 — | L4 ✗"
+ ```
+ - Prefix all report output with `[DIAGNOSTIC]`.
+ - **Skip entirely:** Post-Verification merge (§4), fix task creation, spec rename, decision extraction, PLAN.md cleanup (step 6).
+ - Does **not** count as a revert for the circuit breaker.
+ - Does **not** modify `auto-snapshot.txt`.
+
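The `summary` field above can be derived mechanically from the per-level results. A minimal sketch (hypothetical helper, not the package's actual code; levels absent from the results, such as a skipped L3, render as `—`):

```javascript
// Hypothetical helper (not part of deepflow): render the diagnostics
// summary line from per-level results. Levels missing from the map
// (e.g. L3 when skipped) render as "—".
function diagnosticSummary(results) {
  const mark = { pass: "✓", warn: "⚠", fail: "✗" };
  return ["L0", "L1", "L2", "L3", "L4"]
    .map((level) => `${level} ${mark[results[level]] ?? "—"}`)
    .join(" | ");
}

console.log(diagnosticSummary({ L0: "pass", L1: "pass", L2: "warn", L4: "fail" }));
// → "L0 ✓ | L1 ✓ | L2 ⚠ | L3 — | L4 ✗"
```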
 ## Behavior
 
 ### 1. LOAD CONTEXT
@@ -71,7 +93,9 @@ No tool → pass with warning. When available: stash changes → run coverage on
 
 Algorithm: detect frontend → resolve dev command/port → start server → poll readiness → read assertions from PLAN.md → auto-install Playwright Chromium → evaluate via `locator.ariaSnapshot()` → screenshot → retry once on failure → report.
 
- **Step 1: Detect frontend.** Config `quality.browser_verify` overrides: `false` → always skip (`L5 — (no frontend)`), `true` → always run, absent → auto-detect from package.json (both deps and devDeps):
+ **Step 1: Detect frontend.** Config `quality.browser_verify` overrides: `false` → always skip (`L5 — (no frontend)`), `true` → always run, absent → auto-detect using BOTH conditions:
+
+ 1. Frontend framework found in package.json (deps or devDeps):
 
 | Package(s) | Framework |
 |------------|-----------|
@@ -82,7 +106,12 @@ Algorithm: detect frontend → resolve dev command/port → start server → poll
 | `@sveltejs/kit` | SvelteKit |
 | `svelte`, `@sveltejs/*` | Svelte |
 
- No frontend detected and no config override `L5 (no frontend)`, skip remaining L5 steps.
+ 2. A `browser_assertions:` block exists in PLAN.md scoped to the current spec.
+
+ **Auto-detect outcomes (no config override):**
+ - No frontend detected → `L5 — (no frontend)`, skip remaining L5 steps.
+ - Frontend detected but no `browser_assertions:` block in PLAN.md for current spec → `L5 — (no browser_assertions in PLAN.md)`, skip remaining L5 steps.
+ - Both conditions met → proceed to Steps 2–6.
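The two-condition auto-detect could be sketched as follows (illustrative only; function and constant names are assumed, not the package's implementation):

```javascript
// Illustrative sketch of Step 1's auto-detect (names assumed, not deepflow's code).
// L5 runs only when BOTH hold: a frontend framework in package.json AND a
// browser_assertions: block in the spec's PLAN.md section.
const FRONTEND_PACKAGES = ["react", "next", "vue", "nuxt", "svelte", "@sveltejs/kit"];

function l5Decision(pkg, planSection) {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  const hasFrontend = Object.keys(deps).some(
    (d) => FRONTEND_PACKAGES.includes(d) || d.startsWith("@sveltejs/")
  );
  if (!hasFrontend) return "L5 — (no frontend)";
  if (!/^\s*browser_assertions:/m.test(planSection)) {
    return "L5 — (no browser_assertions in PLAN.md)";
  }
  return "run"; // both conditions met → proceed to Steps 2–6
}
```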
 
 **Step 2: Dev server lifecycle.**
 1. **Resolve dev command:** Config `quality.dev_command` wins → fallback to `npm run dev` if `scripts.dev` exists → none found → skip L5 with warning.
@@ -113,7 +142,7 @@ No frontend detected and no config override → `L5 — (no frontend)`, skip rem
 | Fail | Fail — same selectors | L5 ✗ — genuine failure |
 | Fail | Fail — different selectors | L5 ✗ (flaky) |
 
- All L5 outcomes: `✓` pass | `⚠` passed on retry | `✗` both failed (same) | `✗ (flaky)` both failed (different) | `— (no frontend)` | `— (no assertions)` | `✗ (install failed)`
+ All L5 outcomes: `✓` pass | `⚠` passed on retry | `✗` both failed (same) | `✗ (flaky)` both failed (different) | `— (no frontend)` | `— (no browser_assertions in PLAN.md)` | `— (no assertions)` | `✗ (install failed)`
 
 **Fix task on L5 failure:** Append to PLAN.md under spec section with next T{n} ID. Include: failing assertions (selector + detail), first 40 lines of `locator('body').ariaSnapshot()` DOM excerpt, screenshot path, flakiness note if assertion sets differed.
 
@@ -158,12 +187,13 @@ Objective: ... | Approach: ... | Why it worked: ... | Files: ...
 
 ## Post-Verification: Worktree Merge & Cleanup
 
- **Only runs when ALL gates pass.**
+ **Only runs when ALL gates pass AND `--diagnostic` was NOT used.**
 
 1. **Discover worktree:** Read `.deepflow/checkpoint.json` for `worktree_branch`/`worktree_path`. Fallback: infer from `doing-*` spec name + `git worktree list --porcelain`. No worktree → "nothing to merge", exit.
 2. **Merge:** `git checkout main && git merge ${BRANCH} --no-ff -m "feat({spec}): merge verified changes"`. On conflict → keep worktree, output "Resolve manually, run /df:verify --merge-only", exit.
 3. **Cleanup:** `git worktree remove --force ${PATH} && git branch -d ${BRANCH} && rm -f .deepflow/checkpoint.json`
 4. **Rename spec:** `mv specs/doing-${NAME}.md specs/done-${NAME}.md`
 5. **Extract decisions:** Read done spec, extract `[APPROACH]`/`[ASSUMPTION]`/`[PROVISIONAL]` decisions, append to `.deepflow/decisions.md` as `### {date} — {spec}\n- [TAG] decision — rationale`. Delete done spec after successful write; preserve on failure.
+ 6. **Clean PLAN.md:** Find the `### {spec-name}` section (match on name stem, strip `doing-`/`done-` prefix). Delete from header through the line before the next `### ` header (or EOF). Recalculate Summary table (recount `### ` headers for spec count, `- [ ]`/`- [x]` for task counts). If no spec sections remain, delete PLAN.md entirely. Skip silently if PLAN.md missing or section already gone.
 
- Output: `✓ Merged → main | ✓ Cleaned worktree | ✓ Spec complete | Workflow complete! Ready: /df:spec <name>`
+ Output: `✓ Merged → main | ✓ Cleaned worktree | ✓ Spec complete | ✓ Cleaned PLAN.md | Workflow complete! Ready: /df:spec <name>`
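Step 6's section removal amounts to a header-delimited splice. A sketch (hypothetical helper, assuming spec sections use bare `### {stem}` headers; Summary recounting elided):

```javascript
// Hypothetical helper (not deepflow's code): delete a spec's section
// from PLAN.md text, matching on the name stem (doing-/done- stripped).
// Skips silently if the section is already gone.
function cleanPlan(planText, specName) {
  const stem = specName.replace(/^(doing|done)-/, "");
  const lines = planText.split("\n");
  const start = lines.findIndex((l) => l.trim() === `### ${stem}`);
  if (start === -1) return planText; // section already gone → no-op
  // Delete from the header through the line before the next "### " header (or EOF).
  let end = lines.findIndex((l, i) => i > start && l.startsWith("### "));
  if (end === -1) end = lines.length;
  lines.splice(start, end - start);
  return lines.join("\n");
}
```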
@@ -0,0 +1,148 @@
+ ---
+ name: auto-cycle
+ description: Execute one task from PLAN.md with ratchet health checks and state tracking for autonomous mode
+ ---
+
+ # auto-cycle — Single Cycle of Auto Mode
+
+ Execute one task from PLAN.md. Called by `/loop 1m /df:auto-cycle` — each invocation gets fresh context.
+
+ **NEVER:** use EnterPlanMode, use ExitPlanMode
+
+ ## Behavior
+
+ ### 1. LOAD STATE
+
+ Shell injection (use output directly):
+ - `` !`cat PLAN.md 2>/dev/null || echo 'NOT_FOUND'` `` — required, error if missing
+ - `` !`cat .deepflow/auto-memory.yaml 2>/dev/null || echo 'NOT_FOUND'` `` — optional cross-cycle state
+
+ **auto-memory.yaml schema:** see `/df:execute`. Each section optional, missing keys = empty. Created on first write if absent.
+
+ ### 2. PICK NEXT TASK
+
+ **Optimize-active override:** Check `optimize_state.task_id` in auto-memory.yaml first. If present and task still `[ ]` in PLAN.md, resume it (skip normal scan). If task is `[x]`, clear `optimize_state.task_id` and fall through.
+
+ **Normal scan:** First `[ ]` task in PLAN.md where all `Blocked by:` deps are `[x]`.
+
+ **No `[ ]` tasks:** Skip to step 5 (completion check).
+
+ **All remaining blocked:** Error with blocker details, suggest `/df:execute` for manual resolution.
+
+ ### 3. EXECUTE
+
+ Run via Skill tool: `skill: "df:execute", args: "{task_id}"`. Handles worktree, agent spawning, ratchet, commit.
+
+ **Bootstrap handling:** If execute returns `"bootstrap: completed"` (zero pre-existing tests, baseline written):
+ - Record as task `BOOTSTRAP`, status `passed`
+ - Do NOT run a regular task in the same cycle
+ - Next cycle picks up the first regular task
+
+ ### 3.5. WRITE STATE
+
+ After execute returns, update `.deepflow/auto-memory.yaml` (read-merge-write, preserve all keys):
+
+ | Outcome | Write |
+ |---------|-------|
+ | Success (non-optimize) | `task_results[id]: {status: success, commit: {hash}, cycle: {N}}` |
+ | Revert (non-optimize) | `task_results[id]: {status: reverted, reason: "{msg}", cycle: {N}}` + append to `revert_history` |
+ | Optimize cycle | Merge updated `optimize_state` from execute (confirm `cycles_run`, `current_best`, `history`) |
+
+ ### 3.6. CIRCUIT BREAKER
+
+ **Failure = any L0-L5 verification failure** (build, files, coverage, tests, browser assertions). Does NOT count: L5 skip (no frontend), L5 pass-on-retry.
+
+ **On revert (non-optimize):**
+ 1. Increment `consecutive_reverts[task_id]` in auto-memory.yaml
+ 2. Read `circuit_breaker_threshold` from `.deepflow/config.yaml` (default: 3)
+ 3. If `consecutive_reverts[task_id] >= threshold`: halt loop, report "Circuit breaker tripped: T{n} failed {N} times. Reason: {msg}"
+ 4. Else: continue to step 4
+
+ **On success (non-optimize):** Reset `consecutive_reverts[task_id]` to 0.
+
+ **Optimize stop conditions** (from execute terminal outcomes):
+
+ | Outcome | Action |
+ |---------|--------|
+ | `"target reached: {value}"` | Confirm task [x], write optimize completion (3.7), report, continue |
+ | `"max cycles reached, best: {value}"` | Confirm task [x], write optimize completion (3.7), report, continue |
+ | `"circuit breaker: 3 consecutive reverts"` | Task stays [ ], write failure to experiments (3.7), preserve optimize_state, halt loop |
+
+ ### 3.7. OPTIMIZE COMPLETION
+
+ **On target reached or max cycles (task [x]):**
+ 1. Write each `failed_hypotheses` entry to `.deepflow/experiments/{spec}--optimize-{task_id}--{slug}--failed.md`
+ 2. Write summary to `.deepflow/experiments/{spec}--optimize-{task_id}--summary--{status}.md` with metric/target/direction/baseline/best/cycles/history table
+ 3. Clear `optimize_state` from auto-memory.yaml
+
+ **On circuit breaker halt:** Same experiment writes but with status `circuit_breaker`. Preserve `optimize_state` in auto-memory.yaml (add `halted: circuit_breaker` note).
+
+ ### 4. UPDATE REPORT
+
+ Write to `.deepflow/auto-report.md` — append each cycle, never overwrite. First cycle creates skeleton, subsequent cycles update in-place.
+
+ **File sections:** Summary table, Cycle Log, Probe Results, Optimize Runs, Secondary Metric Warnings, Health Score, Reverted Tasks.
+
+ #### Per-cycle update rules
+
+ | Section | When | Action |
+ |---------|------|--------|
+ | Cycle Log | Every cycle | Append row: `cycle | task_id | status | commit/reverted | delta | metric_delta | reason | timestamp` |
+ | Summary | Every cycle | Recalculate from Cycle Log: total cycles, committed, reverted, optimize cycles/best (if applicable) |
+ | Last updated | Every cycle | Overwrite timestamp |
+ | Probe Results | Probe/spike task | Append row from `probe_learnings` in auto-memory.yaml |
+ | Optimize Runs | Optimize terminal event | Append row: task/metric/baseline/best/target/cycles/status |
+ | Secondary Metric Warnings | >5% regression | Append row (severity: WARNING, advisory only — no auto-revert) |
+ | Health Score | Every cycle | Replace with latest: tests passed, build status, ratchet green/red, optimize status |
+ | Reverted Tasks | On revert | Append row from `revert_history` |
+
+ **Status values:** `passed`, `failed` (reverted), `skipped` (already done), `optimize` (inner cycle).
+
+ **Delta format:** `tests: {before}→{after}, build: ok/fail`. Include coverage if available. On revert, show regression.
+
+ **Optimize status in Health Score:** `in_progress` | `reached` | `max_cycles` | `circuit_breaker` | `—` (omit row if no optimize tasks in PLAN.md).
+
+ ### 5. CHECK COMPLETION
+
+ Count `[x]` and `[ ]` tasks in PLAN.md. Per-spec verify+merge happens in `/df:execute` step 8 automatically.
+
+ - **No `[ ]` remaining:** "All specs verified and merged. Workflow complete." → exit
+ - **Tasks remain:** "Cycle complete. {N} tasks remaining." → exit (next /loop invocation picks up)
+
+ ## Rules
+
+ | Rule | Detail |
+ |------|--------|
+ | One task per cycle | Fresh context each invocation — no multi-task batching |
+ | Bootstrap = sole task | No regular task runs in a bootstrap cycle |
+ | Idempotent | Safe to call with no work — reports "0 tasks remaining" |
+ | Never modifies PLAN.md | `/df:execute` handles PLAN.md updates |
+ | Auto-memory after every cycle | `task_results`, `revert_history`, `consecutive_reverts` always written |
+ | Circuit breaker halts loop | Default 3 consecutive reverts (configurable: `circuit_breaker_threshold` in config.yaml) |
+ | One optimize at a time | Defers other optimize tasks until active one terminates |
+ | Optimize resumes across contexts | `optimize_state.task_id` overrides normal scan |
+ | Optimize CB preserves state | On halt: task stays [ ], optimize_state kept for diagnosis |
+ | Secondary metric regression advisory | >5% = WARNING in report, never auto-revert |
+ | Optimize completion writes experiments | Failed hypotheses + summary to `.deepflow/experiments/` |
+
+ ## Example
+
+ ### Normal Cycle
+ ```
+ /df:auto-cycle
+ Loading PLAN.md... 3 tasks, 1 done, 2 pending
+ Next: T2 (T1 satisfied)
+ Running: /df:execute T2 → ✓ ratchet passed (abc1234)
+ Updated auto-report.md: cycles=2, committed=2
+ Cycle complete. 1 tasks remaining.
+ ```
+
+ ### Circuit Breaker Tripped
+ ```
+ /df:auto-cycle
+ Loading PLAN.md... 3 tasks, 1 done, 2 pending
+ Next: T3
+ Running: /df:execute T3 → ✗ ratchet failed — "2 tests regressed"
+ Circuit breaker: consecutive_reverts[T3] = 3 (threshold: 3)
+ Loop halted. Resolve T3 manually, then resume.
+ ```
@@ -82,9 +82,8 @@ quality:
 # Retry flaky tests once before failing (default: true)
 test_retry_on_fail: true
 
- # Enable L5 browser verification after tests pass (default: false)
- # When true, deepflow will start the dev server and run visual checks
- browser_verify: false
+ # Three-state: true (force L5), false (skip L5), absent/commented (auto-detect from package.json + browser_assertions)
+ # browser_verify:
 
 # Override the dev server start command for browser verification
 # If empty, deepflow will attempt to auto-detect (e.g., "npm run dev", "yarn dev")
@@ -1,67 +0,0 @@
- #!/usr/bin/env node
- /**
- * deepflow consolidation checker
- * Checks if decisions.md needs consolidation, outputs suggestion if overdue
- */
-
- const fs = require('fs');
- const path = require('path');
-
- const DAYS_THRESHOLD = 7;
- const LINES_THRESHOLD = 20;
- const DEEPFLOW_DIR = path.join(process.cwd(), '.deepflow');
- const DECISIONS_FILE = path.join(DEEPFLOW_DIR, 'decisions.md');
- const LAST_CONSOLIDATED_FILE = path.join(DEEPFLOW_DIR, 'last-consolidated.json');
-
- function checkConsolidation() {
- try {
- // Check if decisions.md exists
- if (!fs.existsSync(DECISIONS_FILE)) {
- process.exit(0);
- }
-
- // Check if decisions.md has more than LINES_THRESHOLD lines
- const decisionsContent = fs.readFileSync(DECISIONS_FILE, 'utf8');
- const lineCount = decisionsContent.split('\n').length;
- if (lineCount <= LINES_THRESHOLD) {
- process.exit(0);
- }
-
- // Get last consolidated timestamp
- let lastConsolidated;
- if (fs.existsSync(LAST_CONSOLIDATED_FILE)) {
- try {
- const data = JSON.parse(fs.readFileSync(LAST_CONSOLIDATED_FILE, 'utf8'));
- if (data.last_consolidated) {
- lastConsolidated = new Date(data.last_consolidated);
- }
- } catch (e) {
- // Fall through to use mtime
- }
- }
-
- // Fallback: use mtime of decisions.md
- if (!lastConsolidated || isNaN(lastConsolidated.getTime())) {
- const stat = fs.statSync(DECISIONS_FILE);
- lastConsolidated = stat.mtime;
- }
-
- // Calculate days since last consolidation
- const now = new Date();
- const diffMs = now - lastConsolidated;
- const diffDays = Math.floor(diffMs / (1000 * 60 * 60 * 24));
-
- if (diffDays >= DAYS_THRESHOLD) {
- process.stderr.write(
- `\u{1F4A1} decisions.md hasn't been consolidated in ${diffDays} days. Run /df:consolidate to clean up.\n`
- );
- }
-
- } catch (e) {
- // Fail silently
- }
-
- process.exit(0);
- }
-
- checkConsolidation();
@@ -1,42 +0,0 @@
- ---
- name: df:consolidate
- description: Remove duplicates and superseded entries from decisions file, promote stale provisionals
- ---
-
- # /df:consolidate — Consolidate Decisions
-
- Remove duplicates, superseded entries, and promote stale provisionals. Keep decisions.md dense and useful.
-
- **NEVER:** use EnterPlanMode, ExitPlanMode
-
- ## Behavior
-
- ### 1. LOAD
- Read `.deepflow/decisions.md` via `` !`cat .deepflow/decisions.md 2>/dev/null || echo 'NOT_FOUND'` ``. If missing/empty, report and exit.
-
- ### 2. ANALYZE (model-driven, not regex)
- - Identify duplicates (same meaning, different wording)
- - Identify superseded entries (later contradicts earlier)
- - Identify stale `[PROVISIONAL]` entries (>30 days old, no resolution)
-
- ### 3. CONSOLIDATE
- - Remove duplicates (keep more precise wording)
- - Remove superseded entries (later decision wins)
- - Promote stale `[PROVISIONAL]` → `[DEBT]`
- - Preserve `[APPROACH]` unless superseded, `[ASSUMPTION]` unless invalidated
- - Target: 200-500 lines if currently longer
- - When in doubt, keep both entries (conservative)
-
- ### 4. WRITE
- - Rewrite `.deepflow/decisions.md` with consolidated content
- - Write `{ "last_consolidated": "{ISO-8601}" }` to `.deepflow/last-consolidated.json`
-
- ### 5. REPORT
- `✓ Consolidated: {before} → {after} lines, {n} removed, {n} promoted to [DEBT]`
-
- ## Rules
-
- - Conservative: when in doubt, keep both entries
- - Never add new decisions — only remove, merge, or re-tag
- - `[DEBT]` is only produced by consolidation, never manually assigned
- - Preserve chronological ordering within sections
@@ -1,73 +0,0 @@
- ---
- name: df:note
- description: Capture decisions that emerged during free conversations outside of deepflow commands
- ---
-
- # /df:note — Capture Decisions from Free Conversations
-
- ## Orchestrator Role
-
- Scan conversation for candidate decisions, present for user confirmation, persist to `.deepflow/decisions.md`.
-
- **NEVER:** Spawn agents, use Task tool, use Glob/Grep on source code, run git, use TaskOutput, EnterPlanMode, ExitPlanMode
-
- **ONLY:** Read `.deepflow/decisions.md`, present candidates via `AskUserQuestion`, append confirmed decisions
-
- ## Behavior
-
- ### 1. EXTRACT CANDIDATES
-
- Scan prior messages for resolved choices, adopted approaches, or stated assumptions. Look for:
- - **Approaches chosen**: "we'll use X instead of Y"
- - **Provisional choices**: "for now we'll use X"
- - **Stated assumptions**: "assuming X is true"
- - **Constraints accepted**: "X is out of scope"
- - **Naming/structural choices**: "we'll call it X", "X goes in the Y layer"
-
- Extract **at most 4 candidates**. For each, determine:
-
- | Field | Value |
- |-------|-------|
- | Tag | `[APPROACH]` (deliberate choice), `[PROVISIONAL]` (revisit later), or `[ASSUMPTION]` (unvalidated) |
- | Decision | One concise line describing the choice |
- | Rationale | One sentence explaining why |
-
- If <2 clear candidates found, say so and exit.
-
- ### 2. CHECK FOR CONTRADICTIONS
-
- Read `.deepflow/decisions.md` if it exists. If a candidate contradicts a prior entry: keep prior entry unchanged, amend candidate rationale to `was "X", now "Y" because Z`.
-
- ### 3. PRESENT VIA AskUserQuestion
-
- Single multi-select call. Each option: `label` = tag + decision text, `description` = rationale.
-
- ### 4. APPEND CONFIRMED DECISIONS
-
- For each selected option:
- 1. Create `.deepflow/decisions.md` with `# Decisions` header if absent
- 2. Append a dated section: `### YYYY-MM-DD — note`
- 3. Group all confirmed decisions under one section: `- [TAG] Decision text — rationale`
- 4. Never modify or delete prior entries
-
- ### 5. CONFIRM
-
- Report: `Saved N decision(s) to .deepflow/decisions.md` or `No decisions saved.`
-
- ## Decision Tags
-
- | Tag | Meaning | Source |
- |-----|---------|--------|
- | `[APPROACH]` | Firm decision | /df:note, auto-extraction |
- | `[PROVISIONAL]` | Revisit later | /df:note, auto-extraction |
- | `[ASSUMPTION]` | Unverified | /df:note, auto-extraction |
- | `[DEBT]` | Needs revisiting | /df:consolidate only, never manually assigned |
-
- ## Rules
-
- - Max 4 candidates per invocation (AskUserQuestion tool limit)
- - multiSelect: true — user confirms any subset
- - Never invent decisions — only extract what was discussed and resolved
- - Never modify prior entries in `.deepflow/decisions.md`
- - Source is always `note`; date is today (YYYY-MM-DD)
- - One AskUserQuestion call — all candidates in a single call