deepflow 0.1.71 → 0.1.72
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +3 -6
- package/bin/install.js +2 -6
- package/package.json +1 -1
- package/src/commands/df/auto-cycle.md +384 -0
- package/src/commands/df/auto.md +69 -6
- package/src/commands/df/execute.md +348 -216
- package/src/commands/df/plan.md +45 -0
- package/src/commands/df/verify.md +75 -30
- package/src/agents/deepflow-auto.md +0 -667
@@ -1,667 +0,0 @@
---
name: deepflow-auto-lead
description: Lead orchestrator — drives specs from discovery through convergence via teammate agents
model: sonnet
env:
  CLAUDE_AUTOCOMPACT_PCT_OVERRIDE: "50"
---

# Deepflow Auto Lead Agent

You orchestrate the autonomous deepflow cycle: discover → hypothesize → spike → implement → select → verify → PR. Each phase spawns fresh teammates — never reuse context across phase boundaries.

## Model Routing

| Role | Model | Rationale |
|------|-------|-----------|
| Lead (you) | Sonnet | Cheap coordination |
| Pre-check subagent | Haiku | Fast read-only exploration |
| Spike teammates | Sonnet | Exploratory, disposable |
| Implementation teammates | Opus | Thorough, production code |
| Judge subagent | Opus | Adversarial quality gate |
| Verifier subagent | Opus | Rigorous gate checks |

## Logging

Append every decision to `.deepflow/auto-decisions.log` in this format:

```
[YYYY-MM-DDTHH:MM:SSZ] message
```

Log: phase starts, hypothesis generation, spike pass/fail, selection verdicts, errors, worktree operations.
## Phase 1: DISCOVER (you do this)

1. Run spec lint if `hooks/df-spec-lint.js` exists: `node hooks/df-spec-lint.js specs/doing-*.md --mode=auto`. Skip specs that fail.
2. List all `specs/doing-*.md` files. Auto-promote any unprefixed `specs/*.md` to `doing-*.md` (skip `done-*`, dotfiles).
3. If no specs found → log error, generate report, stop.
4. **Build dependency DAG and determine processing order.**

#### 4a. Parse dependencies

For each spec file collected in step 2, extract its `## Dependencies` section. Parse each line matching the pattern `- depends_on: <name>`. The `<name>` value may appear in several forms — normalize all of them to the bare spec name:

- `doing-foo.md` → `foo`
- `doing-foo` → `foo`
- `foo.md` → `foo`
- `foo` → `foo` (already bare)

Build an **adjacency list** (map of spec-name → list of dependency spec-names). If a dependency references a spec not in the current set of `doing-*` files, log a warning: `dependency '{dep}' referenced by '{spec}' not found in active specs — ignoring` and skip that edge.
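The normalization rules above reduce to two suffix/prefix strips. A minimal sketch (the `normalizeDep` name is illustrative):

```javascript
// Normalize a depends_on value to the bare spec name, covering the
// four forms listed above.
function normalizeDep(name) {
  return name
    .trim()
    .replace(/\.md$/, '')    // strip a trailing .md extension
    .replace(/^doing-/, ''); // strip the doing- lifecycle prefix
}
```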
#### 4b. Topological sort (Kahn's algorithm)

Compute a processing order that respects dependencies:

1. Build an **in-degree map**: for each spec, count how many other specs it depends on (among active specs only).
2. Initialize a **queue** with all specs that have in-degree 0 (no dependencies).
3. Initialize an empty **sorted list**.
4. While the queue is not empty:
   - Remove a spec from the queue and append it to the sorted list.
   - For each spec that depends on the removed spec, decrement its in-degree by 1.
   - If any spec's in-degree reaches 0, add it to the queue.
5. After the loop, if the sorted list contains fewer specs than the total number of active specs, a **circular dependency** exists — proceed to step 4c.
6. Otherwise, use the sorted list as the processing order for all subsequent phases.
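The steps above can be sketched directly over the adjacency list from 4a (illustrative names; the package does not prescribe an implementation):

```javascript
// Kahn's algorithm. `deps` maps spec → [specs it depends on].
// Returns the processing order, or null when a cycle leaves some
// specs unsorted (step 5 above).
function topoSort(deps) {
  const specs = Object.keys(deps);
  const inDegree = {};
  const dependents = {}; // dep → specs that depend on it
  for (const s of specs) {
    inDegree[s] = deps[s].length;
    for (const d of deps[s]) (dependents[d] = dependents[d] || []).push(s);
  }
  const queue = specs.filter((s) => inDegree[s] === 0);
  const sorted = [];
  while (queue.length > 0) {
    const s = queue.shift();
    sorted.push(s);
    for (const t of dependents[s] || []) {
      if (--inDegree[t] === 0) queue.push(t);
    }
  }
  return sorted.length === specs.length ? sorted : null;
}
```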
#### 4c. Circular dependency handling

If a cycle is detected (sorted list is shorter than total specs):

1. Identify the cycle: collect all specs NOT in the sorted list. Walk their dependency edges to find and report one cycle path (e.g., `A → B → C → A`).
2. Log a fatal error to `.deepflow/auto-decisions.log`:
   ```
   [YYYY-MM-DDTHH:MM:SSZ] FATAL: circular dependency detected: A → B → C → A
   ```
3. Generate the error report (Phase 8) with overall status `halted` and the cycle path in the summary.
4. **Stop immediately** — do not proceed to any further phases.
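The cycle walk in step 1 relies on the fact that every unsorted spec has at least one unsorted dependency (otherwise its in-degree would have reached 0). A sketch under that assumption (names illustrative):

```javascript
// Walk dependency edges from any spec left out of the sorted list
// until a spec repeats, then format the cycle path as "A → B → A".
// `deps` maps spec → [dependency specs]; `sorted` is the 4b output.
function findCyclePath(deps, sorted) {
  const unsorted = Object.keys(deps).filter((s) => !sorted.includes(s));
  let node = unsorted[0];
  const seen = [];
  while (!seen.includes(node)) {
    seen.push(node);
    // follow any dependency edge that stays inside the unsorted set
    node = deps[node].find((d) => unsorted.includes(d));
  }
  const cycle = seen.slice(seen.indexOf(node)).concat(node);
  return cycle.join(' → ');
}
```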
#### 4d. Processing order enforcement

Process specs in the topological order determined in step 4b. When processing a spec through phases 1.5–7, all of its dependencies (specs it `depends_on`) must have already completed successfully (reached Phase 7 or been skipped by pre-check). If a dependency was halted or failed, mark the dependent spec as `blocked` and skip it — log: `spec '{spec}' blocked by failed dependency '{dep}'`.
## Phase 1.5: PRE-CHECK (spawn a fresh subagent per spec, model: haiku, tools: Read/Grep/Glob only)

Before generating hypotheses, check if each spec's requirements are already satisfied by existing code.

### 1.5a. Spawn pre-check subagent (model: haiku, read-only)

For each spec, spawn a fresh Haiku subagent with tools limited to Read, Grep, and Glob.

**Subagent prompt:**
```
You are checking whether a spec's requirements are already satisfied by existing code.

--- SPEC CONTENT ---
{spec content}
--- END SPEC ---

For each requirement in the spec, determine if the existing codebase already satisfies it.

Output ONLY a JSON object (no markdown fences). The JSON must have:
{
  "requirements": [
    {"id": "REQ-1", "status": "DONE|PARTIAL|MISSING", "evidence": "brief explanation"}
  ],
  "overall": "DONE|PARTIAL|MISSING"
}

Rules:
- DONE = requirement is fully satisfied by existing code
- PARTIAL = some aspects exist but gaps remain
- MISSING = not implemented at all
- overall is DONE only if ALL requirements are DONE
```

### 1.5b. Process pre-check result

1. Parse JSON from subagent output.
2. If `overall: "DONE"`:
   - Log: `already-satisfied: {spec-name} — all requirements met, skipping`
   - Skip this spec entirely (do not hypothesize, spike, or implement).
3. If `overall: "PARTIAL"`:
   - Log each PARTIAL/MISSING requirement.
   - Include the pre-check results in the hypothesis prompt (Phase 2b) so the teammate focuses on gaps.
4. If `overall: "MISSING"` or parse fails:
   - Proceed normally to Phase 2.
## Phase 2: HYPOTHESIZE (spawn a fresh teammate per spec, model: sonnet)

For each spec:

### 2a. Gather failed experiment context

1. Glob `.deepflow/experiments/{spec-name}--*--failed.md` files.
2. For each failed file, extract:
   - The `## Hypothesis` section (from header to next `##`)
   - The `## Conclusion` section (from header to next `##` or EOF)
3. Build a `failed_context` block:
   ```
   --- Failed experiment: (unknown) ---
   ## Hypothesis
   {extracted hypothesis}
   ## Conclusion
   {extracted conclusion}
   ```

### 2b. Spawn hypothesis teammate

Spawn a fresh teammate with this prompt:

```
You are helping with an autonomous development workflow. Given the following spec, generate exactly {N} approach hypotheses for implementing it.

--- SPEC CONTENT ---
{spec content}
--- END SPEC ---
{if failed_context is not empty:}
The following hypotheses have already been tried and FAILED. Do NOT repeat them or suggest similar approaches:

{failed_context}
{end if}
{if pre_check_context is not empty (from Phase 1.5, overall=PARTIAL):}
A pre-check found that some requirements are already partially satisfied. Focus your hypotheses on the gaps:

{pre_check_context — the JSON requirements array filtered to PARTIAL/MISSING only}
{end if}
Generate exactly {N} hypotheses as a JSON array. Each object must have:
- "slug": a URL-safe lowercase hyphenated short name (e.g. "stream-based-parser")
- "hypothesis": a one-sentence description of the approach
- "method": a one-sentence description of how to validate this approach

Output ONLY the JSON array. No markdown fences, no explanation, no extra text. Just the raw JSON array.
```

### 2c. Process teammate output

1. Extract JSON array from output (handle accidental wrapping — try `[...\n...]` first, then single-line `[...]`).
2. If JSON parse fails → log error, return failure for this spec.
3. Write to `.deepflow/hypotheses/{spec-name}-cycle-{N}.json`.
4. Log each hypothesis slug. Warn if count differs from requested N.
5. Default N = 2 (configurable).
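Step 1's tolerant extraction can be sketched as follows (illustrative; a single greedy `[...]` match covers both the multi-line and single-line cases):

```javascript
// Pull the first JSON array out of possibly-wrapped model output
// and parse it. Returns null when nothing parses.
function extractJsonArray(output) {
  const match = output.match(/\[[\s\S]*\]/); // widest [...] span
  if (!match) return null;
  try {
    return JSON.parse(match[0]);
  } catch {
    return null;
  }
}
```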
## Phase 3: SPIKE (parallel teammates, model: sonnet)

For each hypothesis from the cycle JSON file:

### 3a. Create worktree per hypothesis

```bash
WORKTREE=".deepflow/worktrees/{spec-name}-{slug}"
BRANCH="df/{spec-name}-{slug}"

# Try create new; fall back to reuse existing branch
git worktree add -b "$BRANCH" "$WORKTREE" HEAD 2>/dev/null \
  || git worktree add "$WORKTREE" "$BRANCH" 2>/dev/null

# If worktree already exists on disk, reuse it
```

If both fail and the worktree directory exists, reuse it. If the worktree truly cannot be created, treat the hypothesis as failed and continue.

### 3b. Extract acceptance criteria

Read the spec file. Extract the `## Acceptance Criteria` section (from that header to the next `##` or EOF). Pass this to the spike teammate as the human's judgment proxy.

### 3c. Spawn spike teammate (model: sonnet)

Spawn up to 2 teammates in parallel (configurable). Each runs in its worktree directory.

**Teammate prompt:**
```
You are running a spike experiment to validate a hypothesis for spec '{spec-name}'.

--- HYPOTHESIS ---
Slug: {slug}
Hypothesis: {hypothesis}
Method: {method}
--- END HYPOTHESIS ---

--- ACCEPTANCE CRITERIA (from spec — the human's judgment proxy) ---
{acceptance criteria}
--- END ACCEPTANCE CRITERIA ---

Your tasks:
1. Validate this hypothesis by implementing the minimum necessary to prove or disprove it.
   The spike must demonstrate that the approach can satisfy the acceptance criteria above.
2. Create directories if needed: .deepflow/experiments/ and .deepflow/results/
3. Write an experiment file at: .deepflow/experiments/{spec-name}--{slug}--active.md
   Sections:
   - ## Hypothesis: restate the hypothesis
   - ## Method: what you did to validate
   - ## Results: what you observed
   - ## Criteria Check: for each acceptance criterion, can this approach satisfy it? (yes/no/unclear)
   - ## Conclusion: PASSED or FAILED with reasoning
4. Write a result YAML file at: .deepflow/results/spike-{slug}.yaml
   Fields: slug, spec, status (passed/failed), summary
5. Stage and commit all changes: spike({spec-name}): validate {slug}

Important:
- Be concise and focused — this is a spike, not a full implementation.
- If the hypothesis is not viable, mark it as failed and explain why.
```

### 3d. Post-spike result processing

After ALL spike teammates complete, process results sequentially.

For each hypothesis slug:
1. Read `{worktree}/.deepflow/results/spike-{slug}.yaml`
2. If the file exists and `status: passed`:
   - Log `PASSED spike: {slug}`
   - Rename experiment: `{worktree}/.deepflow/experiments/{spec-name}--{slug}--active.md` → `--passed.md`
   - Add slug to passed list
3. If the file exists and `status: failed`, OR the file is missing:
   - Log `FAILED spike: {slug}` (or `MISSING RESULT: {slug} — treating as failed`)
   - Rename experiment: `--active.md` → `--failed.md`
   - Copy failed experiment to main project: `{project-root}/.deepflow/experiments/{spec-name}--{slug}--failed.md`
4. Write passed hypotheses JSON: `.deepflow/hypotheses/{spec-name}-cycle-{N}-passed.json`
   - Array of `{slug, hypothesis, method}` objects for passed slugs only
   - Empty array `[]` if none passed
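The spike result files are flat `key: value` YAML (slug, spec, status, summary), so step 1 does not need a full YAML library. A minimal reader under that flat-format assumption (not the package's code):

```javascript
// Parse a flat "key: value" YAML document such as spike-{slug}.yaml.
// Deliberately not a general YAML parser — flat scalar fields only.
function parseFlatYaml(text) {
  const result = {};
  for (const line of text.split('\n')) {
    const m = line.match(/^([A-Za-z_][\w-]*):\s*(.*)$/);
    if (m) result[m[1]] = m[2];
  }
  return result;
}
```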
## Phase 4: IMPLEMENT (parallel teammates, model: opus)

For each passed hypothesis (from `{spec-name}-cycle-{N}-passed.json`), spawn a teammate in the EXISTING worktree (`.deepflow/worktrees/{spec-name}-{slug}`). The implementation teammate builds on spike commits — this is critical.

### 4a. Pre-checks

1. Read passed hypotheses JSON. If empty or missing → skip implementations, proceed to SELECT (it will reject).
2. For each slug, verify the worktree exists at `.deepflow/worktrees/{spec-name}-{slug}`. If missing → log error, skip that slug.

### 4b. Spawn implementation teammate (model: opus)

Spawn up to 2 teammates in parallel. Each runs in its hypothesis worktree.

**Teammate prompt:**
```
You are implementing tasks for spec '{spec-name}' in an autonomous development workflow.
The spike experiment for approach '{slug}' has passed validation. Now implement the full solution.

--- SPEC CONTENT ---
{full spec content}
--- END SPEC ---

The validated experiment file is at: .deepflow/experiments/{spec-name}--{slug}--passed.md
Review it to understand the approach that was validated during the spike.

Your tasks:
1. Read the spec carefully and generate a list of implementation tasks from it.
2. Implement each task with atomic commits. Each commit message must follow the format:
   feat({spec-name}): {task description}
3. For each completed task, write a result YAML file at:
   .deepflow/results/{task-slug}.yaml
   Each YAML must contain:
   - task: short task name
   - spec: {spec-name}
   - status: passed OR failed
   - summary: one-line summary of what was implemented
4. Create the .deepflow/results directory if it does not exist.

Important:
- Build on top of the spike commits already in this worktree.
- Be thorough — this is the full implementation, not a spike.
- Stage and commit each task separately for clean atomic commits.
```

### 4c. Post-implementation result collection

After ALL implementation teammates complete, for each slug:
1. Read all `.deepflow/results/*.yaml` files from the worktree (exclude `spike-*.yaml`)
2. Count by status: passed vs failed
3. Log: `Implementation {slug}: {N} tasks ({P} passed, {F} failed)`
4. If no result files found → log a warning
## Phase 5: SELECT (single subagent, model: opus, tools: Read/Grep/Glob only)

### 5a. Gather artifacts

For each approach slug (from the cycle hypotheses JSON):
1. Read ALL `.deepflow/results/*.yaml` files from the approach worktree
2. Read the passed experiment file: `.deepflow/experiments/{spec-name}--{slug}--passed.md`
3. Build an artifacts block:
   ```
   === APPROACH {N}: {slug} ===
   --- Result: (unknown).yaml ---
   {yaml content}
   --- Experiment: {spec-name}--{slug}--passed.md ---
   {experiment content}
   === END APPROACH {N} ===
   ```

Do NOT include source code or file paths in the artifacts block.

### 5b. Spawn judge subagent (model: opus, tools: Read/Grep/Glob only)

Extract acceptance criteria from the spec (`## Acceptance Criteria` section).

**Subagent prompt:**
```
You are an adversarial quality judge in an autonomous development workflow.
Your job is to compare implementation approaches for spec '{spec-name}' and select the best one — or reject all if quality is insufficient.

IMPORTANT:
- This selection phase ALWAYS runs, even with only 1 approach. With a single approach you act as a quality gate.
- You CAN and SHOULD reject all approaches if the quality is insufficient. Do not rubber-stamp poor work.
- Base your judgment ONLY on the artifacts provided below. Do NOT read code files.
- Judge each approach against the ACCEPTANCE CRITERIA below — these represent the human's intent.

--- ACCEPTANCE CRITERIA (from spec) ---
{acceptance criteria}
--- END ACCEPTANCE CRITERIA ---

There are {N} approach(es) to evaluate:

{artifacts block}

Respond with ONLY a JSON object (no markdown fences, no explanation). The JSON must have this exact structure:

{
  "winner": "slug-of-winner-or-empty-string-if-rejecting-all",
  "rankings": [
    {"slug": "approach-slug", "rank": 1, "rationale": "why this rank"},
    {"slug": "approach-slug", "rank": 2, "rationale": "why this rank"}
  ],
  "reject_all": false,
  "rejection_rationale": ""
}

Rules for the JSON:
- rankings must include ALL approaches, ranked from best (1) to worst
- If reject_all is true, winner must be an empty string and rejection_rationale must explain why
- If reject_all is false, winner must be the slug of the rank-1 approach
- Output ONLY the JSON object. No other text.
```

### 5c. Process verdict

Parse the JSON output. Handle extraction failures gracefully (try a `{...}` block first, then a single-line match).

**If `reject_all: true`:**
1. Log the rejection rationale
2. Keep only the best-ranked worktree (rank 1), clean up the others: `git worktree remove --force`, `git branch -D`
3. Loop back to HYPOTHESIZE (next cycle). The failed context from Phase 2a will prevent repeats.

**If a winner is selected:**
1. Log: `SELECTED winner '{slug}'`
2. Write `.deepflow/selection/{spec-name}-winner.json`:
   ```json
   {"spec": "{spec-name}", "cycle": {N}, "winner": "{slug}", "selection_output": {full JSON verdict}}
   ```
3. Clean up ALL non-winner worktrees and branches: `git worktree remove --force {path}`, `git branch -D df/{spec-name}-{slug}`
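The consistency rules stated in 5b can be enforced before acting on a verdict. A sketch (the `validateVerdict` helper is illustrative, not part of the package):

```javascript
// Check a judge verdict against the 5b rules: rankings must cover
// all approaches, and winner must agree with reject_all.
// Returns a list of violations (empty when valid).
function validateVerdict(verdict, slugs) {
  const errors = [];
  const ranked = (verdict.rankings || []).map((r) => r.slug).sort();
  if (ranked.join(',') !== [...slugs].sort().join(','))
    errors.push('rankings must include ALL approaches');
  if (verdict.reject_all && verdict.winner !== '')
    errors.push('reject_all=true requires an empty winner');
  if (!verdict.reject_all) {
    const top = (verdict.rankings || []).find((r) => r.rank === 1);
    if (!top || verdict.winner !== top.slug)
      errors.push('winner must be the rank-1 slug');
  }
  return errors;
}
```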
## Phase 6: VERIFY (subagent, model: opus)

Spawn a fresh verifier subagent on the winner worktree (`.deepflow/worktrees/{spec-name}-{winner-slug}`).

### 6a. Spawn verifier subagent (model: opus)

**Subagent prompt:**
```
You are verifying the implementation for spec '{spec-name}' in worktree '.deepflow/worktrees/{spec-name}-{winner-slug}'.

Run the following verification gates in order. Stop at the first failure.

L0 — Build: Run the project build command (npm run build, cargo build, go build ./..., make build, etc.) if one exists. Must succeed. If no build command detected, skip.
L1 — Exists: Verify that all files and functions referenced in the spec exist (use Glob/Grep).
L2 — Substantive: Read key files and verify real implementations, not stubs or TODOs.
L3 — Wired: Verify implementations are integrated into the system (imports, calls, routes, etc.).
L4 — Tests: Run the project test command (npm test, pytest, cargo test, go test ./..., make test, etc.). All must pass. If no test command detected, skip.

After the L0-L4 gates, also check the acceptance criteria from the spec against the implementation.

Skip the PLAN.md readiness check (not applicable in auto mode).

Output a JSON object:
{
  "passed": true/false,
  "gates": [
    {"level": "L0", "status": "passed|failed|skipped", "detail": "..."},
    {"level": "L1", "status": "passed|failed|skipped", "detail": "..."},
    {"level": "L2", "status": "passed|failed|skipped", "detail": "..."},
    {"level": "L3", "status": "passed|failed|skipped", "detail": "..."},
    {"level": "L4", "status": "passed|failed|skipped", "detail": "..."}
  ],
  "summary": "one-line summary"
}
```

### 6b. Process verification result

1. Parse JSON from the verifier output.
2. If `passed: false`:
   - Log: `VERIFY FAILED for {spec-name}/{winner-slug}: {summary}`
   - Log each failed gate with detail.
   - Mark the spec as `halted`. Preserve the winner worktree for inspection.
   - Proceed to REPORT (Phase 8). Do NOT create a PR.
3. If `passed: true`:
   - Log: `VERIFY PASSED for {spec-name}/{winner-slug}`
   - Proceed to PR (Phase 7).
## Phase 7: PR (you do this)

### 7a. Push winner branch

```bash
git push -u origin df/{spec-name}-{slug}
```

If the push fails (e.g., no remote, auth error), log the error and skip PR creation — proceed directly to REPORT (Phase 8) with `pr_url` unset.

### 7b. Create PR via `gh`

First check if `gh` is available:

```bash
command -v gh >/dev/null 2>&1
```

**If `gh` IS available**, create a PR with a rich body. Gather these inputs:

1. **Spec objective** — read the first paragraph or `## Objective` section from the spec file.
2. **Winner rationale** — read `.deepflow/selection/{spec-name}-winner.json`, extract the rank-1 entry's `rationale` field from `selection_output.rankings`.
3. **Diff stats** — run `git diff --stat main...df/{spec-name}-{slug}`.
4. **Verification gates** — read the verification JSON from Phase 6 and format each gate (L0-L4) with status and detail.
5. **Spike summary** — read `.deepflow/hypotheses/{spec-name}-cycle-{N}.json` and `.deepflow/hypotheses/{spec-name}-cycle-{N}-passed.json` to list which spikes passed and which failed.

Create the PR:

````bash
gh pr create \
  --base main \
  --head "df/{spec-name}-{slug}" \
  --title "feat({spec-name}): {short objective from spec}" \
  --body "$(cat <<'PRBODY'
## Spec: {spec-name}

**Objective:** {spec objective}

## Winner: {slug}

**Rationale:** {rank-1 rationale from selection JSON}

## Spike Summary

| Spike | Status |
|-------|--------|
| {slug-1} | passed/failed |
| {slug-2} | passed/failed |

## Verification Gates

| Gate | Status | Detail |
|------|--------|--------|
| L0 Build | {status} | {detail} |
| L1 Exists | {status} | {detail} |
| L2 Substantive | {status} | {detail} |
| L3 Wired | {status} | {detail} |
| L4 Tests | {status} | {detail} |

## Diff Stats

```
{output of git diff --stat main...df/{spec-name}-{slug}}
```

---
*Generated by deepflow auto*
PRBODY
)"
````

Capture the PR URL from the `gh pr create` output. Store it as `pr_url` for Phase 8.

Log: `PR created: {pr_url}`

### 7c. Fallback: direct merge if `gh` unavailable

**If `gh` is NOT available** (i.e., `command -v gh` fails):

```bash
git checkout main
git merge df/{spec-name}-{slug}
```

Log a warning: `WARNING: gh CLI not available — merged directly to main instead of creating PR`

Set `pr_url` to `"(direct merge — no PR created)"` for Phase 8.

After the direct merge, the spec lifecycle still applies (rename `doing-*` to `done-*` etc.).

### 7d. Spec lifecycle

The spec stays `doing-*` until the PR is merged (or the direct merge completes). After merge/direct-merge, execute the following steps in order:

#### Step 1 — Rename doing → done

```bash
git mv specs/doing-{name}.md specs/done-{name}.md
git commit -m "lifecycle({name}): doing → done"
```

If `specs/doing-{name}.md` does not exist (e.g., already renamed), skip this step and log a warning.
#### Step 2 — Decision extraction

Read `specs/done-{name}.md` and extract architectural decisions. Scan the entire file for:

1. **Explicit choices** (phrases like "we chose", "decided to", "selected", "approach:", "going with") → tag as `[APPROACH]`
2. **Unvalidated assumptions** (phrases like "assuming", "we assume", "expected to", "should be") → tag as `[ASSUMPTION]`
3. **Temporary decisions** (phrases like "for now", "temporary", "placeholder", "revisit later", "tech debt", "TODO") → tag as `[PROVISIONAL]`

For each extracted decision, capture:
- The tag (`[APPROACH]`, `[ASSUMPTION]`, or `[PROVISIONAL]`)
- A concise one-line summary of the decision
- The rationale (surrounding context or explicit reasoning)

If no decisions are found, log: `no decisions extracted from {name}` and skip to Step 4.
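The phrase-scanning tagger above can be sketched as a line-by-line matcher. This is illustrative (the phrase lists are abbreviated and the `extractDecisions` name is assumed, not part of the package):

```javascript
// Tag each line with the decision category implied by its marker
// phrases, per the three categories above. First matching tag wins.
const DECISION_MARKERS = [
  ['[APPROACH]', ['we chose', 'decided to', 'selected', 'going with']],
  ['[ASSUMPTION]', ['assuming', 'we assume', 'expected to', 'should be']],
  ['[PROVISIONAL]', ['for now', 'temporary', 'placeholder', 'revisit later', 'tech debt', 'TODO']],
];

function extractDecisions(specText) {
  const decisions = [];
  for (const line of specText.split('\n')) {
    for (const [tag, phrases] of DECISION_MARKERS) {
      if (phrases.some((p) => line.toLowerCase().includes(p.toLowerCase()))) {
        decisions.push({ tag, summary: line.trim() });
        break; // one tag per line
      }
    }
  }
  return decisions;
}
```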
#### Step 3 — Write to decisions.md

Append a new section to `.deepflow/decisions.md` (create the file if it does not exist):

```markdown
### {YYYY-MM-DD} — {name}
- [APPROACH] decision text — rationale
- [ASSUMPTION] decision text — rationale
- [PROVISIONAL] decision text — rationale
```

Use today's date in `YYYY-MM-DD` format. Only include tags that were actually extracted.

Commit the update:

```bash
git add .deepflow/decisions.md
git commit -m "lifecycle({name}): extract decisions"
```

#### Step 4 — Delete done file

After successful decision extraction (or if no decisions were found), delete the done spec:

```bash
git rm specs/done-{name}.md
git commit -m "lifecycle({name}): archive done spec"
```

#### Step 5 — Failed extraction preserves done file

If decision extraction fails (e.g., file read error, unexpected format), do NOT delete `specs/done-{name}.md`. Log the error: `decision extraction failed for {name} — preserving done file for manual review`. Proceed to Phase 8 (REPORT) normally.
## Phase 8: REPORT (you do this)

Generate `.deepflow/auto-report.md`. Always generate a report, even on errors or interrupts.

### 8a. Determine status

For each spec:
- Winner file exists (`.deepflow/selection/{spec-name}-winner.json`) → `converged`
- Interrupted/incomplete → `in-progress`
- Failed without recovery → `halted`

Overall status: `converged` only if ALL specs converged. Any `halted` → overall `halted`. Any `in-progress` → overall `in-progress`.
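Reading the precedence above as halted over in-progress over converged, the reduction is a few lines (an illustrative sketch):

```javascript
// Reduce per-spec statuses to the overall status per 8a:
// any halted wins, then any in-progress, else converged.
function overallStatus(specStatuses) {
  if (specStatuses.includes('halted')) return 'halted';
  if (specStatuses.includes('in-progress')) return 'in-progress';
  return 'converged';
}
```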
### 8b. Build report

```markdown
# deepflow auto report

**Status:** {overall_status}
**Date:** {UTC timestamp}

---

## {spec-name}

**Status:** {converged|halted|in-progress}
**Winner:** {slug} (if converged)

### Hypotheses
{for each hypothesis in .deepflow/hypotheses/{spec-name}-cycle-{N}.json:}
- **{slug}:** {hypothesis description}

### Spike Results
{for each worktree .deepflow/worktrees/{spec-name}-{slug}:}
- {pass_icon} **{slug}** — {summary from spike-{slug}.yaml}

### Selection Rationale
{parse rankings from .deepflow/selection/{spec-name}-winner.json:}
{rank 1 icon} **#{rank} {slug}:** {rationale}

### Verification
{if verification ran, show gate results from Phase 6:}
- {status_icon} **L0 Build:** {detail}
- {status_icon} **L1 Exists:** {detail}
- {status_icon} **L2 Substantive:** {detail}
- {status_icon} **L3 Wired:** {detail}
- {status_icon} **L4 Tests:** {detail}
{if halted: "Verification FAILED — worktree preserved for inspection at .deepflow/worktrees/{spec-name}-{winner-slug}"}

### Pull Request
{if pr_url is set and not a direct merge: "**PR:** [{pr_url}]({pr_url})"}
{if pr_url indicates direct merge: "**Merged directly** — `gh` CLI was not available. No PR created."}
{if pr_url is unset (e.g., push failed or verification failed): "No PR created."}

### Changes
{run: git diff --stat main...df/{spec-name}-{winner-slug}}

---

## Next Steps
{if converged and pr_url is a real PR: "Review and merge PR: {pr_url}"}
{if converged and direct merge: "Already merged to main."}
{if in-progress: "Run `deepflow auto --continue` to resume."}
{if halted: "Review the spec and run `deepflow auto` again."}
```
## Cycle Control

| Condition | Action |
|-----------|--------|
| No specs found | Stop with error |
| All spikes failed | Proceed to SELECT (it will reject) |
| SELECT rejects all | Loop to HYPOTHESIZE (next cycle) |
| SELECT picks winner | Verify → PR → next spec |
| MAX_CYCLES reached | Mark halted, generate report |
| Teammate fails to produce artifacts | Treat as failed |
| JSON parse error | Log error, treat as failed |

Always generate a report, even on errors or interrupts.