deepflow 0.1.107 → 0.1.109
- package/bin/install.js +25 -7
- package/bin/install.test.js +113 -0
- package/bin/plan-consolidator.js +19 -1
- package/bin/plan-consolidator.test.js +150 -0
- package/bin/ratchet.js +11 -6
- package/bin/ratchet.test.js +172 -0
- package/bin/worktree-deps.js +127 -0
- package/hooks/ac-coverage.js +213 -0
- package/hooks/df-explore-protocol.js +227 -28
- package/hooks/df-explore-protocol.test.js +460 -81
- package/hooks/df-spec-lint.js +13 -2
- package/hooks/df-spec-lint.test.js +133 -0
- package/package.json +4 -1
- package/src/commands/df/execute.md +112 -2
- package/src/commands/df/plan.md +244 -16
- package/src/commands/df/verify.md +46 -8
- package/templates/config-template.yaml +1 -0
- package/templates/explore-protocol.md.bak +69 -0
- package/templates/plan-template.md +11 -0
- package/templates/spec-template.md +15 -0
package/src/commands/df/plan.md
CHANGED
@@ -7,7 +7,8 @@ description: Compare specs against codebase and past experiments, generate prior
 
 Compare specs against codebase and past experiments. Generate prioritized tasks.
 
-**NEVER:**
+**NEVER:** Read implementation source files, edit code, use TaskOutput, use EnterPlanMode, use ExitPlanMode
+**ONLY:** Read specs/*.md, .deepflow/config.yaml, .deepflow/experiments/, PLAN.md, .deepflow/*.md state files, spawn agents, run health checks, update PLAN.md
 
 ## Usage
 ```
@@ -75,26 +76,108 @@ Glob `.deepflow/experiments/{topic}--*`. File naming: `{topic}--{hypothesis}--{s
 
 Implementation tasks BLOCKED until spike validates.
 
-### 3.
+### 3. EXPLORE & IMPACT (PARALLEL AGENTS)
 
-
+Spawn three parallel `Task(subagent_type="default", model="sonnet")` agents simultaneously. Collect all outputs before proceeding.
 
-
+#### Agent A — Code Style & Conventions
 
-
+```
+## Objective
+Identify code style, patterns, and integration points relevant to the spec under analysis.
+
+## Acceptance Criteria
+- Return a code style summary covering: naming conventions, error handling patterns, API structure, module boundaries
+- List integration points (files/exports) that the spec's target files interact with
+- Flag any implicit patterns not captured in the spec
+
+## Spec and target files
+{spec_file_path}
+{files_list_from_spec}
+
+## Output contract (return EXACTLY this structure — no deviations)
+
+### Code Style Summary
+- Naming: {convention observed}
+- Error handling: {pattern observed}
+- API structure: {pattern observed}
+- Module boundaries: {pattern observed}
+
+### Integration Points
+- {file}: {how it integrates} [callee|caller|peer]
+
+### Implicit Patterns
+- {pattern}: {description}
+```
+
+#### Agent B — Blast Radius (LSP-first)
+
+```
+## Objective
+Perform LSP-first impact analysis for each file in the spec's Files list. Produce a blast-radius map with caller counts and impact reasons.
+
+## Acceptance Criteria
+- PRIMARY: Run LSP `findReferences`/`incomingCalls` on every export being changed in scope
+- FALLBACK: If LSP is unavailable for a file, log exactly `LSP unavailable for {file}: {reason}` then use grep
+- Annotate each impacted file with WHY it is affected
+- Classify duplicate logic files as [active] (consolidate) or [dead] (DELETE candidate)
+- Trace data flow via LSP `outgoingCalls` for consumer mapping
+- Skip impact analysis entirely for spike tasks
+
+## Spec and target files
+{spec_file_path}
+{files_list_from_spec}
+
+## Output contract (return EXACTLY this structure — no deviations)
+
+### Blast Radius per File
+- {file}: {caller_count} callers — {why_impacted}
+  - Callers: {file1}, {file2}
+  - Duplicates: {file} [active|dead]
+  - Consumers (outgoing): {file1}, {file2}
+
+### Files Outside Original Scope
+- {file}: (impact — verify/update) — {reason}
+
+### LSP Fallback Log
+- {file}: LSP unavailable — {reason} (grep used)
+```
+
+#### Agent C — Dead Code & TODOs
+
+```
+## Objective
+Audit the codebase for incomplete work and dead code related to the spec's scope.
 
-
-
-
-
+## Acceptance Criteria
+- Use the `code-completeness` skill to surface TODOs/FIXMEs/HACKs, stubs, and skipped tests
+- Flag any dead code (unreferenced exports, unused modules) within spec scope
+- Inventory all stubs that must be implemented before tasks can be marked done
 
-
+## Spec and target files
+{spec_file_path}
+{files_list_from_spec}
+
+## Output contract (return EXACTLY this structure — no deviations)
+
+### TODO / FIXME / HACK Inventory
+- {file}:{line}: {tag} — {description}
 
-###
+### Stubs & Incomplete Implementations
+- {file}:{symbol}: {reason incomplete}
 
-
+### Dead Code Flags
+- {file}:{symbol}: [dead] — {evidence}
 
-
+### Skipped Tests
+- {file}:{test_name}: [skipped] — {reason if known}
+```
+
+**Merged output contract** (consumed by §4.6 and §5):
+- Code style summary (from Agent A)
+- Blast radius per file with caller counts (from Agent B)
+- Dead code flags (from Agent C)
+- TODO/stub inventory (from Agent C)
 
 ### 4.6. CROSS-TASK FILE CONFLICT DETECTION
 
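Note on the hunk above: the new §3 asks for three agents to be spawned at once, with all outputs collected before planning continues. A minimal Python sketch of that fan-out and join pattern follows. `run_agent` is a hypothetical stand-in for the real `Task(...)` call, and the agent names are only labels borrowed from the output contracts; none of this is taken from the package's code.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for spawning a Task agent; a real orchestrator
# would call its agent runtime here instead of returning canned text.
def run_agent(name: str, spec_path: str) -> str:
    return f"### {name} output for {spec_path}"

AGENTS = ["Code Style Summary", "Blast Radius per File", "Dead Code Flags"]

def explore(spec_path: str) -> dict[str, str]:
    # Submit all three agents at once, then block until every one has returned.
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(run_agent, name, spec_path) for name in AGENTS}
        return {name: fut.result() for name, fut in futures.items()}

merged = explore("specs/feature.md")
```

Collecting results inside the `with` block means nothing downstream sees a partial set of summaries.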
@@ -203,6 +286,96 @@ Continue processing remaining specs regardless of individual failures. Only succ
 
 **Flow after fan-out:** The consolidator (§5B) reads mini-plans from `.deepflow/plans/` for consolidation (global renumbering, cross-spec conflict detection, prioritization). §5 handles both the single-spec monolithic path and the multi-spec consolidation path.
 
+### 4.8. INTEGRATION TASK DETECTION (MULTI-SPEC)
+
+**When:** >1 plannable spec found in §1. Skip for single-spec plans.
+
+**Purpose:** Specs implemented in isolation often break at integration boundaries — mismatched API contracts, conflicting migrations, incompatible types. This step detects shared interfaces and auto-generates integration tasks to catch these gaps before they cascade into uncontrolled fix spirals.
+
+#### 4.8.1. Detect Shared Interfaces
+
+After §4.7.3 collects mini-plans, spawn a single `Task(subagent_type="default", model="sonnet")` to scan all plannable specs for interface overlap:
+
+```
+You are an integration analyst. Detect shared interfaces across specs AND the existing codebase.
+
+## Spec files (being planned now)
+{list all plannable spec file paths}
+
+## Completed spec files (already implemented)
+{list all specs/done-*.md file paths, or "(none)" if empty}
+
+## Instructions
+
+1. Read each plannable spec file
+2. Read each done-* spec file (for their declared Interfaces/Produces sections)
+3. **Ground-truth check on done-* specs:** For each interface a done-* spec claims to Produce, verify the ACTUAL implementation in the codebase matches the spec declaration:
+   - API routes: grep for the route handler, read the response struct/type to confirm the actual shape
+   - DB tables: read the latest migration files to confirm actual column names and types
+   - Shared types: read the type definition file to confirm actual fields
+   - If the code DIFFERS from the spec declaration, record the CODE's version as the real contract (the spec may be stale after fix cycles)
+4. Extract interfaces from plannable specs (in priority order):
+   a. Explicit `## Interfaces` section (Produces/Consumes declarations)
+   b. `## Dependencies` section (depends_on references)
+   c. Implicit: API routes mentioned in Requirements/ACs, DB tables/migrations in Technical Notes, shared types/packages in Files
+5. Build an interface map: for each interface, list who produces it, who consumes it, and whether the ground-truth matches the spec declaration
+
+## OUTPUT FORMAT — MANDATORY
+
+### Interface Map
+
+- `{interface}` [{type: api|db|type|package}]
+  - Produces: {spec} — declared: `{shape from spec}` | actual: `{shape from code}`
+  - Consumes: {spec2}, {spec3}
+
+### Stale Contracts (spec ≠ code)
+
+- `{interface}`: {done-spec} declares `{spec_shape}` but code has `{actual_shape}` — {what changed and why it matters}
+
+### Contract Risks
+
+- {risk}: {spec_a} produces `{interface}` but {spec_b} consumes it with different assumptions — {detail}
+
+### Migration Conflicts
+
+- {migration_a} ({spec_a}) and {migration_b} ({spec_b}): {conflict description}
+
+If no shared interfaces found, return:
+### Interface Map
+(none detected — specs are independent)
+```
+
+#### 4.8.2. Generate Integration Tasks
+
+**Skip if:** Interface Map returns "(none detected — specs are independent)".
+
+For each group of specs sharing interfaces, generate ONE integration task appended AFTER all spec tasks in the consolidated plan. Integration tasks are always the last wave.
+
+**Integration task format:**
+```markdown
+- [ ] **T{N}** [INTEGRATION]: Verify {spec_a} ↔ {spec_b} contracts
+  - Files: {files at integration boundaries — API handlers, adapters, shared types, migrations}
+  - Integration ACs:
+    - End-to-end flow: {producer} → {consumer} works with real data
+    - Migration idempotency: all migrations run 001→N twice without error
+    - Contract match: {producer API response shape} matches {consumer expected shape}
+    - Type compatibility: shared types compile across all consuming packages
+  - Model: opus
+  - Effort: high
+  - Blocked by: {all implementation task IDs from both specs}
+```
+
+**Rules:**
+- ONE integration task per interface cluster (group of specs connected by shared interfaces)
+- Integration tasks are ALWAYS blocked by ALL implementation tasks of the connected specs
+- Integration ACs are CONCRETE — derived from the actual interfaces detected, not generic
+- **Stale Contracts from §4.8.1 are HIGH PRIORITY ACs** — if a done-spec declares shape X but code has shape Y, the integration task MUST include an AC verifying that the new spec's consumer uses shape Y (the real one), not shape X (the stale one). Include both shapes in the AC for clarity.
+- Contract Risks from §4.8.1 become specific ACs (e.g., "verify endpoint returns byte-exact JSON" if a risk about serialization was detected)
+- Migration Conflicts from §4.8.1 become idempotency ACs
+- Integration tasks use `opus/high` — they require understanding multiple spec contexts
+
+**Pass integration analysis output to §5B consolidator** (append to Opus prompt as `## Integration Analysis` section).
+
 ### 5. COMPARE & PRIORITIZE
 
 **Two paths** — determined by spec count from §1/§4.7:
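Note on the hunk above: the rule "ONE integration task per interface cluster" amounts to finding connected components in a graph whose nodes are specs and whose edges are shared interfaces. The sketch below shows one way to compute those clusters in Python. The `interface_map` shape (interface name mapped to the set of specs that produce or consume it) and the spec names are illustrative assumptions, not the package's data model.

```python
from collections import defaultdict

def interface_clusters(interface_map: dict[str, set[str]]) -> list[set[str]]:
    """Group specs into clusters connected by at least one shared interface."""
    # Build an undirected graph: specs touching the same interface are neighbours.
    neighbours = defaultdict(set)
    for specs in interface_map.values():
        for spec in specs:
            neighbours[spec] |= specs - {spec}

    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk collects one connected component.
        component, stack = set(), [start]
        while stack:
            spec = stack.pop()
            if spec in component:
                continue
            component.add(spec)
            stack.extend(neighbours[spec] - component)
        seen |= component
        # A spec that shares nothing needs no integration task.
        if len(component) > 1:
            clusters.append(component)
    return clusters

# Hypothetical interface map for three coupled specs and one independent spec.
example = {
    "POST /api/v1/auth/login": {"auth", "operator"},
    "table: operators": {"auth", "backend"},
    "type: Theme": {"ui"},
}
clusters = interface_clusters(example)
```

Here `auth`, `operator`, and `backend` land in one cluster even though `operator` and `backend` share no interface directly, so they would get a single integration task.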
@@ -211,9 +384,47 @@ Continue processing remaining specs regardless of individual failures. Only succ
 
 **When:** Exactly 1 plannable spec (§4.7 was skipped).
 
-Spawn `Task(subagent_type="reasoner", model="opus")
+Spawn `Task(subagent_type="reasoner", model="opus")` passing ONLY:
+- `spec_path` — path to the spec file (e.g., `specs/feature.md`)
+- `agent_summaries[]` — structured output blocks returned by §3 Agents A, B, C (their output contract sections verbatim — no raw source code, no inlined file contents)
+- `experiment_results[]` — paths and conclusion excerpts from `.deepflow/experiments/` matches found in §2 (paths only, no full file content)
+
+**NEVER pass to the reasoner:** raw source code, inlined file contents, or any implementation file text. The reasoner works from paths and summaries only.
+
+The reasoner prompt:
+
+```
+You are the plan reasoner. Analyze this spec and produce a prioritized task plan.
+
+## Spec file path
+{spec_path}
+
+Read the spec using the Read tool on the path above. Do NOT read any implementation files.
+
+## Agent summaries (from §3 parallel agents)
+
+### Code Style & Conventions (Agent A)
+{agent_a_summary — verbatim output contract block}
+
+### Blast Radius (Agent B)
+{agent_b_summary — verbatim output contract block}
+
+### Dead Code & TODOs (Agent C)
+{agent_c_summary — verbatim output contract block}
+
+## Experiment results (paths + conclusions)
+{for each experiment: path and Conclusion section excerpt only}
+
+## Your job
+
+Map each requirement to DONE/PARTIAL/MISSING/CONFLICT. Check REQ-AC alignment. Flag spec gaps.
+Scan ACs for metric patterns `{metric} {operator} {number}[unit]` — flag matches for §6.5 Optimize tasks, flag ambiguous thresholds ("fast", "small") as spec gaps.
+Apply the §5.5 routing matrix to classify model + effort per task.
 
 Priority: Dependencies → Impact → Risk
+```
+
+Then apply §5.5 routing matrix. Continue to §6.
 
 ##### Metric AC Detection
 
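Note on the hunk above: the reasoner prompt asks for ACs to be scanned for the pattern `{metric} {operator} {number}[unit]` and for vague thresholds to be flagged. The prompt does not pin down a concrete grammar, so the regular expressions below are one possible reading: the operator set, the unit shape, and the list of ambiguous words (taken from the two examples in the prompt) are assumptions.

```python
import re

# Assumed concrete grammar for `{metric} {operator} {number}[unit]`;
# the operator set and unit shape are illustrative guesses.
METRIC_AC = re.compile(
    r"(?P<metric>[A-Za-z_][\w ]*?)\s*(?P<op><=|>=|==|<|>)\s*"
    r"(?P<number>\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z%]+)?"
)
# Only the two words quoted in the prompt; a real list would be longer.
AMBIGUOUS = re.compile(r"\b(fast|small)\b", re.IGNORECASE)

def classify_ac(ac: str) -> str:
    if METRIC_AC.search(ac):
        return "metric"      # candidate for an Optimize task
    if AMBIGUOUS.search(ac):
        return "spec-gap"    # threshold is not measurable as written
    return "plain"
```

For instance, `classify_ac("p95 latency < 200ms")` is treated as a metric AC, while `classify_ac("Response should be fast")` is reported as a spec gap.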
@@ -266,7 +477,11 @@ You are the plan prioritizer. The mechanical consolidation (global T-numbering,
 
 {for each plannable spec: spec filename and its Requirements + Acceptance Criteria sections}
 
-##
+## Integration Analysis (from §4.8)
+
+{paste integration analyst output here — Interface Map, Contract Risks, Migration Conflicts}
+
+## Your job — FOUR things only
 
 ### 1. Cross-Spec Prioritization
 Review the task ordering across specs. If a different spec ordering would reduce blocked tasks or improve parallelism, suggest reordering. Otherwise confirm the current ordering is optimal.
@@ -277,7 +492,14 @@ If reordering is needed, output the recommended spec order. The orchestrator wil
 For each spec, map requirements to DONE/PARTIAL/MISSING/CONFLICT. Flag spec gaps.
 Scan ACs for metric patterns `{metric} {operator} {number}[unit]` — flag matches for §6.5 Optimize tasks, flag ambiguous thresholds ("fast", "small") as spec gaps.
 
-### 3.
+### 3. Integration Task Validation
+Review the Integration Analysis. For each Contract Risk and Migration Conflict:
+- Confirm or refine the generated integration task ACs
+- Add missing ACs if you detect interface assumptions not caught by the analyst (e.g., serialization format, column type mismatches, auth flow dependencies)
+- Remove false positives (interfaces that look shared but are actually independent)
+- If no integration tasks were generated but you detect cross-spec coupling, CREATE integration tasks following the §4.8.2 format
+
+### 4. Model + Effort Classification
 Apply routing matrix to each task:
 
 | Task type | Model | Effort |
@@ -307,6 +529,7 @@ Defaults: sonnet / medium.
 |--------|-------|
 | Specs analyzed | {N} |
 | Tasks created | {N} |
+| Integration tasks | {N} |
 | Ready (no blockers) | {N} |
 | Blocked | {N} |
 
@@ -318,6 +541,10 @@ Defaults: sonnet / medium.
 
 {Insert the consolidated tasks from plan-consolidator verbatim, adding ` — model/effort` to each task line per the routing matrix. Do NOT alter T-ids, descriptions, Blocked by, or conflict annotations.}
 
+### integration
+
+{Insert integration tasks from §4.8.2, validated/refined by step 3 above. Each task follows the standard format with [INTEGRATION] marker.}
+
 Example transformation:
 Input: `- [ ] **T3**: Create pkg/engine/go.mod | Blocked by: T8`
 Output: `- [ ] **T3**: Create pkg/engine/go.mod — haiku/low | Blocked by: T8`
@@ -452,5 +679,6 @@ If any L0–L1 spec: `ℹ L0–L1 specs generate spikes only. Deepen with /df:sp
 - **Learn from failures** — Extract next hypothesis, never repeat approach
 - **Plan only** — Do NOT implement (except quick validation prototypes)
 - **One task = one logical unit** — Atomic, committable
+- **Context budget** — orchestrator reads ONLY specs, config, experiments, PLAN.md, .deepflow/ state; never implementation files
 - Prefer existing utilities over new code; flag spec gaps
 - Always use `Task` tool with explicit `subagent_type` and `model`
package/src/commands/df/verify.md
CHANGED
@@ -25,7 +25,7 @@ context: fork
 
 When invoked with `--diagnostic`:
 
-- Run **L0-L4 only** (skip L5 entirely, even if frontend detected).
+- Run **L0-L4.5 only** (skip L5 entirely, even if frontend detected).
 - Write results to `.deepflow/results/final-test-{spec}.yaml` under a `diagnostics:` key:
 ```yaml
 diagnostics:
@@ -35,7 +35,8 @@ When invoked with `--diagnostic`:
 L1: pass # or fail
 L2: pass # or warn (no tool)
 L4: fail # or pass
-
+L4.5: pass # or fail or skip (no deps)
+summary: "L0 ✓ | L1 ✓ | L2 ⚠ | L3 — | L4 ✗ | L4.5 ✓"
 ```
 - Prefix all report output with `[DIAGNOSTIC]`.
 - **Skip entirely:** Post-Verification merge (§4), fix task creation, spec rename, decision extraction, PLAN.md cleanup (step 6).
@@ -85,10 +86,44 @@ Nothing found → `⚠ No build/test commands detected. L0/L4 skipped. Set quali
 
 No tool → pass with warning. When available: stash changes → run coverage on baseline → stash pop → run coverage on current → compare. Drop → FAIL. Same/improved → pass.
 
-**L3:
+**L3: AC coverage verification** — Verify that agent-reported acceptance criteria coverage matches the spec's acceptance criteria section. Parse spec file for `## Acceptance Criteria` section, extract all ACs. For each AC, verify that agent execution explicitly claimed coverage (via agent output or PLAN.md task completion notes). Missing or uncovered ACs → FAIL with list of uncovered ACs. All ACs claimed → pass.
 
 **L4: Tests** — Run AFTER L0 passes. Run even if L1-L2 had issues. Exit 0 → pass. Non-zero → FAIL with last 50 lines + fix task. If `quality.test_retry_on_fail: true`: re-run once; second pass → warn (flaky); second fail → genuine failure.
 
+**L4.5: Cross-Spec Integration** (if integration tasks exist)
+
+**Trigger:** Current spec's PLAN.md section contains `[INTEGRATION]` tasks, OR spec has `depends_on` referencing `done-*` specs.
+
+**Check:** Load dependent specs (`specs/done-*.md` referenced in `depends_on` or connected via integration tasks). For each:
+1. Re-run L0 (build) — already covered by standard L0, skip
+2. Re-run L4 (tests) — already covered by standard L4, skip
+3. **Contract verification (code-first, not spec-first):**
+   - For each `Produces` interface in dependent specs, verify against the ACTUAL CODE, not the spec declaration:
+     - API routes: grep for the handler, read the response struct/type → this is the real contract
+     - DB tables: read the latest migration files → actual column names and types
+     - Shared types: read the type definition → actual fields
+   - If the spec declaration differs from the code, the CODE is the source of truth (specs may be stale after fix cycles)
+   - Then verify that the CURRENT spec's consumers match the code's actual shape
+4. **Stale spec detection** — if a done-* spec's `## Interfaces` section doesn't match the code, emit advisory warning:
+```
+⚠ Stale interface: done-auth-spec declares POST /login → { access_token, refresh_token }
+but code returns { token, refresh }. Spec should be updated.
+```
+5. **Migration idempotency** — if migrations exist: run `{build_command}` twice (the build already runs migrations in most Go/Node projects). If a dedicated migration command exists in config (`quality.migration_command`), run it twice and verify exit 0 both times.
+
+**Outcome:** Pass if all contracts verified against code. Fail with specific mismatches:
+```
+✗ L4.5: Contract mismatch
+- done-auth code returns POST /api/v1/auth/login → { token: string }
+but operator SPA sends { api_key } in body (expected { token })
+- done-backend code stores rounds.result_json as TEXT
+but current spec reads it with JSONB operators
+⚠ L4.5: Stale spec (advisory, not blocking)
+- done-auth-spec declares { access_token } but code returns { token }
+```
+
+Fix task on L4.5 failure: prescriptive — names the exact contract from CODE (not spec), the producer, the consumer, and which side should change (prefer changing consumer to match producer's actual implementation).
+
 **L5: Browser Verification** (if frontend detected)
 
 Algorithm: detect frontend → resolve dev command/port → start server → poll readiness → read assertions from PLAN.md → auto-install Playwright Chromium → evaluate via `locator.ariaSnapshot()` → screenshot → retry once on failure → report.
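Note on the hunk above: the new L3 check parses the spec's `## Acceptance Criteria` section and fails when any AC was not explicitly claimed. A minimal Python sketch of that comparison follows. The spec layout (checkbox bullets with `AC-n` labels) and matching claims by exact text are assumptions made for illustration; the package's `hooks/ac-coverage.js` may work differently.

```python
import re

def extract_acs(spec_text: str) -> list[str]:
    """Return bullet items found under the `## Acceptance Criteria` heading."""
    acs, in_section = [], False
    for line in spec_text.splitlines():
        if line.startswith("## "):
            # Entering or leaving the section on every level-2 heading.
            in_section = line.strip() == "## Acceptance Criteria"
            continue
        match = re.match(r"\s*-\s+(?:\[[ x]\]\s+)?(.+)", line)
        if in_section and match:
            acs.append(match.group(1).strip())
    return acs

def uncovered_acs(spec_text: str, claimed: set[str]) -> list[str]:
    # L3 fails when this list is non-empty.
    return [ac for ac in extract_acs(spec_text) if ac not in claimed]

# Hypothetical spec content used only to exercise the functions.
SPEC = """## Requirements
- REQ-1: upload files

## Acceptance Criteria
- [ ] AC-1: rejects files over 10MB
- [ ] AC-2: stores file metadata

## Out of Scope
- video upload
"""
missing = uncovered_acs(SPEC, {"AC-1: rejects files over 10MB"})
```

With only AC-1 claimed, `missing` holds AC-2, which is the list an L3 failure would report.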
@@ -148,24 +183,26 @@ All L5 outcomes: `✓` pass | `⚠` passed on retry | `✗` both failed (same) |
 
 ### 3. GENERATE REPORT
 
-**Success:** `doing-upload.md: L0 ✓ | L1 ✓ (5/5 files) | L2 ⚠ (no coverage tool) | L3 — (subsumed) | L4 ✓ (12 tests) | L5 ✓ | 0 quality issues`
+**Success:** `doing-upload.md: L0 ✓ | L1 ✓ (5/5 files) | L2 ⚠ (no coverage tool) | L3 — (subsumed) | L4 ✓ (12 tests) | L4.5 ✓ (3 contracts) | L5 ✓ | 0 quality issues`
 
 **Failure:**
 ```
-doing-upload.md: L0 ✓ | L1 ✗ (3/5 files) | L2 ⚠ | L3 — | L4 ✗ (3 failed) | L5 ✗ (2 assertions failed)
+doing-upload.md: L0 ✓ | L1 ✗ (3/5 files) | L2 ⚠ | L3 — | L4 ✗ (3 failed) | L4.5 ✗ (1 mismatch) | L5 ✗ (2 assertions failed)
 
 Issues:
 ✗ L1: Missing files: src/api/upload.ts, src/services/storage.ts
 ✗ L4: 3 test failures
 FAIL src/upload.test.ts > should validate file type
+✗ L4.5: Contract mismatch — done-auth produces { access_token } but operator sends { api_key }
 
 Fix tasks added to PLAN.md:
 T10: Implement missing upload endpoint and storage service
+T11: Fix operator login to send access_token per auth spec contract
 
 Run /df:execute --continue to fix in the same worktree.
 ```
 
-**Gate conditions (ALL must pass to merge):** L0 build (or no command) | L1 all files in diff | L2 coverage held (or no tool) | L4 tests pass (or no command) | L5 assertions pass (or no frontend/assertions).
+**Gate conditions (ALL must pass to merge):** L0 build (or no command) | L1 all files in diff | L2 coverage held (or no tool) | L4 tests pass (or no command) | L4.5 contracts match (or no dependencies/integration tasks) | L5 assertions pass (or no frontend/assertions).
 
 **All pass →** Post-Verification merge. **Issues found →** Add fix tasks to worktree PLAN.md (IDs continue from last), register via TaskCreate/TaskUpdate, output report + "Run /df:execute --continue". Do NOT create new specs, worktrees, or merge with issues pending.
 
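Note on the hunk above: the report line and the merge gate can both be derived from one per-level result map. The sketch below is a rough Python illustration. The status vocabulary (`pass`, `warn`, `fail`, `skip`) and the symbol table are assumptions modelled on the sample report lines, and treating `warn` and `skip` as non-blocking is one reading of the "(or no tool)" style escape clauses in the gate list.

```python
# Levels in report order. L3 is reported but is not in the quoted gate list.
LEVELS = ["L0", "L1", "L2", "L3", "L4", "L4.5", "L5"]
SYMBOLS = {"pass": "✓", "warn": "⚠", "fail": "✗", "skip": "—"}
GATED = {"L0", "L1", "L2", "L4", "L4.5", "L5"}

def summary_line(results: dict[str, str]) -> str:
    # Levels missing from the map are rendered as skipped.
    return " | ".join(f"{lvl} {SYMBOLS[results.get(lvl, 'skip')]}" for lvl in LEVELS)

def can_merge(results: dict[str, str]) -> bool:
    # A gated level blocks the merge only when it explicitly failed.
    return all(results.get(lvl, "skip") != "fail" for lvl in GATED)

# Hypothetical run: tests failed, contracts matched, no frontend detected.
example = {"L0": "pass", "L1": "pass", "L2": "warn", "L4": "fail", "L4.5": "pass"}
line = summary_line(example)
```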
@@ -193,7 +230,8 @@ Objective: ... | Approach: ... | Why it worked: ... | Files: ...
 2. **Merge:** `git checkout main && git merge ${BRANCH} --no-ff -m "feat({spec}): merge verified changes"`. On conflict → keep worktree, output "Resolve manually, run /df:verify --merge-only", exit.
 3. **Cleanup:** `git worktree remove --force ${PATH} && git branch -d ${BRANCH} && rm -f .deepflow/checkpoint.json`
 4. **Rename spec:** `mv specs/doing-${NAME}.md specs/done-${NAME}.md`
-5. **
-6. **
+5. **Cleanup stale plans:** `rm -f .deepflow/plans/doing-${NAME}.md`
+6. **Extract decisions (additive):** Read done spec, extract `[APPROACH]`/`[ASSUMPTION]`/`[PROVISIONAL]`/`[FUTURE]`/`[UPDATE]` decisions, append to `.deepflow/decisions.md` under `### {date} — {spec}` header. If the header already exists (decisions were captured incrementally during execution via §5.5.1), append only NEW decisions not already present (deduplicate by comparing decision text). Delete done spec after successful write; preserve on failure.
+7. **Clean PLAN.md:** Find the `### {spec-name}` section (match on name stem, strip `doing-`/`done-` prefix). Delete from header through the line before the next `### ` header (or EOF). Recalculate Summary table (recount `### ` headers for spec count, `- [ ]`/`- [x]` for task counts). If no spec sections remain, delete PLAN.md entirely. Skip silently if PLAN.md missing or section already gone.
 
 Output: `✓ Merged → main | ✓ Cleaned worktree | ✓ Spec → done | ✓ Decisions extracted | ✓ Cleaned PLAN.md | Workflow complete! Ready: /df:spec <name>`
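Note on the hunk above: step 7 describes removing one `### {spec-name}` section from PLAN.md by name stem and then recounting tasks. A minimal Python sketch of that text manipulation follows. The sample PLAN.md content is hypothetical, and only top-level `- [ ]` and `- [x]` lines are counted, which is an assumption about how the recount treats nested bullets.

```python
import re

def remove_spec_section(plan_text: str, spec_name: str) -> str:
    """Drop the `### {spec}` section, matching on the name stem."""
    stem = re.sub(r"^(doing-|done-)", "", spec_name)
    kept, skipping = [], False
    for line in plan_text.splitlines():
        if line.startswith("### "):
            # Every level-3 header either starts or ends the skipped region.
            header = re.sub(r"^(doing-|done-)", "", line[4:].strip())
            skipping = header == stem
        if not skipping:
            kept.append(line)
    return "\n".join(kept)

def count_tasks(plan_text: str) -> tuple[int, int]:
    # (open, done) counts, for recalculating the Summary table.
    open_tasks = len(re.findall(r"^- \[ \]", plan_text, flags=re.M))
    done_tasks = len(re.findall(r"^- \[x\]", plan_text, flags=re.M))
    return open_tasks, done_tasks

# Hypothetical PLAN.md with two spec sections.
PLAN = """# Plan

### doing-upload
- [ ] **T1**: Add endpoint
- [x] **T2**: Add storage

### doing-auth
- [ ] **T3**: Add login
"""
cleaned = remove_spec_section(PLAN, "done-upload")
```

Passing `done-upload` still removes the `doing-upload` section because both are reduced to the stem `upload` before comparison.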
package/templates/config-template.yaml
CHANGED
@@ -38,6 +38,7 @@ models:
 
 explore:
 max_tokens: 500 # Controls Explore agent response length
+explore_lsp_timeout_ms: 15000 # Timeout (ms) for the Phase 1 LSP subprocess; on timeout the static template is injected as fallback
 
 commits:
 format: "feat({spec}): {description}"
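Note on the hunk above: `explore_lsp_timeout_ms` bounds a subprocess and falls back to a static template when the bound is exceeded. The Python sketch below shows that timeout-then-fallback shape using the standard library. The command, the template text, and the choice to also fall back on a non-zero exit or a missing binary are assumptions; the config comment only specifies the timeout case.

```python
import subprocess
import sys

# Placeholder for the static protocol text that would be injected.
STATIC_TEMPLATE = "# Search Protocol (static fallback)"

def explore_context(lsp_cmd: list[str], timeout_ms: int = 15000) -> str:
    """Run the LSP helper; fall back to the static template on timeout or error."""
    try:
        done = subprocess.run(
            lsp_cmd, capture_output=True, text=True,
            timeout=timeout_ms / 1000, check=True,
        )
        return done.stdout
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError, OSError):
        return STATIC_TEMPLATE

# Simulate a hung LSP process with a sleeping Python child.
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
result = explore_context(slow, timeout_ms=200)
```

`subprocess.run` kills the child when the timeout expires, so the sleeping process does not linger after the fallback is returned.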
package/templates/explore-protocol.md.bak
ADDED
@@ -0,0 +1,69 @@
+# Search Protocol
+
+You MUST follow these phases. Do NOT search sequentially.
+
+## DIVERSIFY
+- Launch 5-8 parallel tool calls in a single message
+- **Prefer LSP** when searching for symbols, types, or function usage:
+  - `workspaceSymbol` — find symbols by name across the project (faster + more precise than grep)
+  - `documentSymbol` — list all symbols in a file (returns line ranges natively)
+  - `findReferences` — find all usages of a symbol
+- **Fallback to Grep/Glob** for string patterns, config values, or when LSP is unavailable
+- Narrow down to 2-5 candidate files
+
+## CONVERGE
+- **Prefer LSP** to validate and extract precise ranges:
+  - `goToDefinition` — jump to source without reading the whole file
+  - `hover` — get type info and docs in one call
+  - `documentSymbol` — get all symbols with line ranges
+- Fallback: `Read` with `offset`/`limit` for only the relevant line range
+- Eliminate false positives, confirm relevance
+
+## EARLY STOP
+- Stop as soon as >= 2 relevant files answer the question
+- Exception: searching for a single unique thing → find just 1
+
+## Return Format
+
+```
+filepath:startLine-endLine -- why relevant
+```
+
+## Good Example (2 turns)
+
+**Turn 1 (DIVERSIFY):**
+```
+- LSP workspaceSymbol: "Config" (find all Config-related symbols)
+- LSP workspaceSymbol: "Database" (find DB-related symbols)
+- Grep: pattern="export.*config", type="ts" (catch non-symbol patterns)
+- Glob: "src/**/*config*" (catch config files by name)
+```
+
+**Turn 2 (CONVERGE):**
+```
+- LSP hover on top matches (get type + docs without reading file)
+- Read: src/config/app.ts offset=1 limit=45 (only the relevant range)
+```
+Result:
+```
+src/config/app.ts:1-45 -- main config export
+src/config/types.ts:10-30 -- Config interface
+```
+
+## Antipattern (5+ turns)
+
+```
+Turn 1: Glob for config files
+Turn 2: Read the first file
+Turn 3: Grep for config patterns
+Turn 4: Read results
+Turn 5: Another Grep search
+```
+
+This wastes tokens. Never do this.
+
+DO NOT: narrate your search process, make recommendations, propose solutions, generate tables.
+
+Fallback: search `node_modules/`/`vendor/` ONLY when not found in app code.
+
+Max response: 500 tokens.
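Note on the file above: the protocol fixes a one-line return format, `filepath:startLine-endLine -- why relevant`, which makes agent responses easy to parse mechanically. A small Python sketch of such a parser follows. The `Hit` type and the decision to skip non-conforming lines silently are illustrative choices, not behaviour taken from the package.

```python
import re
from typing import NamedTuple

class Hit(NamedTuple):
    path: str
    start: int
    end: int
    reason: str

# One response line: path, colon, line range, " -- ", free-text reason.
LINE = re.compile(r"^(?P<path>\S+):(?P<start>\d+)-(?P<end>\d+) -- (?P<reason>.+)$")

def parse_hits(response: str) -> list[Hit]:
    hits = []
    for raw in response.splitlines():
        m = LINE.match(raw.strip())
        if m:  # lines outside the contract format are skipped
            hits.append(Hit(m["path"], int(m["start"]), int(m["end"]), m["reason"]))
    return hits

# The two result lines from the "Good Example" in the protocol.
hits = parse_hits(
    "src/config/app.ts:1-45 -- main config export\n"
    "src/config/types.ts:10-30 -- Config interface\n"
)
```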
package/templates/plan-template.md
CHANGED
@@ -45,6 +45,17 @@ When no experiments exist to validate an approach, start with a minimal validati
 
 Spike tasks are 1-2 tasks to validate an approach before committing to full implementation.
 
+### integration
+
+Auto-generated when multiple specs share interfaces (APIs, DB tables, types).
+
+- [ ] **T5** [INTEGRATION]: Verify auth ↔ operator contracts — opus/high | Blocked by: T2, T4
+  - Files: internal/auth/login.go, apps/operator/src/auth/AuthProvider.tsx
+  - Integration ACs:
+    - End-to-end: operator login → token → player bootstrap works
+    - Contract: POST /api/v1/auth/login response matches operator SPA expectations
+    - Migrations: 001→005 run twice without error (idempotent)
+
 ---
 
 <!--
package/templates/spec-template.md
CHANGED
@@ -24,6 +24,21 @@
 <!-- Optional. List specs that must be completed before this one. -->
 <!-- - depends_on: doing-other-spec-name -->
 
+## Interfaces
+
+<!-- Optional but RECOMMENDED for multi-spec projects. Declare what this spec produces and consumes.
+/df:plan uses these to auto-generate integration tasks when specs share contracts. -->
+
+<!-- ### Produces
+- `POST /api/v1/auth/login` → `{ access_token: string, refresh_token: string }`
+- `table: operators` columns: `id, api_key_hash, scopes`
+- `type: SessionState` from `packages/shared/types.ts` -->
+
+<!-- ### Consumes
+- `POST /api/v1/auth/login` from done-auth-spec (expects `{ access_token }`)
+- `table: operators` expects column `api_key_hash`
+- `type: SessionState` from packages/shared -->
+
 ## Out of Scope
 
 - [Explicitly excluded: e.g., "Video upload is NOT included"]