deepflow 0.1.110 → 0.1.112

@@ -40,44 +40,55 @@ Each task = one background agent. **NEVER use TaskOutput** (100KB+ transcripts e
40
40
  `--continue` → load `.deepflow/checkpoint.json`, verify worktree exists (else error "Use --fresh"), skip completed. `--fresh` → delete checkpoint. Checkpoint exists → prompt "Resume? (y/n)".
41
41
  Shell: `` !`cat .deepflow/checkpoint.json 2>/dev/null || echo 'NOT_FOUND'` `` / `` !`git diff --quiet && echo 'CLEAN' || echo 'DIRTY'` ``
42
42
 
43
- ### 1.5. CREATE WORKTREE
43
+ ### 1.5. CREATE WORKTREES (per spec)
44
44
 
45
- Require clean HEAD. Derive SPEC_NAME from `specs/doing-*.md`. Create `.deepflow/worktrees/{spec}` on branch `df/{spec}`. Reuse if exists; `--fresh` deletes first. If `worktree.sparse_paths` non-empty: `git worktree add --no-checkout`, `sparse-checkout set {paths}`, checkout.
45
+ Require clean HEAD. Discover **all** specs in execution scope:
46
+ ```
47
+ DOING_SPECS=!`ls specs/doing-*.md 2>/dev/null | sed 's|specs/doing-||;s|\.md$||' | tr '\n' ' ' || echo 'NOT_FOUND'`
48
+ ```
49
+
50
+ For **each** `{spec}` in `DOING_SPECS`, create `.deepflow/worktrees/{spec}` on branch `df/{spec}`. Reuse if exists; `--fresh` deletes first. If `worktree.sparse_paths` non-empty: `git worktree add --no-checkout`, `sparse-checkout set {paths}`, checkout.
51
+
52
+ Build an in-memory map `SPEC_WORKTREES = {spec → {path, branch}}`. This map drives per-task routing in §5 and §5.5 and is persisted in `.deepflow/checkpoint.json` under `spec_worktrees`. Tasks from spec A run in worktree A; tasks from spec B run in worktree B. No cross-spec commits share a branch.
53
+
54
+ Then run §1.5.1, §1.6, and §1.7 **per worktree** before any wave spawns.
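As a rough sketch of the discovery step in §1.5, the spec name, worktree path, and branch can all be derived from each `specs/doing-*.md` path. The helper names `spec_name_from_path` and `spec_worktree_rows` are hypothetical and not part of deepflow, and the tab-separated row format is an assumption made only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of deepflow.

# Strip the directory, the "doing-" prefix, and the ".md" suffix from a spec path.
spec_name_from_path() {
  local name="${1##*/}"   # e.g. doing-auth.md
  name="${name#doing-}"   # e.g. auth.md
  printf '%s\n' "${name%.md}"
}

# Print one "spec<TAB>path<TAB>branch" row per spec path given as an argument.
spec_worktree_rows() {
  local file spec
  for file in "$@"; do
    spec="$(spec_name_from_path "$file")"
    printf '%s\t%s\t%s\n' "$spec" ".deepflow/worktrees/$spec" "df/$spec"
  done
}

spec_worktree_rows specs/doing-upload.md specs/doing-auth.md
```

Each row corresponds to one `SPEC_WORKTREES` entry. An empty argument list prints nothing, which a caller could treat as "no specs in scope".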
46
55
 
47
- ### 1.5.1. SYMLINK DEPENDENCIES
56
+ ### 1.5.1. SYMLINK DEPENDENCIES (per worktree)
48
57
 
49
- After worktree creation, symlink `node_modules` from the main repo so TypeScript/LSP/build can resolve dependencies without a full install:
58
+ After each worktree is created, symlink `node_modules` from the main repo so TypeScript/LSP/build can resolve dependencies without a full install:
50
59
  ```bash
51
- node "${HOME}/.claude/bin/worktree-deps.js" --source "$(git rev-parse --show-toplevel)" --worktree "${WORKTREE_PATH}"
60
+ node "${HOME}/.claude/bin/worktree-deps.js" --source "$(git rev-parse --show-toplevel)" --worktree "${SPEC_WORKTREES[spec].path}"
52
61
  ```
53
62
  The script finds `node_modules` at root and inside monorepo directories (`packages/`, `apps/`, etc.) and creates symlinks in the worktree. Outputs JSON: `{"linked": N, "total": M}`. Errors are non-fatal — log and continue.
54
63
 
55
- ### 1.6. RATCHET SNAPSHOT
64
+ ### 1.6. RATCHET SNAPSHOT (per worktree)
56
65
 
57
- Snapshot pre-existing test files — only these count for ratchet (agent-created excluded):
66
+ For each spec worktree, snapshot pre-existing test files — only these count for ratchet (agent-created excluded):
58
67
  ```bash
59
- git -C ${WORKTREE_PATH} ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' > .deepflow/auto-snapshot.txt
68
+ git -C ${SPEC_WORKTREES[spec].path} ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' > .deepflow/auto-snapshot-{spec}.txt
60
69
  ```
61
70
 
71
+ Each spec has its own snapshot file. Ratchet checks in §5.5 pass the snapshot file matching the task's spec.
72
+
62
73
  ### 1.7. NO-TESTS BOOTSTRAP
63
74
 
64
75
  <!-- AC-1: zero test files triggers bootstrap before wave 1 -->
65
76
  <!-- AC-2: bootstrap success re-snapshots auto-snapshot.txt; subsequent tasks use updated snapshot -->
66
77
  <!-- AC-3: bootstrap failure with default model retries with Opus; double failure halts with specific message -->
67
78
 
68
- **Gate:** After §1.6 snapshot, check `auto-snapshot.txt`:
79
+ **Gate (per spec):** After §1.6 snapshot, check each spec's snapshot file independently:
69
80
  ```bash
70
- SNAPSHOT_COUNT=$(wc -l < .deepflow/auto-snapshot.txt | tr -d ' ')
81
+ SNAPSHOT_COUNT=$(wc -l < .deepflow/auto-snapshot-{spec}.txt | tr -d ' ')
71
82
  ```
72
- If `SNAPSHOT_COUNT` is `0` (zero test files found), MUST spawn bootstrap agent before wave 1. No implementation tasks may start until bootstrap completes successfully.
83
+ If `SNAPSHOT_COUNT` is `0` for a given spec (zero test files found), MUST spawn a bootstrap agent for **that spec** before any implementation task from that spec runs. Other specs with non-empty snapshots proceed normally.
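The per-spec gate can be illustrated with a small sketch that reuses the snapshot `grep -E` pattern from §1.6 and then checks whether a given snapshot file is empty. The function names `filter_test_files` and `needs_bootstrap` are hypothetical, and the temporary file stands in for `.deepflow/auto-snapshot-{spec}.txt`.

```bash
#!/usr/bin/env bash
# Illustrative only: function names are hypothetical, not deepflow APIs.

# Keep only paths that match the test-file pattern used for the snapshot.
filter_test_files() {
  grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' || true
}

# Succeed (exit 0) when the snapshot file for a spec lists zero test files.
needs_bootstrap() {
  local count
  count=$(wc -l < "$1" | tr -d ' ')
  [ "$count" -eq 0 ]
}

snapshot="$(mktemp)"   # stand-in for one spec's snapshot file
printf '%s\n' src/app.ts src/app.test.ts tests/e2e.py | filter_test_files > "$snapshot"
if needs_bootstrap "$snapshot"; then
  echo "bootstrap required"
else
  echo "snapshot has $(wc -l < "$snapshot" | tr -d ' ') test files"
fi
rm -f "$snapshot"
```

Running the check once per snapshot file keeps the decision independent for each spec, so one empty spec does not block the others.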
73
84
 
74
- **Bootstrap flow:**
75
- 1. Spawn `Agent(model="{default_model}", ...)` with Bootstrap prompt (§6). End turn, wait for notification.
76
- 2. **On success (TASK_STATUS:pass):** Re-snapshot immediately:
85
+ **Bootstrap flow (per empty-snapshot spec):**
86
+ 1. Spawn `Agent(model="{default_model}", ...)` with Bootstrap prompt (§6), `Working directory: ${SPEC_WORKTREES[spec].path}`. End turn, wait for notification.
87
+ 2. **On success (TASK_STATUS:pass):** Re-snapshot immediately for that spec:
77
88
  ```bash
78
- git -C ${WORKTREE_PATH} ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' > .deepflow/auto-snapshot.txt
89
+ git -C ${SPEC_WORKTREES[spec].path} ls-files | grep -E '\.(test|spec)\.[^/]+$|^test_|_test\.[^/]+$|^tests/|__tests__/' > .deepflow/auto-snapshot-{spec}.txt
79
90
  ```
80
- All subsequent tasks use this updated snapshot as their ratchet baseline. Proceed to wave 1.
91
+ All subsequent tasks for that spec use this updated snapshot as their ratchet baseline. Proceed to wave 1.
81
92
  3. **On failure (TASK_STATUS:fail) with default model:** Retry ONCE with `Agent(model="opus", ...)` using the same Bootstrap prompt.
82
93
  - Opus success → re-snapshot (same command above) → proceed to wave 1.
83
94
  - Opus failure → halt with message: `"Bootstrap failed with both default and Opus — manual intervention required"`. Do not proceed.
@@ -137,17 +148,34 @@ Context ≥50% → checkpoint and exit. Before spawning: `TaskUpdate(status: "in
137
148
 
138
149
  **Token tracking start:** Store `start_percentage` (from context.json) and `start_timestamp` (ISO 8601) keyed by task_id. Omit if unavailable.
139
150
 
140
- **NEVER use `isolation: "worktree"`.** Deepflow manages a shared worktree so wave 2 sees wave 1 commits. **Spawn ALL ready tasks in ONE message** except file conflicts.
151
+ **Intra-wave isolation:** Each task in a wave runs with `isolation: "worktree"`. Tasks from the same spec share that spec's worktree branch so wave 2 sees wave 1 commits; tasks from different specs run in different worktrees and never interleave. **Spawn ALL ready tasks in ONE message** except file conflicts.
152
+
153
+ **Per-spec routing (CRITICAL):** Each task in `WAVE_JSON` carries a `spec` field (from `bin/wave-runner.js`). When building the agent prompt (§6), you MUST set `Working directory: ${SPEC_WORKTREES[task.spec].path}` — the worktree for that task's spec, NOT the first spec in the map. Cross-spec contamination (spawning a task from spec B into spec A's worktree) corrupts branch history and breaks `/df:verify`. If `task.spec` is absent from the JSON, fall back to deriving it from the task's mini-plan file `.deepflow/plans/doing-{specName}.md`; if still unresolvable, defer the task and log `"⚠ T{N} deferred — cannot resolve spec"`.
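A minimal sketch of the routing rule, assuming the orchestrator already knows the task's `spec` value and the list of specs in scope. The function `route_task` and its argument layout are hypothetical; the mini-plan fallback described above is not modeled here, so an unknown or empty spec goes straight to the deferral log.

```bash
#!/usr/bin/env bash
# Hypothetical routing helper; the argument layout is an assumption.

# usage: route_task <task_id> <task_spec> <known_spec>...
# Prints the worktree path for the task's spec, or logs a deferral and fails.
route_task() {
  local task_id="$1" spec="$2" known
  shift 2
  for known in "$@"; do
    if [ -n "$spec" ] && [ "$known" = "$spec" ]; then
      printf '.deepflow/worktrees/%s\n' "$spec"
      return 0
    fi
  done
  printf '⚠ %s deferred — cannot resolve spec\n' "$task_id" >&2
  return 1
}

route_task T4 auth upload auth          # resolves to the auth worktree
route_task T9 "" upload auth || true    # empty spec: deferral logged on stderr
```

The lookup is keyed on the task's own spec rather than on list position, which is the property the routing rule depends on.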
154
+
155
+ **File conflicts (1 file = 1 writer):** Check `Files:` from wave-runner JSON output or from mini-plan detail files (`.deepflow/plans/doing-{specName}.md`). File-conflict rule applies **only within the same spec** — two tasks from different specs touching files with identical paths are actually in different worktrees and cannot collide. Overlap within a spec → spawn lowest-numbered only; rest stay pending. Log: `"⏳ T{N} deferred — file conflict with T{M} on {filename}"`
156
+
157
+ **≥2 [SPIKE] tasks same problem →** Parallel Spike Probes (§5.7). **[OPTIMIZE] tasks →** Optimize Cycle (§5.9), one at a time. **[INTEGRATION] tasks** (`task.isIntegration === true` in WAVE_JSON) **→** use the Integration Task prompt template (§6 Integration Task), not the Standard Task template. Integration tasks always land in the final wave via `Blocked by:` — wave-runner guarantees this, so they execute after all producer/consumer implementation tasks have committed. Route them to the **consumer spec's** worktree via `SPEC_WORKTREES[task.spec].path` (plan.md §4.8.2 places the integration task under the consumer's section header, so `task.spec` is already the consumer).
141
158
 
142
- **File conflicts (1 file = 1 writer):** Check `Files:` from wave-runner JSON output or from mini-plan detail files (`.deepflow/plans/doing-{specName}.md`). Overlap → spawn lowest-numbered only; rest stay pending. Log: `"⏳ T{N} deferred — file conflict with T{M} on {filename}"`
159
+ ### 5.1. INTRA-WAVE CHERRY-PICK MERGE
143
160
 
144
- **≥2 [SPIKE] tasks same problem →** Parallel Spike Probes (§5.7). **[OPTIMIZE] tasks →** Optimize Cycle (§5.9), one at a time.
161
+ After ALL wave-N agents complete, cherry-pick each wave-N commit back to the main branch BEFORE wave N+1 begins. This ensures wave N+1 agents see all wave-N changes regardless of which worktree they run in.
162
+
163
+ **Wave gate:** Wave N+1 MUST NOT start until all wave-N cherry-picks complete.
164
+
165
+ **Ordering:** Apply cherry-picks in ascending task-number order (e.g., T1 before T2 before T3) for determinism.
166
+
167
+ **Steps (per wave completion):**
168
+ 1. Collect all task commits from wave N (from ratchet PASS records).
169
+ 2. Sort commits by ascending task-number order.
170
+ 3. For each commit, spawn haiku context-fork (§5.8): `git cherry-pick {sha}`. Receive one-line summary.
171
+ 4. On conflict: log `"⚠ cherry-pick conflict: {sha} — {file}"`, abort cherry-pick, mark task as needing manual resolution.
172
+ 5. Only after all wave-N cherry-picks finish → proceed to spawn wave N+1 agents.
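The ordering requirement in the steps above can be sketched as a numeric sort on the task number. The `T<N> <sha>` input format and the function name `order_commits` are assumptions for illustration; the sketch only prints the commands and does not run `git`.

```bash
#!/usr/bin/env bash
# Sketch only; the "T<N> <sha>" input format is an assumption.

# stdin: one "T<N> <sha>" pair per line. stdout: shas in ascending task order.
order_commits() {
  sed 's/^T//' | sort -n -k1,1 | awk '{ print $2 }'
}

# Print the cherry-pick commands in the order they would be applied.
printf '%s\n' 'T10 aaa1111' 'T2 bbb2222' 'T1 ccc3333' | order_commits |
  while read -r sha; do
    echo "git cherry-pick $sha"
  done
```

A numeric sort matters here: a plain lexical sort would place `T10` before `T2` and break the deterministic order.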
145
173
 
146
174
  ### 5.5. RATCHET CHECK
147
175
 
148
- Run `node "${HOME}/.claude/bin/ratchet.js"` in the worktree directory after each agent completes:
176
+ Run `node bin/ratchet.js` in the **task's spec worktree** after each agent completes, using that spec's snapshot file:
149
177
  ```bash
150
- node "${HOME}/.claude/bin/ratchet.js" --worktree ${WORKTREE_PATH} --snapshot .deepflow/auto-snapshot.txt --task T{N}
178
+ node bin/ratchet.js --worktree ${SPEC_WORKTREES[task.spec].path} --snapshot .deepflow/auto-snapshot-{task.spec}.txt --task T{N}
151
179
  ```
152
180
 
153
181
  The script handles all health checks internally and outputs structured JSON:
@@ -174,7 +202,7 @@ The script handles all health checks internally and outputs structured JSON:
174
202
  ```
175
203
  (Fall back to text mode if `--json` is unavailable: `node "${HOME}/.claude/bin/wave-runner.js" --plan PLAN.md --recalc --failed T{N}`)
176
204
  Report: `"✗ T{n}: reverted"`.
177
- - **Exit 2 (SALVAGEABLE):** Spawn `Agent(model="sonnet")` to fix lint/typecheck issues. Re-run `node "${HOME}/.claude/bin/ratchet.js"`. If still non-zero → revert both commits, set status pending.
205
+ - **Exit 2 (SALVAGEABLE):** Spawn `Agent(model="sonnet")` to fix lint/typecheck issues. Re-run `node bin/ratchet.js`. If still non-zero → revert both commits, set status pending.
178
206
 
179
207
  #### 5.5.1. AC COVERAGE CHECK (after ratchet pass)
180
208
 
@@ -194,18 +222,19 @@ where `{spec_path}` is the path to `specs/doing-{spec_name}.md` and `{agent_outp
194
222
 
195
223
  Parse the agent's response for `DECISIONS:` line. If present:
196
224
  1. Split by ` | ` to get individual decisions
197
- 2. Each decision has format `[TAG] description — rationale` where TAG ∈ {APPROACH, PROVISIONAL, ASSUMPTION, FUTURE, UPDATE}
198
- 3. Append to `.deepflow/decisions.md` under `### {date} {spec_name}` header (create header if first decision for this spec today, reuse if exists)
199
- 4. Format: `- [TAG] description — rationale`
225
+ 2. If any entry does not start with `[TAG]` where TAG ∈ {APPROACH, PROVISIONAL, ASSUMPTION, FUTURE, UPDATE}, emit SALVAGEABLE and skip writing that entry to decisions.md (valid entries still get written).
226
+ 3. Each decision has format `[TAG] description — rationale` where TAG ∈ {APPROACH, PROVISIONAL, ASSUMPTION, FUTURE, UPDATE}
227
+ 4. Append to `.deepflow/decisions.md` under `### {date} {spec_name}` header (create header if first decision for this spec today, reuse if exists)
228
+ 5. Format: `- [TAG] description — rationale`
200
229
 
201
- If no `DECISIONS:` line in agent output → skip silently (mechanical tasks don't produce decisions).
230
+ If no `DECISIONS:` line in agent output and the task effort is not `low` → emit SALVAGEABLE (non-trivial tasks without a decision line may indicate the agent skipped documenting architectural choices). For tasks with effort `low`, skip silently (mechanical tasks don't produce decisions).
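The parsing rules for the `DECISIONS:` line can be sketched as follows: split on ` | `, keep entries that start with a known tag, and flag the rest. The function `parse_decisions` is hypothetical, and writing flagged entries to stderr with a `SALVAGEABLE:` prefix is an assumption standing in for however the orchestrator actually signals that state.

```bash
#!/usr/bin/env bash
# Hypothetical parser; how SALVAGEABLE is really reported is an assumption.

# $1: the full "DECISIONS: ..." line from the agent output.
# stdout: one "- [TAG] ..." line per valid entry; stderr: invalid entries.
parse_decisions() {
  local body="${1#DECISIONS: }" sep=' | ' entry
  while [ -n "$body" ]; do
    case "$body" in
      *"$sep"*) entry="${body%%"$sep"*}"; body="${body#*"$sep"}" ;;
      *)        entry="$body"; body="" ;;
    esac
    case "$entry" in
      "[APPROACH] "*|"[PROVISIONAL] "*|"[ASSUMPTION] "*|"[FUTURE] "*|"[UPDATE] "*)
        printf -- '- %s\n' "$entry" ;;                    # valid: decisions.md format
      *)
        printf 'SALVAGEABLE: %s\n' "$entry" >&2 ;;        # invalid: skip writing it
    esac
  done
}

parse_decisions 'DECISIONS: [APPROACH] use X — simpler | bogus entry | [FUTURE] defer Y — later'
```

Valid entries are still emitted when a sibling entry is malformed, matching the rule that only the offending entry is skipped.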
202
231
 
203
232
  **This runs on every ratchet pass, not just at verify time.** Decisions are captured incrementally as tasks complete, so they're never lost even if verify fails or merge is manual.
204
233
 
205
234
  **Edit scope validation:** `git diff HEAD~1 --name-only` vs allowed globs. Violation → revert, report.
206
235
  **Impact completeness:** diff vs Impact callers/duplicates. Gap → advisory warning (no revert).
207
236
 
208
- **Metric gate (Optimize only):** Run `eval "${metric_command}"` with cwd=`${WORKTREE_PATH}` (never `cd && eval`). Parse float (non-numeric → revert). Compare using `direction`+`min_improvement_threshold`. Both ratchet AND metric must pass → keep. Ratchet pass + metric stagnant → revert. Secondary metrics: regression > `regression_threshold` (5%) → WARNING in auto-report.md (no revert).
237
+ **Metric gate (Optimize only):** Run `eval "${metric_command}"` with cwd=`${SPEC_WORKTREES[task.spec].path}` (never `cd && eval`). Parse float (non-numeric → revert). Compare using `direction`+`min_improvement_threshold`. Both ratchet AND metric must pass → keep. Ratchet pass + metric stagnant → revert. Secondary metrics: regression > `regression_threshold` (5%) → WARNING in auto-report.md (no revert).
209
238
 
210
239
  **Token tracking result (on pass):** Read `end_percentage`. Sum token fields from `.deepflow/token-history.jsonl` between start/end timestamps (awk ISO 8601 compare). Write to `.deepflow/results/T{N}.yaml`:
211
240
  ```yaml
@@ -219,6 +248,20 @@ tokens:
219
248
  ```
220
249
  Omit if context.json/token-history.jsonl/awk unavailable. Never fail ratchet for tracking errors.
221
250
 
251
+ ### 5.6. WAVE TEST AGENT
252
+
253
+ Trigger: task type is [TEST] or orchestrator spawns a dedicated test-writing agent for a wave.
254
+
255
+ Before spawning the test agent, collect context:
256
+ ```bash
257
+ SNAPSHOT_FILES=!`cat .deepflow/auto-snapshot.txt 2>/dev/null || echo ''`
258
+ EXISTING_TEST_NAMES=!`grep -h -E "^\s*(it|test|describe)\(" ${SNAPSHOT_FILES} 2>/dev/null | sed "s/^[[:space:]]*//" || echo ''`
259
+ ```
260
+
261
+ Pass `SNAPSHOT_FILES` and `EXISTING_TEST_NAMES` into the agent prompt so it can avoid duplication.
262
+
263
+ **Implementation diff:** The wave test agent reads the implementation diff itself using the `Read` tool or `git diff` — do NOT capture or pass the raw diff to the wave test prompt inline. Injecting large diffs inflates context and causes rot.
264
+
222
265
  ### 5.7. PARALLEL SPIKE PROBES
223
266
 
224
267
  Trigger: ≥2 [SPIKE] tasks with same blocker or identical hypothesis.
@@ -255,7 +298,7 @@ Git operations that produce large output (diff, stash, cherry-pick conflict outp
255
298
  **Pattern:**
256
299
  ```
257
300
  Spawn Agent(model="haiku", run_in_background=false):
258
- Working directory: {WORKTREE_PATH}
301
+ Working directory: ${SPEC_WORKTREES[task.spec].path}
259
302
  Run: {git command}
260
303
  Return exactly ONE line: "{operation}: {N lines changed / N files / outcome}"
261
304
  Do NOT output the raw diff or full command output.
@@ -330,7 +373,9 @@ REPEAT:
330
373
 
331
374
  ### 6. PER-TASK (agent prompt)
332
375
 
333
- **Common preamble (all):** `Working directory: {worktree_absolute_path}. All file ops use this path. Commit format: {type}({spec}): {desc}`
376
+ **Common preamble (all):** `Working directory: ${SPEC_WORKTREES[task.spec].path}. All file ops use this path. Commit format: {type}({spec}): {desc}`
377
+
378
+ Resolve `task.spec` from the `WAVE_JSON` entry for this task (fallback: scan `.deepflow/plans/doing-*.md` for the task's block). Never hand an agent a worktree path that belongs to a different spec.
334
379
 
335
380
  **Task detail loading (before building agent prompt):** Check for `.deepflow/plans/doing-{task_id}.md` (shell injection):
336
381
  ```
@@ -357,6 +402,17 @@ Steps (only when `Files:` list is non-empty):
357
402
 
358
403
  <!-- AC-6: Backward-compatible no-op — when neither Domain Model section exists in the spec nor Existing Types extraction yields content (EXISTING_TYPES is empty string), the Standard Task prompt contains no extra context blocks and is identical to the pre-injection baseline. Zero prompt overhead, zero tool calls for tasks that lack these context sources. -->
359
404
 
405
+ **Template selection (deterministic, from WAVE_JSON):**
406
+
407
+ | Flag | Template |
408
+ |-----------------------|------------------------------------|
409
+ | `isIntegration: true` | Integration Task (below) |
410
+ | `isSpike: true` | Spike |
411
+ | `isOptimize: true` | Optimize Task |
412
+ | (none) | Standard Task |
413
+
414
+ Read these fields from `WAVE_JSON` entries. Do NOT re-parse the task description for tags — the flags are authoritative. If `isIntegration` is true, skip Standard Task entirely and jump to Integration Task (below).
415
+
360
416
  **Standard Task** (`Agent(model="{Model}", ...)`):
361
417
  ```
362
418
  --- START ---
@@ -373,7 +429,7 @@ Success criteria: {ACs from spec relevant to this task}
373
429
  {If spec contains ## Domain Model section:
374
430
  --- CONTEXT: Domain Model ---
375
431
  {Domain Model section content from doing-*.md, extracted via shell injection:
376
- DOMAIN_MODEL=!`sed -n '/^## Domain Model$/,/^## [^D]/p' specs/doing-{spec_name}.md | head -n -1 2>/dev/null || echo 'NOT_FOUND'`
432
+ DOMAIN_MODEL=!`sed -n '/^## Domain Model$/,/^## /p' specs/doing-{spec_name}.md | head -n -1 2>/dev/null || echo 'NOT_FOUND'`
377
433
  }
378
434
  }
379
435
  {If EXISTING_TYPES is non-empty:
@@ -395,7 +451,7 @@ AC-2:skip:reason here (if applicable)
395
451
  AC_COVERAGE_END
396
452
  ```
397
453
  Format: one line per AC with either `AC-N:done` or `AC-N:skip:reason`. Omit this block if the spec has no acceptance criteria.
398
- DECISIONS: If you made non-obvious choices, append to the LAST LINE BEFORE TASK_STATUS:
454
+ DECISIONS: If you made non-obvious choices, cite with [APPROACH]. Append to the LAST LINE BEFORE TASK_STATUS:
399
455
  DECISIONS: [TAG] {decision} — {rationale} | [TAG] {decision2} — {rationale2}
400
456
  Tags:
401
457
  [APPROACH] — chose X over Y (architectural/design choice)
@@ -404,6 +460,7 @@ Tags:
404
460
  [FUTURE] — deferred X because Y; revisit when Z
405
461
  [UPDATE] — changed prior decision from X to Y because Z
406
462
  Skip for trivial/mechanical changes.
463
+ Files: List every file you modified or created, one per line, in the format `Files: path/to/file.ts, path/to/other.ts`. This is required so the orchestrator can detect file conflicts across concurrent tasks.
407
464
  Last line of your response MUST be: TASK_STATUS:pass (if successful) or TASK_STATUS:fail (if failed) or TASK_STATUS:revert (if reverted)
408
465
  ```
409
466
 
@@ -416,6 +473,7 @@ Integration ACs: {list from PLAN.md}
416
473
  Specs involved: {spec file paths}
417
474
  Interface Map: {from integration task detail}
418
475
  Contract Risks: {from integration task detail}
476
+ LSP documentSymbol on Impact files → Read with offset/limit on relevant ranges only (never read full files)
419
477
  --- END ---
420
478
  RULES:
421
479
  - Fix the CONSUMER to match the PRODUCER's declared interface. Never weaken the producer.
@@ -438,7 +496,28 @@ Last line: TASK_STATUS:pass or TASK_STATUS:fail
438
496
 
439
497
  **Bootstrap:** `BOOTSTRAP: Write tests for edit_scope files. Do NOT change implementation. Commit as test({spec}): bootstrap. Last line: TASK_STATUS:pass or TASK_STATUS:fail`
440
498
 
441
- **Spike:** `{task_id} [SPIKE]: {hypothesis}. Files+Spec. {reverted warnings}. Minimal spike. Commit as spike({spec}): {desc}. If you discovered constraints, rejected approaches, or made assumptions, report: DECISIONS: [TAG] {finding} — {why it matters} (use PROVISIONAL for "works but needs revisit", ASSUMPTION for "assumed X; if wrong Y breaks", APPROACH for definitive choices). Last line: TASK_STATUS:pass or TASK_STATUS:fail`
499
+ **Wave Test** (`Agent(model="sonnet")`):
500
+ ```
501
+ --- START ---
502
+ {task_id} [TEST]: Write tests for {spec_name}. Files+Spec.
503
+ Pre-existing test files:
504
+ {SNAPSHOT_FILES}
505
+
506
+ Existing test function names (do NOT duplicate these):
507
+ {EXISTING_TEST_NAMES}
508
+ --- MIDDLE ---
509
+ Spec: {spec_path}
510
+ Edit scope: {edit_scope}
511
+ --- END ---
512
+ RULES:
513
+ - Use the `Read` tool (or `git diff HEAD~1`) to inspect what the implementation changed before writing tests.
514
+ - Do not duplicate tests that already exist in the pre-existing test files listed above.
515
+ - Do not modify pre-existing test files — write new test files only.
516
+ - Commit as test({spec}): {description}.
517
+ Last line of your response MUST be: TASK_STATUS:pass (if successful) or TASK_STATUS:fail (if failed)
518
+ ```
519
+
520
+ **Spike:** `{task_id} [SPIKE]: {hypothesis}. Files+Spec. {reverted warnings}. Minimal spike. Commit as spike({spec}): {desc}. If you discovered constraints, rejected approaches, or made assumptions, report: DECISIONS: [TAG] {finding} — {why it matters} (use PROVISIONAL for "works but needs revisit", ASSUMPTION for "assumed X; if wrong Y breaks", APPROACH for definitive choices). Last line: TASK_STATUS:pass or TASK_STATUS:fail`
442
521
 
443
522
  **Optimize Task** (`Agent(model="opus")`):
444
523
  ```
@@ -448,6 +527,7 @@ Current: {val} (baseline: {b}, best: {best}). Target: {t} ({dir}). Metric: {cmd}
448
527
  CONSTRAINT: ONE atomic change.
449
528
  --- MIDDLE ---
450
529
  Last 5 cycles + failed hypotheses + Impact/deps.
530
+ LSP documentSymbol on Impact files → Read with offset/limit on relevant ranges only (never read full files)
451
531
  --- END ---
452
532
  {Learnings}. ONE change + commit. No metric run, no multiple changes.
453
533
  Last line of your response MUST be: TASK_STATUS:pass or TASK_STATUS:fail or TASK_STATUS:revert
@@ -463,6 +543,7 @@ Current/Target. Role instruction:
463
543
  ingenua: "Ignore prior. Fresh approach."
464
544
  --- MIDDLE ---
465
545
  Full history + all failed hypotheses.
546
+ LSP documentSymbol on Impact files → Read with offset/limit on relevant ranges only (never read full files)
466
547
  --- END ---
467
548
  ONE atomic change. Commit. STOP.
468
549
  Last line of your response MUST be: TASK_STATUS:pass or TASK_STATUS:fail or TASK_STATUS:revert
@@ -498,7 +579,18 @@ Skills: `atomic-commits`, `browse-fetch`. Agents: Implementation (`general-purpo
498
579
  | sonnet/medium | `Agent(model="sonnet")` | `Direct and efficient. Explain only non-obvious logic.` |
499
580
  | opus/high | `Agent(model="opus")` | _(none)_ |
500
581
 
501
- **Checkpoint:** `.deepflow/checkpoint.json`: `{"completed_tasks":["T1"],"current_wave":2,"worktree_path":"...","worktree_branch":"df/..."}`
582
+ **Checkpoint:** `.deepflow/checkpoint.json`:
583
+ ```json
584
+ {
585
+ "completed_tasks": ["T1"],
586
+ "current_wave": 2,
587
+ "spec_worktrees": {
588
+ "upload": {"path": ".deepflow/worktrees/upload", "branch": "df/upload"},
589
+ "auth": {"path": ".deepflow/worktrees/auth", "branch": "df/auth"}
590
+ }
591
+ }
592
+ ```
593
+ One entry per `doing-*` spec in scope. `--continue` rehydrates this map before wave scheduling.
502
594
 
503
595
  ## Failure Handling
504
596
 
@@ -230,7 +230,7 @@ You are a spec planner. Your job is to independently analyze a spec and produce
230
230
  2. **Compute spec layer** — determine L0–L3 based on sections present (see layer rules below)
231
231
  3. **Check experiments** — glob `.deepflow/experiments/{topic}--*` for past spikes
232
232
  4. **Explore the codebase** — detect code style, patterns, integration points relevant to this spec
233
- 5. **Impact analysis** (L3 only) — LSP-first blast radius for files in scope
233
+ 5. **Impact analysis** (L3 only) — LSP documentSymbol on impact files → Read with offset/limit on relevant ranges only (never read full files)
234
234
  6. **Targeted exploration** — follow `templates/explore-agent.md` spawn rules for post-LSP gaps
235
235
  7. **Generate tasks** — produce a mini-plan following the output format below
236
236
 
@@ -349,11 +349,17 @@ If no shared interfaces found, return:
349
349
 
350
350
  **Skip if:** Interface Map returns "(none detected — specs are independent)".
351
351
 
352
- For each group of specs sharing interfaces, generate ONE integration task appended AFTER all spec tasks in the consolidated plan. Integration tasks are always the last wave.
352
+ For each group of specs sharing interfaces, generate ONE integration task per interface cluster.
353
+
354
+ **Placement (CRITICAL for worktree routing):** Integration tasks must be placed under the **consumer spec's** `### {consumer-spec-name}` section in the consolidated PLAN.md, NOT at the end of the file and NOT under their own header. `bin/wave-runner.js` assigns `task.spec` from the nearest preceding `### ` header, and `/df:execute` uses that field to route the task to the correct per-spec worktree (`SPEC_WORKTREES[task.spec].path`). If an integration task lands under a header that is not a real spec (e.g. `### Integration`), execute will fail to resolve a worktree and defer the task.
355
+
356
+ **Consumer selection:** The "consumer" is the spec that reads/calls the interface (e.g. frontend consumes API produced by backend → frontend is consumer). The fix-the-consumer rule in execute.md §6 Integration Task template means the integration agent will modify consumer-side code, which matches the consumer's worktree. If a cluster has multiple consumers, emit one integration task per consumer under each consumer's section.
357
+
358
+ The `[INTEGRATION]` tag is parsed deterministically by `bin/wave-runner.js` and surfaced as `isIntegration: true` in its JSON output; execute.md §6 uses that flag (not the task description) to pick the Integration Task prompt.
353
359
 
354
360
  **Integration task format:**
355
361
  ```markdown
356
- - [ ] **T{N}** [INTEGRATION]: Verify {spec_a} ↔ {spec_b} contracts
362
+ - [ ] **T{N}** [INTEGRATION]: Verify {producer_spec} ↔ {consumer_spec} contracts
357
363
  - Files: {files at integration boundaries — API handlers, adapters, shared types, migrations}
358
364
  - Integration ACs:
359
365
  - End-to-end flow: {producer} → {consumer} works with real data
@@ -396,10 +402,9 @@ The reasoner prompt:
396
402
  ```
397
403
  You are the plan reasoner. Analyze this spec and produce a prioritized task plan.
398
404
 
399
- ## Spec file path
400
- {spec_path}
401
-
402
- Read the spec using the Read tool on the path above. Do NOT read any implementation files.
405
+ ## Spec content
406
+ <!-- {spec_content} — injected by orchestrator before spawning; do NOT use Read tool on the spec -->
407
+ {spec_content}
403
408
 
404
409
  ## Agent summaries (from §3 parallel agents)
405
410
 
@@ -55,6 +55,7 @@ Spawn reasoner agent (`subagent_type: "reasoner"`, `model: "opus"`). The reasone
55
55
  - Flags conflicts with existing code
56
56
  - Verifies every REQ-N has a corresponding AC; flags uncovered requirements
57
57
  - Flags vague/untestable requirements (e.g., "should be fast" without a metric)
58
+ - If Explore agents found type definitions or interfaces relevant to this spec, include a ## Domain Model section with Key Types (signatures only) and Ubiquitous Language (domain terms). Omit if no relevant types found.
58
59
 
59
60
  ### 4. GENERATE SPEC
60
61
 
@@ -221,6 +221,8 @@ Objective: ... | Approach: ... | Why it worked: ... | Files: ...
221
221
  - Don't auto-fix — add fix tasks to PLAN.md, then `/df:execute --continue`
222
222
  - Capture learnings for significant approaches
223
223
  - **Terse output** — Output ONLY the compact report format (section 3)
224
+ - **No LSP diagnostics** — Use ONLY build/test command exit codes and output for L0/L4. Do NOT use the LSP tool to collect TypeScript diagnostics — worktree environments have incomplete `node_modules` symlinks that produce false-positive module-resolution errors (2307, 2875). If the build command exits 0, L0 passes — do not second-guess it with LSP.
225
+ - **No narration of false positives** — Never output diagnostics and then explain they are false positives. If you know they are false positives, suppress them entirely. Wasted output tokens cost money.
224
226
 
225
227
  ## Post-Verification: Worktree Merge & Cleanup
226
228
 
@@ -0,0 +1,205 @@
1
+ ---
2
+ name: repo-inspect
3
+ description: Produces structured JSON intelligence for a remote GitHub repo — fetches metadata and file tree via gh api, reads key files via WebFetch. No local clone. Use when evaluating an unfamiliar repo before planning integration work.
4
+ context: fork
5
+ allowed-tools: [Bash, WebFetch]
6
+ ---
7
+
8
+ # Repo-Inspect
9
+
10
+ Inspect a GitHub repository and emit a single JSON object describing its architecture. No clones, no tmpdir, no local filesystem writes.
11
+
12
+ **Input:** `{owner}/{repo}` or a full GitHub URL (e.g., `https://github.com/owner/repo`).
13
+ **Output:** Raw JSON only — no markdown, no commentary.
14
+
15
+ ---
16
+
17
+ ## Protocol
18
+
19
+ ### Step 0 — Parse Input
20
+
21
+ Strip `https://github.com/` prefix if present. Extract `{owner}` and `{repo}` from the remaining `owner/repo` string.
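Step 0 can be sketched with plain parameter expansion. The function name `parse_repo` is hypothetical, and rejecting input without a `/` is an added assumption that the skill text does not spell out.

```bash
#!/usr/bin/env bash
# Hypothetical helper for Step 0; not part of the skill itself.

# $1: "owner/repo" or a full https://github.com/ URL. stdout: "<owner> <repo>".
parse_repo() {
  local input="${1#https://github.com/}"   # drop the URL prefix if present
  case "$input" in
    */*) ;;                                 # needs at least owner/repo
    *) echo "expected owner/repo, got: $1" >&2; return 1 ;;
  esac
  local owner="${input%%/*}"
  local rest="${input#*/}"
  local repo="${rest%%/*}"                  # ignore any extra path segments
  printf '%s %s\n' "$owner" "$repo"
}

parse_repo https://github.com/owner/repo
parse_repo owner/repo
```

Both calls print the same pair, which can then be substituted into the `gh api repos/{owner}/{repo}` call in Step 1.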
22
+
23
+ ### Step 1 — Fetch Repo Metadata (1 Bash call)
24
+
25
+ ```bash
26
+ gh api repos/{owner}/{repo}
27
+ ```
28
+
29
+ Extract: `description`, `language`, `topics`, `default_branch`, `stargazers_count`, `forks_count`.
30
+
31
+ On error (non-zero exit or JSON with `message` field indicating 404/403):
32
+ ```json
33
+ {"error": "api_failed", "message": "<gh api error text>"}
34
+ ```
35
+ Stop and return this error JSON immediately.
36
+
37
+ ### Step 2 — Fetch Full File Tree (1 Bash call)
38
+
39
+ ```bash
40
+ gh api "repos/{owner}/{repo}/git/trees/{default_branch}?recursive=1"
41
+ ```
42
+
43
+ Parse `tree[]` array. Each item has: `path`, `type` (`blob`|`tree`), `size`.
44
+
45
+ If tree is truncated (`truncated: true`), note it but proceed — the tree API returns up to ~100K entries which covers virtually all repos.
46
+
47
+ ### Step 3 — Language Detection
48
+
49
+ Scan tree paths for manifest files in priority order:
50
+
51
+ | Manifest | Language |
52
+ |---|---|
53
+ | `Cargo.toml` | Rust |
54
+ | `package.json` | JavaScript/TypeScript |
55
+ | `pyproject.toml` or `setup.py` or `requirements.txt` | Python |
56
+ | `go.mod` | Go |
57
+ | `pom.xml` or `build.gradle` | Java |
58
+ | `mix.exs` | Elixir |
59
+ | `Gemfile` | Ruby |
60
+ | `build.zig` | Zig |
61
+ | `CMakeLists.txt` | C/C++ |
62
+
63
+ Use the **first match** (highest priority). If no manifest found, fall back to `language` field from Step 1 metadata.
64
+
65
+ Record: `detected_language`, `manifest_path` (path of matched manifest, or null).
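The priority scan in Step 3 can be sketched as a loop over the manifest table in order, returning on the first hit. The function `detect_language` is hypothetical, and matching a manifest by file name at any depth (rather than only at the repo root) is an assumption, since the skill text does not say which is intended.

```bash
#!/usr/bin/env bash
# Sketch; matching manifests by basename at any depth is an assumption.

# stdin: one tree path per line. stdout: "<language><TAB><manifest_path>".
# Returns 1 when no manifest is found (caller falls back to Step 1 metadata).
detect_language() {
  local paths manifest lang match
  paths="$(cat)"
  while IFS='|' read -r manifest lang; do
    # First path whose final segment equals the manifest name exactly.
    match="$(printf '%s\n' "$paths" |
      awk -v m="$manifest" '{ n = split($0, parts, "/"); if (parts[n] == m) { print; exit } }')"
    if [ -n "$match" ]; then
      printf '%s\t%s\n' "$lang" "$match"
      return 0
    fi
  done <<'EOF'
Cargo.toml|Rust
package.json|JavaScript/TypeScript
pyproject.toml|Python
setup.py|Python
requirements.txt|Python
go.mod|Go
pom.xml|Java
build.gradle|Java
mix.exs|Elixir
Gemfile|Ruby
build.zig|Zig
CMakeLists.txt|C/C++
EOF
  return 1
}

printf '%s\n' README.md package.json crates/core/Cargo.toml | detect_language
```

Because the table is walked top to bottom, a repo containing both `package.json` and `Cargo.toml` resolves to Rust, as the priority order requires.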
66
+
67
+ ### Step 4 — File Selection (3–6 files)
68
+
69
+ Build a prioritized list of files to fetch. Select 3–6 total:
70
+
71
+ 1. **README** — find `README.md` or `README.rst` or `README` in tree root (depth 0). Always include if present.
72
+ 2. **Manifest** — the manifest file detected in Step 3. Always include if present.
73
+ 3. **Primary entry point** — search tree for (in order): `src/main.*`, `src/lib.*`, `src/index.*`, `index.*`, `app.*`, `main.*`. Pick the first match at the shallowest depth.
74
+ 4. **Supplemental files** — from remaining blobs: prefer shallowest paths, then largest `size`. Pick source files (`.rs`, `.ts`, `.js`, `.py`, `.go`, `.java`, `.ex`, `.rb`, `.zig`, `.c`, `.cpp`, `.h`). Fill up to 6 total.
75
+
76
+ For monorepos (detected when tree contains `packages/*/`, `crates/*/`, `apps/*/` directories, or manifest workspace field): select 1-2 representative sub-package manifests/entry points instead of generic supplemental files.
77
+
78
+ ### Step 5 — Fetch File Contents (3–6 WebFetch calls)
79
+
80
+ For each selected file path, fetch:
81
+
82
+ ```
83
+ https://raw.githubusercontent.com/{owner}/{repo}/{default_branch}/{path}
84
+ ```
85
+
86
+ Use WebFetch. If a fetch fails (404 or network error), skip that file and note it. Do not retry.
87
+
88
+ Collect: list of `{path, content}` pairs for all successfully fetched files.
89
+
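As an illustration, the raw URL for each selected file might be built like this. The helper name `raw_url` is hypothetical, and percent-encoding the branch and path is an assumption added so that paths with spaces still form a valid URL.

```python
from urllib.parse import quote


def raw_url(owner: str, repo: str, default_branch: str, path: str) -> str:
    """Build the raw.githubusercontent.com URL for one file in the tree."""
    # Assumption: encode unsafe characters but keep "/" so nested paths survive.
    branch = quote(default_branch, safe="/")
    file_path = quote(path, safe="/")
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{file_path}"
```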
90
+ ### Step 6 — Extract Intelligence from Fetched Content
91
+
92
+ From manifest content (if fetched):
93
+
94
+ - **dependencies_count**: Count entries in `[dependencies]` (Cargo.toml), `dependencies` + `devDependencies` keys (package.json), `[tool.poetry.dependencies]` (pyproject.toml), `require` directives (go.mod/Gemfile), `<dependency>` tags (pom.xml). Use 0 if manifest not fetched.
95
+ - **test_framework**: Check dev-dependencies for known test frameworks:
96
+ - JS/TS: `jest`, `vitest`, `mocha`, `jasmine`, `tap`, `ava`
97
+ - Python: `pytest`, `unittest` (stdlib), `nose`
98
+ - Rust: built-in (`#[test]`), `rstest`, `proptest`
99
+ - Go: built-in (`testing` package)
100
+ - Java: `junit`, `testng`
101
+ - Ruby: `rspec`, `minitest`
102
+ - Elixir: built-in (`ExUnit`)
103
+ Also check tree for `test/`, `tests/`, `spec/`, `__tests__/` directories as corroboration.
104
+ - **monorepo**: true if tree contains at least 2 of `packages/`, `crates/`, `apps/`, `libs/` top-level dirs, OR if manifest has workspace/workspaces field.
105
+
106
+ From README content (if fetched):
107
+ - Extract the first non-heading paragraph as a candidate for `purpose`. Trim to ≤ 200 chars.
108
+
109
+ Fallback for `purpose`: use repo `description` from Step 1 metadata.
110
+
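For the `package.json` case only, the dependency count described in this step could look like the sketch below. The function name is hypothetical, and other manifest formats would need their own parsers.

```python
import json


def count_package_json_deps(content: str) -> int:
    """Count `dependencies` plus `devDependencies` entries in package.json text."""
    data = json.loads(content)
    deps = data.get("dependencies") or {}
    dev_deps = data.get("devDependencies") or {}
    return len(deps) + len(dev_deps)
```

When the manifest was not fetched at all, the protocol says to use 0 rather than call a parser.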
111
+ ### Step 7 — Derive key_modules
112
+
113
+ From the tree blob paths, identify directories containing 2+ source files (files with extensions `.rs`, `.ts`, `.js`, `.tsx`, `.jsx`, `.py`, `.go`, `.java`, `.ex`, `.rb`, `.zig`, `.c`, `.cpp`, `.h`, `.swift`, `.kt`).
114
+
115
+ Algorithm:
116
+ 1. For each blob, extract parent directory path.
117
+ 2. Count source files per directory.
118
+ 3. Keep directories with count >= 2.
119
+ 4. Sort by file count descending, then by path depth ascending (shallower = more significant).
120
+ 5. Take up to 10 modules.
121
+ 6. Strip prefixes common to all modules (e.g., if every module sits under `src/`, drop the `src/` prefix from each path but still keep `src/` itself as a module).
122
+
123
+ Return directory names (last path segment) for the `key_modules` array. If fewer than 3 candidate directories exist, include directories with 1 source file to reach 3, or return what's available.
124
+
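A rough sketch of algorithm steps 1 through 5, assuming `tree` is the parsed `tree[]` array from Step 2. The name `key_modules` is hypothetical. Skipping root-level files is an assumption, and the sketch deliberately leaves out step 6 and the fewer-than-3 fallback.

```python
from collections import Counter

SOURCE_EXTS = {
    ".rs", ".ts", ".js", ".tsx", ".jsx", ".py", ".go", ".java",
    ".ex", ".rb", ".zig", ".c", ".cpp", ".h", ".swift", ".kt",
}


def key_modules(tree, limit=10):
    """Return last path segments of directories holding 2+ source files."""
    counts = Counter()
    for item in tree:
        if item.get("type") != "blob":
            continue
        parent, _, name = item["path"].rpartition("/")
        dot = name.rfind(".")
        ext = name[dot:] if dot > 0 else ""
        # Assumption: files in the repo root (empty parent) are not a module.
        if parent and ext in SOURCE_EXTS:
            counts[parent] += 1
    dirs = [d for d, n in counts.items() if n >= 2]
    # File count descending, then depth ascending, then path for stable output.
    dirs.sort(key=lambda d: (-counts[d], d.count("/"), d))
    return [d.rsplit("/", 1)[-1] for d in dirs[:limit]]
```

Two directories with the same last segment (say `a/utils` and `b/utils`) would both appear as `utils`; whether to deduplicate is left open here.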
125
+ ### Step 8 — Derive concepts_applicable
126
+
127
+ Based on language, test framework, monorepo status, and key module names, suggest applicable engineering concepts. Examples:
128
+
129
+ - Monorepo → `"workspace-management"`, `"cross-package-testing"`
130
+ - Rust → `"ownership-model"`, `"cargo-workspace"` (if monorepo)
131
+ - TypeScript → `"type-safety"`, `"module-resolution"`
132
+ - Has `auth` module → `"authentication-patterns"`
133
+ - Has `db` or `models` module → `"data-modeling"`
134
+ - Has `api` or `routes` module → `"rest-api-design"`
135
+ - Has tests → `"tdd"` or `"bdd"` (if rspec/jasmine)
136
+
137
+ Limit to 3–7 concepts. These are suggestions for the caller — not exhaustive.
138
+
139
+ ### Step 9 — Confidence Score
140
+
141
+ Set `confidence` based on data quality:
142
+
143
+ | Condition | Confidence |
144
+ |---|---|
145
+ | README + manifest + entry point all fetched | `high` |
146
+ | README or manifest fetched, but not both | `medium` |
147
+ | Neither README nor manifest fetched | `low` |
148
+
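The table can be read as a small decision function. The sketch below uses a hypothetical `confidence` helper. The table does not cover the case where README and manifest are both fetched but the entry point is not, so treating it as `medium` is an assumption.

```python
def confidence(readme_fetched: bool, manifest_fetched: bool, entry_point_fetched: bool) -> str:
    """Map fetch results to the confidence label from the table above."""
    if readme_fetched and manifest_fetched and entry_point_fetched:
        return "high"
    if readme_fetched or manifest_fetched:
        # Assumption: README + manifest without an entry point also lands here.
        return "medium"
    return "low"
```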
149
+ ### Step 10 — Emit JSON Output
150
+
151
+ Output **exactly one JSON object** with no surrounding text, no markdown code fences, no comments:
152
+
153
+ ```json
154
+ {
155
+ "repo": "{owner}/{repo}",
156
+ "purpose": "<first non-heading README paragraph or repo description, ≤200 chars>",
157
+ "architecture": {
158
+ "language": "<detected language>",
159
+ "entry_points": ["<relative paths of main/lib/index files>"],
160
+ "key_modules": ["<directory names with 2+ source files>"],
161
+ "dependencies_count": 0,
162
+ "test_framework": "<framework name or 'unknown'>"
163
+ },
164
+ "concepts_applicable": ["<concept1>", "<concept2>"],
165
+ "files_inspected": ["<path1>", "<path2>"],
166
+ "confidence": "high|medium|low"
167
+ }
168
+ ```
169
+
170
+ **Critical:** The very last thing you output must be this JSON object and nothing else. Do not wrap in code blocks. Do not add explanation.
171
+
172
+ ---
173
+
174
+ ## Error Handling
175
+
176
+ | Scenario | Action |
177
+ |---|---|
178
+ | `gh api` returns non-zero exit for metadata | Return `{"error": "api_failed", "message": "<stderr>"}` and stop |
179
+ | `gh api` returns 404 JSON | Return `{"error": "api_failed", "message": "Repository not found or not accessible"}` |
180
+ | Tree fetch fails | Return `{"error": "tree_failed", "message": "<stderr>"}` and stop |
181
+ | All WebFetch calls fail | Set confidence to "low", proceed with tree-only analysis |
182
+ | Single WebFetch fails | Skip file, continue |
183
+
184
+ ---
185
+
186
+ ## Efficiency Budget
187
+
188
+ - `gh api` calls: exactly 2 (metadata + tree)
189
+ - WebFetch calls: 3–6 (selected files)
190
+ - Analysis steps: ~5 (no extra Bash calls needed)
191
+ - **Total tool calls: ≤ 20**
192
+ - **Wall time: ≤ 60s**
193
+ - **Tokens: ≤ 30K**
194
+
195
+ Do not make extra `gh api` calls. Do not fetch files not in the selection list. The tree endpoint returns all paths in one call — no Glob, no Read, no additional listing needed.
196
+
197
+ ---
198
+
199
+ ## Rules
200
+
201
+ - Never write to local filesystem (no `> file`, no `mktemp`, no `git clone`).
202
+ - Never use Read, Glob, or Grep tools — this skill operates on remote data only.
203
+ - Output raw JSON only; the caller parses it as data and does not read it as prose.
204
+ - Private repos work automatically via `gh auth` stored token.
205
+ - `context: fork` means this skill's token usage doesn't pollute the caller's context.
@@ -96,6 +96,9 @@ quality:
96
96
  # Timeout in seconds to wait for the dev server to become ready (default: 30)
97
97
  browser_timeout: 30
98
98
 
99
+ # Minimum quality score threshold for harness verification (0.0-1.0, default: 0.6)
100
+ harness_min_score: 0.6
101
+
99
102
  # Ratchet configuration for /df:verify health gate
100
103
  # Ratchet snapshots baseline metrics (tests passing, coverage, type checks) before execution
101
104
  # and ensures subsequent runs don't regress. These overrides control which commands ratchet monitors.
@@ -43,6 +43,23 @@
43
43
 
44
44
  - [Explicitly excluded: e.g., "Video upload is NOT included"]
45
45
 
46
+ ## Domain Model
47
+
48
+ <!-- Optional. Define the core entities and vocabulary. -->
49
+
50
+ ### Key Types
51
+
52
+ ```typescript
53
+ // Core domain types and entities
54
+ ```
55
+
56
+ ### Ubiquitous Language
57
+
58
+ - **Term**: Definition
59
+ - **Term**: Definition
60
+
61
+ _Note: Keep to max 15 terms for clarity._
62
+
46
63
  ## Acceptance Criteria
47
64
 
48
65
  - [ ] [Testable criterion: e.g., "User can upload jpg/png/webp files"]