deepflow 0.1.81 → 0.1.82

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "deepflow",
- "version": "0.1.81",
+ "version": "0.1.82",
  "description": "Doing reveals what thinking can't predict — spec-driven iterative development for Claude Code",
  "keywords": [
  "claude",
@@ -20,8 +20,8 @@ Each task = one background agent. Completion notifications drive the loop.
  3. On EACH notification:
  a. Run ratchet check (section 5.5)
  b. Passed → TaskUpdate(status: "completed"), update PLAN.md [x] + commit hash
- c. Failed → git revert HEAD --no-edit, TaskUpdate(status: "pending")
- d. Report ONE line: "✓ T1: ratchet passed (abc123)" or "✗ T1: ratchet failed, reverted"
+ c. Failed → run partial salvage protocol (section 5.5). If salvaged → treat as passed. If not → git revert, TaskUpdate(status: "pending")
+ d. Report ONE line: "✓ T1: ratchet passed (abc123)" or "⚕ T1: salvaged lint fix (abc124)" or "✗ T1: ratchet failed, reverted"
  e. NOT all done → end turn, wait | ALL done → next wave or finish
  4. Between waves: check context %. If ≥50% → checkpoint and exit.
  5. Repeat until: all done, all blocked, or context ≥50%.
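The per-notification outcomes in step 3 can be sketched as a small mapping from ratchet result to task status and report line. This is a minimal illustration, not the package's code: `RatchetResult`, `report_line`, and `next_status` are hypothetical names, and only the three report formats come from the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RatchetResult:
    """Hypothetical container for the outcome of one ratchet check."""
    passed: bool                   # ratchet passed on the agent's own commit
    salvaged: bool = False         # partial salvage protocol rescued the commit
    commit: Optional[str] = None   # short hash of the commit that stands
    detail: str = ""               # what was salvaged, e.g. "lint fix"


def report_line(task_id: str, result: RatchetResult) -> str:
    """Build the single report line required by step 3d."""
    if result.salvaged:
        return f"⚕ {task_id}: salvaged {result.detail} ({result.commit})"
    if result.passed:
        return f"✓ {task_id}: ratchet passed ({result.commit})"
    return f"✗ {task_id}: ratchet failed, reverted"


def next_status(result: RatchetResult) -> str:
    """A salvaged task is treated as passed; anything else goes back to pending."""
    return "completed" if (result.passed or result.salvaged) else "pending"
```

For example, `report_line("T1", RatchetResult(passed=False))` produces the reverted line, and `next_status` for the same result is `"pending"`.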
@@ -57,6 +57,14 @@ Require clean HEAD (`git diff --quiet`). Derive SPEC_NAME from `specs/doing-*.md
  Create worktree: `.deepflow/worktrees/{spec}` on branch `df/{spec}`.
  Reuse if exists. `--fresh` deletes first.
 
+ If `worktree.sparse_paths` is non-empty in config, enable sparse checkout:
+ ```bash
+ git worktree add --no-checkout -b df/{spec} .deepflow/worktrees/{spec}
+ cd .deepflow/worktrees/{spec}
+ git sparse-checkout set {sparse_paths...}
+ git checkout df/{spec}
+ ```
+
  ### 1.6. RATCHET SNAPSHOT
 
  Snapshot pre-existing test files in worktree — only these count for ratchet (agent-created tests excluded):
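As a rough sketch of the sparse-checkout branch added in this hunk, the function below builds the git command lists for a worktree. The function name is hypothetical, `git -C <path>` stands in for the `cd` step, and the plain `git worktree add -b` used for the empty case is an assumption, since the diff does not show the full-checkout command.

```python
from typing import List


def worktree_commands(spec: str, sparse_paths: List[str]) -> List[List[str]]:
    """Return the git invocations for creating the spec worktree (illustrative only)."""
    path = f".deepflow/worktrees/{spec}"
    branch = f"df/{spec}"
    if not sparse_paths:
        # Assumed full-checkout form; the source only says "Create worktree".
        return [["git", "worktree", "add", "-b", branch, path]]
    return [
        # Create the worktree without populating files yet.
        ["git", "worktree", "add", "--no-checkout", "-b", branch, path],
        # Restrict the working tree to the configured paths.
        ["git", "-C", path, "sparse-checkout", "set", *sparse_paths],
        # Populate only the sparse paths.
        ["git", "-C", path, "checkout", branch],
    ]
```

Each inner list can be passed to `subprocess.run(cmd, check=True)`; keeping the commands as data makes the two branches easy to compare.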
@@ -136,7 +144,15 @@ Run Build → Test → Typecheck → Lint (stop on first failure).
  Compare `git diff HEAD~1 --name-only` against Impact callers/duplicates list.
  File listed but not modified → **advisory warning**: "Impact gap: {file} listed as {caller|duplicate} but not modified — verify manually". Not auto-revert (callers sometimes don't need changes), but flags the risk.
 
- **Evaluate:** All pass + no violations → commit stands. Any failure → `git revert HEAD --no-edit`.
+ **Evaluate:** All pass + no violations → commit stands. Any failure → attempt partial salvage before reverting:
+
+ **Partial salvage protocol:**
+ 1. Run `git diff HEAD~1 --stat` to see what the agent changed
+ 2. If failure is lint-only or typecheck-only (build + tests passed):
+ - Spawn `Agent(model="haiku", subagent_type="general-purpose")` with prompt: `Fix the {lint|typecheck} errors in the worktree. Only fix what's broken, change nothing else. Files changed: {diff stat}. Error output: {error}`
+ - Run ratchet again on the fix commit
+ - If passes → both commits stand. If fails → `git revert HEAD --no-edit && git revert HEAD --no-edit` (revert both)
+ 3. If failure is build or test → `git revert HEAD --no-edit` (no salvage, too risky)
 
  Ratchet uses ONLY pre-existing test files from `.deepflow/auto-snapshot.txt`.
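The salvage decision can be sketched as a function of the first failing stage. All names here are hypothetical. One caveat on the added text: running `git revert HEAD` twice in a row reverts the first revert commit, which restores the fix rather than undoing both commits. The sketch therefore assumes the intent is to undo the agent commit and the fix commit, and uses a commit range for that.

```python
from typing import List, Optional

# Order in which the ratchet runs; it stops on the first failure.
STAGES = ["build", "test", "typecheck", "lint"]
# Only these late-stage failures are considered safe to salvage.
SALVAGEABLE = {"typecheck", "lint"}


def plan_after_ratchet(failed_stage: Optional[str]) -> str:
    """Map the first failing stage to 'keep', 'salvage', or 'revert'."""
    if failed_stage is None:
        return "keep"
    if failed_stage not in STAGES:
        raise ValueError(f"unknown stage: {failed_stage}")
    return "salvage" if failed_stage in SALVAGEABLE else "revert"


def revert_command(commits_to_undo: int) -> List[str]:
    """Build a revert for the newest N commits (N=2 after a failed salvage)."""
    if commits_to_undo < 1:
        raise ValueError("commits_to_undo must be >= 1")
    if commits_to_undo == 1:
        return ["git", "revert", "--no-edit", "HEAD"]
    # A range reverts each commit in it, newest first (assumed intent of "revert both").
    return ["git", "revert", "--no-edit", f"HEAD~{commits_to_undo}..HEAD"]
```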
142
158
 
@@ -188,45 +204,75 @@ Trigger: ≥2 [SPIKE] tasks with same "Blocked by:" target or identical hypothes
 
  ### 6. PER-TASK (agent prompt)
 
+ > **Context engineering rationale:** Prompt order follows the attention U-curve (start/end = high attention, middle = low).
+ > Critical instructions go at start and end. Navigable data goes in the middle.
+ > See: Chroma "Context Rot" (2025) — performance degrades ~2%/100K tokens; distractors and semantic ambiguity compound degradation.
+
  **Common preamble (include in all agent prompts):**
  ```
  Working directory: {worktree_absolute_path}
  All file operations MUST use this absolute path as base. Do NOT write files to the main project directory.
  Commit format: {commit_type}({spec}): {description}
-
- {If .deepflow/auto-memory.yaml exists and has probe_learnings, include:}
- Spike results (follow these approaches):
- {each probe_learning with outcome "winner" → "- {insight}"}
- {Omit this block if no probe_learnings exist.}
-
- STOP after committing. Do NOT merge branches, rename spec files, remove worktrees, or run git checkout on main.
  ```
 
  **Standard Task** (spawn with `Agent(model="{Model from PLAN.md}", ...)`):
+
+ Prompt sections in order (START = high attention, MIDDLE = navigable data, END = high attention):
+
  ```
+ --- START (high attention zone) ---
+
  {task_id}: {description from PLAN.md}
  Files: {target files} Spec: {spec_name}
- {Impact block from PLAN.md — include verbatim if present}
 
  {Prior failure context — include ONLY if task was previously reverted. Read from .deepflow/auto-memory.yaml revert_history for this task_id:}
- Previous attempts (DO NOT repeat these approaches):
- - Cycle {N}: reverted — "{reason from revert_history}"
+ DO NOT repeat these approaches:
  - Cycle {N}: reverted — "{reason from revert_history}"
  {Omit this entire block if task has no revert history.}
 
- CRITICAL: If Impact lists duplicates or callers, you MUST verify each one is consistent with your changes.
- - [active] duplicates → consolidate into single source of truth (e.g., local generateYAML → use shared buildConfigData)
- - [dead] duplicates → DELETE the dead code entirely. Dead code pollutes context and causes drift.
+ {Acceptance criteria excerpt — extract 2-3 key ACs from the spec file (specs/doing-*.md). Include only the criteria relevant to THIS task, not the full spec.}
+ Success criteria:
+ - {AC relevant to this task}
+ - {AC relevant to this task}
+ {Omit if spec has no structured ACs.}
+
+ --- MIDDLE (navigable data zone) ---
+
+ {Impact block from PLAN.md — include verbatim if present. Annotate each caller with WHY it's impacted:}
+ Impact:
+ - Callers: {file} ({why — e.g. "imports validateToken which you're changing"})
+ - Duplicates:
+ - {file} [active — consolidate]
+ - {file} [dead — DELETE]
+ - Data flow: {consumers}
+ {Omit if no Impact in PLAN.md.}
+
+ {Dependency context — for each completed blocker task, include a one-liner summary:}
+ Prior tasks:
+ - {dep_task_id}: {one-line summary of what changed — e.g. "refactored validateToken to async, changed signature (string) → (string, opts)"}
+ {Omit if task has no dependencies or all deps are bootstrap/spike tasks.}
 
  Steps:
  1. External APIs/SDKs → chub search "<library>" --json → chub get <id> --lang <lang> (skip if chub unavailable or internal code only)
- 2. Read ALL files in Impact before implementing — understand the full picture
- 3. Implement the task, updating all impacted files
- 4. Commit as feat({spec}): {description}
+ 2. LSP freshness check: run `findReferences` on each function/type you're about to change. If callers exist beyond the Impact list, add them to your scope before implementing.
+ 3. Read ALL files in Impact (+ any new callers from step 2) before implementing — understand the full picture
+ 4. Implement the task, updating all impacted files
+ 5. Commit as feat({spec}): {description}
+
+ --- END (high attention zone) ---
+
+ {If .deepflow/auto-memory.yaml exists and has probe_learnings, include:}
+ Spike results (follow these approaches):
+ {each probe_learning with outcome "winner" → "- {insight}"}
+ {Omit this block if no probe_learnings exist.}
 
+ If Impact lists duplicates: [active] → consolidate into single source of truth. [dead] → DELETE entirely.
  Your ONLY job is to write code and commit. Orchestrator runs health checks after.
+ STOP after committing. Do NOT merge branches, rename spec files, remove worktrees, or run git checkout on main.
  ```
 
+ **Effort-aware context budget:** For `Effort: low` tasks, omit the MIDDLE section entirely (no Impact, no dependency context, no steps). For `Effort: medium`, include Impact but omit dependency context. For `Effort: high`, include everything.
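The START/MIDDLE/END layout combined with the effort-aware budget can be sketched as follows. This is a minimal illustration under stated assumptions: `build_prompt` and its parameters are hypothetical, and the section contents are passed in as ready-made strings rather than rendered from PLAN.md.

```python
from typing import List, Optional

EFFORTS = ("low", "medium", "high")


def build_prompt(
    start: List[str],
    impact: Optional[str],
    deps: Optional[str],
    steps: Optional[str],
    end: List[str],
    effort: str = "medium",
) -> str:
    """Assemble a task prompt, trimming the MIDDLE zone according to effort."""
    if effort not in EFFORTS:
        raise ValueError(f"unknown effort: {effort}")

    middle: List[str] = []
    if effort != "low":
        if impact:
            middle.append(impact)   # medium and high include Impact
        if effort == "high" and deps:
            middle.append(deps)     # only high includes dependency context
        if steps:
            middle.append(steps)    # steps are dropped only for low effort

    sections = ["--- START (high attention zone) ---", *start]
    if middle:
        sections += ["--- MIDDLE (navigable data zone) ---", *middle]
    sections += ["--- END (high attention zone) ---", *end]
    return "\n\n".join(sections)
```

With `effort="low"` the MIDDLE header is omitted altogether, so critical instructions at the start and end sit next to each other.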
+
  **Bootstrap Task:**
  ```
  BOOTSTRAP: Write tests for files in edit_scope
@@ -242,7 +288,7 @@ Commit as test({spec}): bootstrap tests for edit_scope
  Files: {target files} Spec: {spec_name}
 
  {Prior failure context — include ONLY if this spike was previously reverted. Read from .deepflow/auto-memory.yaml revert_history + spike_insights for this task_id:}
- Previous attempts (DO NOT repeat these approaches):
+ DO NOT repeat these approaches:
  - Cycle {N}: reverted — "{reason}"
  {Omit this entire block if no revert history.}
 
@@ -282,14 +328,19 @@ When all tasks done for a `doing-*` spec:
  | Implementation | `general-purpose` | Task implementation |
  | Debugger | `reasoner` | Debugging failures |
 
- **Model routing:** Read `Model:` field from each task block in PLAN.md. Pass as `model:` parameter when spawning the agent. Default: `sonnet` if field is missing.
+ **Model + effort routing:** Read `Model:` and `Effort:` fields from each task block in PLAN.md. Pass `model:` parameter when spawning the agent. Prepend effort instruction to the agent prompt. Defaults: `Model: sonnet`, `Effort: medium`.
+
+ | Task fields | Agent call | Prompt preamble |
+ |-------------|-----------|-----------------|
+ | `Model: haiku, Effort: low` | `Agent(model="haiku", ...)` | `You MUST be maximally efficient: skip explanations, minimize tool calls, go straight to implementation.` |
+ | `Model: sonnet, Effort: medium` | `Agent(model="sonnet", ...)` | `Be direct and efficient. Explain only when the logic is non-obvious.` |
+ | `Model: opus, Effort: high` | `Agent(model="opus", ...)` | _(no preamble — default behavior)_ |
+ | (missing) | `Agent(model="sonnet", ...)` | `Be direct and efficient. Explain only when the logic is non-obvious.` |
 
- | Task field | Agent call |
- |------------|-----------|
- | `Model: haiku` | `Agent(model="haiku", ...)` |
- | `Model: sonnet` | `Agent(model="sonnet", ...)` |
- | `Model: opus` | `Agent(model="opus", ...)` |
- | (missing) | `Agent(model="sonnet", ...)` |
+ **Effort preamble rules:**
+ - `low` → Prepend efficiency instruction. Agent should make fewest possible tool calls.
+ - `medium` → Prepend balanced instruction. Agent skips preamble but explains non-obvious decisions.
+ - `high` → No preamble added. Agent uses full reasoning capabilities.
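The routing table and preamble rules amount to a lookup with defaults. The sketch below is illustrative only: `agent_call` and the dict-shaped task are assumptions, while the preamble strings and the `sonnet`/`medium` defaults are taken from the table above.

```python
from typing import Dict

# Preamble text per effort level, copied from the routing table.
PREAMBLES: Dict[str, str] = {
    "low": "You MUST be maximally efficient: skip explanations, minimize tool calls, go straight to implementation.",
    "medium": "Be direct and efficient. Explain only when the logic is non-obvious.",
    "high": "",  # no preamble, default behavior
}


def agent_call(task: Dict[str, str], prompt: str) -> Dict[str, str]:
    """Resolve model and prompt for one task, applying the documented defaults."""
    model = task.get("Model") or "sonnet"
    effort = task.get("Effort") or "medium"
    if effort not in PREAMBLES:
        raise ValueError(f"unknown effort: {effort}")
    preamble = PREAMBLES[effort]
    # Prepend the effort instruction only when one is defined.
    full_prompt = f"{preamble}\n\n{prompt}" if preamble else prompt
    return {"model": model, "prompt": full_prompt}
```

A task block with neither field resolves to `sonnet` with the balanced preamble, matching the `(missing)` row.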
 
  **Checkpoint schema:** `.deepflow/checkpoint.json` in worktree:
  ```json
@@ -75,13 +75,13 @@ Use `code-completeness` skill to search for: implementations matching spec requi
 
  For each file in a task's "Files:" list, find the full blast radius.
 
- **Search for:**
+ **Search for (prefer LSP, fallback to grep):**
 
- 1. **Callers:** `grep -r "{exported_function}" --include="*.{ext}" -l` — files that import/call what's being changed
+ 1. **Callers:** Use LSP `findReferences` / `incomingCalls` on each exported function/type being changed. Annotate each caller with WHY it's impacted (e.g. "imports validateToken which this task changes"). Fallback: `grep -r "{exported_function}" --include="*.{ext}" -l`
  2. **Duplicates:** Files with similar logic (same function name, same transformation). Classify:
  - `[active]` — used in production → must consolidate
  - `[dead]` — bypassed/unreachable → must delete
- 3. **Data flow:** If file produces/transforms data, find ALL consumers of that shape across languages
+ 3. **Data flow:** If file produces/transforms data, use LSP `outgoingCalls` to trace consumers. Fallback: grep across languages
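The grep fallback for callers can be approximated in a few lines. This is a hedged sketch, not the skill's implementation: `find_callers` is a hypothetical helper, and it does a plain substring match where `grep` would interpret the pattern as a regular expression.

```python
from pathlib import Path
from typing import List


def find_callers(root: str, symbol: str, ext: str) -> List[str]:
    """List files under root with the given extension that mention symbol.

    Roughly mirrors: grep -r "{symbol}" --include="*.{ext}" -l
    """
    hits: List[str] = []
    for path in sorted(Path(root).rglob(f"*.{ext}")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        if symbol in text:
            hits.append(path.relative_to(root).as_posix())
    return hits
```

The result is only a candidate list. Each hit still needs the "why it's impacted" annotation, which a text search cannot supply.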
 
  **Embed as `Impact:` block in each task:**
  ```markdown
@@ -133,24 +133,46 @@ Spawn `Task(subagent_type="reasoner", model="opus")`. Map each requirement to DO
 
  Priority: Dependencies → Impact → Risk
 
- ### 5.5. CLASSIFY MODEL PER TASK
+ ### 5.5. CLASSIFY MODEL + EFFORT PER TASK
 
- For each task, assign `Model:` based on complexity signals:
+ For each task, assign `Model:` and `Effort:` based on the routing matrix:
 
- | Model | When | Signals |
- |-------|------|---------|
- | `haiku` | Mechanical / low-risk | Single file, config changes, renames, formatting, browse-fetch, simple additions with clear pattern to follow |
- | `sonnet` | Standard implementation | Feature work, bug fixes, refactoring, multi-file changes with clear specs |
- | `opus` | High complexity | Architecture changes, complex multi-file refactors, ambiguous specs, unfamiliar APIs, >5 files in Impact |
+ #### Routing matrix
 
- **Decision inputs:**
- 1. **File count** — 1 file → likely haiku/sonnet, >5 files → sonnet/opus
- 2. **Impact blast radius** — many callers/duplicates raise complexity
- 3. **Spec clarity** — clear ACs with patterns → lower, ambiguous requirements → raise
- 4. **Type** — spikes always `sonnet` (need reasoning but scoped), bootstrap → `haiku`
- 5. **Has prior failures** — reverted tasks raise one level (min `sonnet`)
+ | Task type | Model | Effort | Rationale |
+ |-----------|-------|--------|-----------|
+ | Bootstrap (scaffold, config, rename) | `haiku` | `low` | Mechanical, pattern-following, zero ambiguity |
+ | browse-fetch (doc retrieval) | `haiku` | `low` | Just fetching and extracting, no reasoning |
+ | Single-file simple addition | `haiku` | `high` | Small scope but needs to get it right |
+ | Multi-file with clear specs | `sonnet` | `medium` | Standard work, specs remove need for deep thinking |
+ | Bug fix (clear repro) | `sonnet` | `medium` | Diagnosis done, just apply fix |
+ | Bug fix (unclear cause) | `sonnet` | `high` | Needs reasoning to find root cause |
+ | Spike / validation | `sonnet` | `high` | Scoped but needs reasoning to validate hypothesis |
+ | Feature work (well-specced) | `sonnet` | `medium` | Clear ACs reduce thinking overhead |
+ | Feature work (ambiguous ACs) | `opus` | `medium` | Needs intelligence but effort can be moderate with good specs |
+ | Refactor (>5 files, many callers) | `opus` | `medium` | Blast radius needs intelligence, patterns are repetitive |
+ | Architecture change | `opus` | `high` | High complexity + high ambiguity |
+ | Unfamiliar API integration | `opus` | `high` | Needs deep reasoning about unknown patterns |
+ | Retried after revert | _(raise one level)_ | `high` | Prior failure means harder than expected |
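One way to read the routing matrix is as a lookup table plus an escalation rule for retries. The sketch below assumes hypothetical task-type keys (the matrix uses prose labels, not identifiers) and assumes that "raise one level" caps at `opus`.

```python
from typing import Dict, Tuple

# Hypothetical keys for the matrix rows; values are (model, effort) from the table.
ROUTING: Dict[str, Tuple[str, str]] = {
    "bootstrap": ("haiku", "low"),
    "browse-fetch": ("haiku", "low"),
    "single-file-addition": ("haiku", "high"),
    "multi-file-clear-specs": ("sonnet", "medium"),
    "bugfix-clear-repro": ("sonnet", "medium"),
    "bugfix-unclear-cause": ("sonnet", "high"),
    "spike": ("sonnet", "high"),
    "feature-well-specced": ("sonnet", "medium"),
    "feature-ambiguous-acs": ("opus", "medium"),
    "refactor-wide": ("opus", "medium"),
    "architecture-change": ("opus", "high"),
    "unfamiliar-api": ("opus", "high"),
}

MODEL_LEVELS = ["haiku", "sonnet", "opus"]


def classify(task_type: str, retried: bool = False) -> Tuple[str, str]:
    """Return (model, effort); unknown types fall back to the documented defaults."""
    model, effort = ROUTING.get(task_type, ("sonnet", "medium"))
    if retried:
        # Raise the model one level (capped at opus, an assumption) and force high effort.
        level = min(MODEL_LEVELS.index(model) + 1, len(MODEL_LEVELS) - 1)
        model, effort = MODEL_LEVELS[level], "high"
    return model, effort
```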
 
+ #### Decision inputs
+
+ 1. **File count** — 1 file → haiku/sonnet, 2-5 → sonnet, >5 → sonnet/opus
+ 2. **Impact blast radius** — many callers/duplicates → raise model
+ 3. **Spec clarity** — clear ACs → lower effort, ambiguous → raise effort
+ 4. **Type** — spikes → `sonnet high`, bootstrap → `haiku low`
+ 5. **Has prior failures** — raise model one level AND set effort to `high`
+ 6. **Repetitiveness** — repetitive pattern across files → lower effort even at higher model
+
+ #### Effort economics
+
+ Effort controls ALL token spend (text, tool calls, thinking). Lower effort = fewer tool calls, less preamble, shorter reasoning.
+
+ - `low` → ~60-70% token reduction vs high. Use when task is mechanical.
+ - `medium` → ~30-40% token reduction. Use when specs are clear.
+ - `high` → full spend (default). Use when ambiguity or risk is high.
+
+ Add `Model: haiku|sonnet|opus` and `Effort: low|medium|high` to each task block. Defaults: `Model: sonnet`, `Effort: medium`.
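To make the effort economics concrete, the following sketch converts the approximate reduction ranges quoted above into a token-spend range. The percentages are the document's rough figures, not measurements, and `estimated_spend` is a hypothetical helper.

```python
from typing import Dict, Tuple

# Approximate reduction vs. high effort, in percent (min, max), as quoted above.
REDUCTION_PCT: Dict[str, Tuple[int, int]] = {
    "low": (60, 70),
    "medium": (30, 40),
    "high": (0, 0),
}


def estimated_spend(high_effort_tokens: int, effort: str) -> Tuple[int, int]:
    """Return the (min, max) expected tokens for a task that costs
    high_effort_tokens at high effort."""
    min_cut, max_cut = REDUCTION_PCT[effort]
    # Integer arithmetic keeps the estimate free of float rounding noise.
    lowest = high_effort_tokens * (100 - max_cut) // 100
    highest = high_effort_tokens * (100 - min_cut) // 100
    return lowest, highest
```

Under these figures, a task that would spend 100,000 tokens at high effort lands between 30,000 and 40,000 at low effort and between 60,000 and 70,000 at medium.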
 
  ### 6. GENERATE SPIKE TASKS (IF NEEDED)
 
@@ -62,6 +62,12 @@ worktree:
  # Keep worktree after failed execution for debugging
  cleanup_on_fail: false
 
+ # Sparse checkout paths (for large monorepos only)
+ # When set, worktrees checkout only these directories via git sparse-checkout
+ # Leave empty for full checkout (default, works for most repos)
+ # Example: ["src/", "tests/", "package.json", "tsconfig.json"]
+ sparse_paths: []
+
  # Quality gates for /df:verify
  quality:
  # Override auto-detected build command (e.g., "npm run build", "cargo build")