deepflow 0.1.80 → 0.1.81

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "deepflow",
-  "version": "0.1.80",
+  "version": "0.1.81",
   "description": "Doing reveals what thinking can't predict — spec-driven iterative development for Claude Code",
   "keywords": [
     "claude",
@@ -153,9 +153,17 @@ Trigger: ≥2 [SPIKE] tasks with same "Blocked by:" target or identical hypothes
    - Rank: fewer regressions > higher coverage_delta > fewer files_changed > first to complete
    - No passes → reset all to pending for retry with debugger
 6. **Preserve all worktrees.** Losers: rename branch + `-failed` suffix. Record in checkpoint.json under `"spike_probes"`
-7. **Log failed probes** to `.deepflow/auto-memory.yaml` (main tree):
+7. **Log ALL probe outcomes** to `.deepflow/auto-memory.yaml` (main tree):
    ```yaml
    spike_insights:
+     - date: "YYYY-MM-DD"
+       spec: "{spec_name}"
+       spike_id: "SPIKE_A"
+       hypothesis: "{from PLAN.md}"
+       outcome: "winner"
+       approach: "{one-sentence summary of what the winning probe chose}"
+       ratchet_metrics: {regressions: N, coverage_delta: N, files_changed: N}
+       branch: "df/{spec}--probe-SPIKE_A"
      - date: "YYYY-MM-DD"
        spec: "{spec_name}"
        spike_id: "SPIKE_B"
@@ -165,12 +173,15 @@ Trigger: ≥2 [SPIKE] tasks with same "Blocked by:" target or identical hypothes
        ratchet_metrics: {regressions: N, coverage_delta: N, files_changed: N}
        worktree: ".deepflow/worktrees/{spec}/probe-SPIKE_B-failed"
        branch: "df/{spec}--probe-SPIKE_B-failed"
-   probe_learnings: # read by /df:auto-cycle each start
+   probe_learnings: # read by /df:auto-cycle each start AND included in per-task preamble
+     - spike: "SPIKE_A"
+       probe: "probe-SPIKE_A"
+       insight: "{one-sentence summary of winning approach — e.g. 'Use Node.js over Bun for Playwright'}"
      - spike: "SPIKE_B"
        probe: "probe-SPIKE_B"
        insight: "{one-sentence summary from failure_reason}"
    ```
-   Create file if missing. Preserve existing keys when merging.
+   Create file if missing. Preserve existing keys when merging. Log BOTH winners and losers — downstream tasks need to know what was chosen, not just what failed.
 8. **Promote winner:** Cherry-pick into shared worktree. Winner → `[x] [PROBE_WINNER]`, losers → `[~] [PROBE_FAILED]`. Resume standard loop.

 ---
@@ -183,10 +194,15 @@ Working directory: {worktree_absolute_path}
 All file operations MUST use this absolute path as base. Do NOT write files to the main project directory.
 Commit format: {commit_type}({spec}): {description}

+{If .deepflow/auto-memory.yaml exists and has probe_learnings, include:}
+Spike results (follow these approaches):
+{each probe_learning with outcome "winner" → "- {insight}"}
+{Omit this block if no probe_learnings exist.}
+
 STOP after committing. Do NOT merge branches, rename spec files, remove worktrees, or run git checkout on main.
 ```

-**Standard Task:**
+**Standard Task** (spawn with `Agent(model="{Model from PLAN.md}", ...)`):
 ```
 {task_id}: {description from PLAN.md}
 Files: {target files} Spec: {spec_name}
@@ -266,7 +282,14 @@ When all tasks done for a `doing-*` spec:
 | Implementation | `general-purpose` | Task implementation |
 | Debugger | `reasoner` | Debugging failures |

-**Model routing:** Use `model:` from command/agent/skill frontmatter. Default: `sonnet`.
+**Model routing:** Read `Model:` field from each task block in PLAN.md. Pass as `model:` parameter when spawning the agent. Default: `sonnet` if field is missing.
+
+| Task field | Agent call |
+|------------|-----------|
+| `Model: haiku` | `Agent(model="haiku", ...)` |
+| `Model: sonnet` | `Agent(model="sonnet", ...)` |
+| `Model: opus` | `Agent(model="opus", ...)` |
+| (missing) | `Agent(model="sonnet", ...)` |

 **Checkpoint schema:** `.deepflow/checkpoint.json` in worktree:
 ```json
@@ -133,6 +133,25 @@ Spawn `Task(subagent_type="reasoner", model="opus")`. Map each requirement to DO

 Priority: Dependencies → Impact → Risk

+### 5.5. CLASSIFY MODEL PER TASK
+
+For each task, assign `Model:` based on complexity signals:
+
+| Model | When | Signals |
+|-------|------|---------|
+| `haiku` | Mechanical / low-risk | Single file, config changes, renames, formatting, browse-fetch, simple additions with clear pattern to follow |
+| `sonnet` | Standard implementation | Feature work, bug fixes, refactoring, multi-file changes with clear specs |
+| `opus` | High complexity | Architecture changes, complex multi-file refactors, ambiguous specs, unfamiliar APIs, >5 files in Impact |
+
+**Decision inputs:**
+1. **File count** — 1 file → likely haiku/sonnet, >5 files → sonnet/opus
+2. **Impact blast radius** — many callers/duplicates → raise complexity
+3. **Spec clarity** — clear ACs with patterns → lower, ambiguous requirements → raise
+4. **Type** — spikes always `sonnet` (need reasoning but scoped), bootstrap → `haiku`
+5. **Has prior failures** — reverted tasks → raise one level (min `sonnet`)
+
+Add `Model: haiku|sonnet|opus` to each task block. Default: `sonnet` if unclear.
+
 ### 6. GENERATE SPIKE TASKS (IF NEEDED)

 **Spike Task Format:**
@@ -228,6 +247,7 @@ Always use `Task` tool with explicit `subagent_type` and `model`.

 - [ ] **T2**: Create upload endpoint
   - Files: src/api/upload.ts
+  - Model: sonnet
   - Impact:
     - Callers: src/routes/index.ts:5
     - Duplicates: backend/legacy-upload.go [dead — DELETE]
@@ -235,5 +255,6 @@ Always use `Task` tool with explicit `subagent_type` and `model`.

 - [ ] **T3**: Add S3 service with streaming
   - Files: src/services/storage.ts
+  - Model: opus
   - Blocked by: T1, T2
 ```
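The `Model:` routing shown in the hunks above can be sketched as a small lookup. Both `model_for` and the PLAN.md fragment below are hypothetical, for illustration only:

```shell
# Hypothetical PLAN.md fragment; extract each task's Model:, defaulting to sonnet.
cat > /tmp/plan-demo.md <<'EOF'
- [ ] **T2**: Create upload endpoint
  - Model: sonnet
- [ ] **T3**: Add S3 service with streaming
  - Model: opus
- [ ] **T4**: Rename config key
EOF

model_for() {
  # Print the Model: value in the lines following task "$1", or "sonnet" if absent.
  awk -v task="$1" '
    $0 ~ "\\*\\*" task "\\*\\*" { in_task = 1; next }
    in_task && /^- \[/ { in_task = 0 }          # next task block begins
    in_task && /Model:/ { print $NF; found = 1; exit }
    END { if (!found) print "sonnet" }
  ' /tmp/plan-demo.md
}

model_for T3   # opus
model_for T4   # sonnet (field missing, so the default applies)
```

This mirrors the routing table: an explicit `Model:` wins, and a missing field falls back to `sonnet`.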
@@ -29,8 +29,8 @@ This protocol is the reusable foundation for all browser-based skills (browse-fe
 Before launching, verify Playwright is available:

 ```bash
-# Prefer bun if available, fall back to node
-if which bun > /dev/null 2>&1; then RUNTIME=bun; else RUNTIME=node; fi
+# Prefer Node.js; fall back to Bun
+if which node > /dev/null 2>&1; then RUNTIME=node; elif which bun > /dev/null 2>&1; then RUNTIME=bun; else echo "Error: neither node nor bun found" && exit 1; fi

 $RUNTIME -e "require('playwright')" 2>/dev/null \
   || npx --yes playwright install chromium --with-deps 2>&1 | tail -5
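As a standalone sanity check, the new detection order can be exercised directly (a sketch of the snippet above; the result depends on the local toolchain):

```shell
# Prefer node, fall back to bun, else fail; mirrors the updated check.
if which node > /dev/null 2>&1; then
  RUNTIME=node
elif which bun > /dev/null 2>&1; then
  RUNTIME=bun
else
  echo "Error: neither node nor bun found" >&2
  exit 1
fi
echo "Detected runtime: $RUNTIME"
```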
@@ -41,8 +41,8 @@ If installation fails, fall back to WebFetch (see Fallback section below).
 ### 2. Launch Command

 ```bash
-# Detect runtime
-if which bun > /dev/null 2>&1; then RUNTIME=bun; else RUNTIME=node; fi
+# Detect runtime — prefer Node.js per decision
+if which node > /dev/null 2>&1; then RUNTIME=node; elif which bun > /dev/null 2>&1; then RUNTIME=bun; else echo "Error: neither node nor bun found" && exit 1; fi

 $RUNTIME -e "
 const { chromium } = require('playwright');
@@ -74,13 +74,100 @@ await page.waitForTimeout(1500);

 ### 4. Content Extraction

-Extract the main readable text, not raw HTML:
+Extract content as **structured Markdown** optimized for LLM consumption (not raw HTML or flat text).

 ```js
-// Primary: semantic content containers
-let text = await page.innerText('main, article, [role="main"]').catch(() => '');
+// Convert DOM to Markdown inside the browser context — zero dependencies
+let text = await page.evaluate(() => {
+  // Remove noise elements
+  const noise = 'nav, footer, header, aside, script, style, noscript, svg, [role="navigation"], [role="banner"], [role="contentinfo"], .cookie-banner, #cookie-consent';
+  document.querySelectorAll(noise).forEach(el => el.remove());
+
+  // Pick main content container
+  const root = document.querySelector('main, article, [role="main"]') || document.body;
+
+  function md(node, listDepth = 0) {
+    if (node.nodeType === 3) return node.textContent;
+    if (node.nodeType !== 1) return '';
+    const tag = node.tagName.toLowerCase();
+    const children = () => Array.from(node.childNodes).map(c => md(c, listDepth)).join('');
+
+    // Skip hidden elements
+    if (node.getAttribute('aria-hidden') === 'true' || node.hidden) return '';
+
+    switch (tag) {
+      case 'h1': case 'h2': case 'h3': case 'h4': case 'h5': case 'h6': {
+        const level = '#'.repeat(parseInt(tag[1]));
+        const text = node.textContent.trim();
+        return text ? '\n\n' + level + ' ' + text + '\n\n' : '';
+      }
+      case 'p': return '\n\n' + children().trim() + '\n\n';
+      case 'br': return '\n';
+      case 'hr': return '\n\n---\n\n';
+      case 'strong': case 'b': { const t = children().trim(); return t ? '**' + t + '**' : ''; }
+      case 'em': case 'i': { const t = children().trim(); return t ? '*' + t + '*' : ''; }
+      case 'code': {
+        const t = node.textContent;
+        return node.parentElement && node.parentElement.tagName.toLowerCase() === 'pre' ? t : '`' + t + '`';
+      }
+      case 'pre': {
+        const code = node.querySelector('code');
+        const lang = code ? (code.className.match(/language-(\w+)/)||[])[1] || '' : '';
+        const t = (code || node).textContent.trim();
+        return '\n\n```' + lang + '\n' + t + '\n```\n\n';
+      }
+      case 'a': {
+        const href = node.getAttribute('href');
+        const t = children().trim();
+        return (href && t && !href.startsWith('#')) ? '[' + t + '](' + href + ')' : t;
+      }
+      case 'img': {
+        const alt = node.getAttribute('alt') || '';
+        return alt ? '[image: ' + alt + ']' : '';
+      }
+      case 'ul': case 'ol': return '\n\n' + children() + '\n';
+      case 'li': {
+        const indent = ' '.repeat(listDepth);
+        const bullet = node.parentElement && node.parentElement.tagName.toLowerCase() === 'ol'
+          ? (Array.from(node.parentElement.children).indexOf(node) + 1) + '. '
+          : '- ';
+        const content = Array.from(node.childNodes).map(c => {
+          const t = c.tagName && (c.tagName.toLowerCase() === 'ul' || c.tagName.toLowerCase() === 'ol')
+            ? md(c, listDepth + 1) : md(c, listDepth);
+          return t;
+        }).join('').trim();
+        return indent + bullet + content + '\n';
+      }
+      case 'table': {
+        const rows = Array.from(node.querySelectorAll('tr'));
+        if (!rows.length) return '';
+        const matrix = rows.map(r => Array.from(r.querySelectorAll('th, td')).map(c => c.textContent.trim()));
+        const cols = Math.max(...matrix.map(r => r.length));
+        const widths = Array.from({length: cols}, (_, i) => Math.max(...matrix.map(r => (r[i]||'').length), 3));
+        let out = '\n\n';
+        matrix.forEach((row, ri) => {
+          out += '| ' + Array.from({length: cols}, (_, i) => (row[i]||'').padEnd(widths[i])).join(' | ') + ' |\n';
+          if (ri === 0) out += '| ' + widths.map(w => '-'.repeat(w)).join(' | ') + ' |\n';
+        });
+        return out + '\n';
+      }
+      case 'blockquote': return '\n\n> ' + children().trim().replace(/\n/g, '\n> ') + '\n\n';
+      case 'dl': return '\n\n' + children() + '\n';
+      case 'dt': return '**' + children().trim() + '**\n';
+      case 'dd': return ': ' + children().trim() + '\n';
+      case 'div': case 'section': case 'span': case 'figure': case 'figcaption':
+        return children();
+      default: return children();
+    }
+  }
+
+  let result = md(root);
+  // Collapse excessive whitespace
+  result = result.replace(/\n{3,}/g, '\n\n').trim();
+  return result;
+});

-// Fallback: full body text
+// Fallback if extraction is too short
 if (!text || text.trim().length < 100) {
   text = await page.innerText('body').catch(() => '');
 }
@@ -134,11 +221,13 @@ await browser.close();

 ## Fetch Workflow

-**Goal:** retrieve and return the text content of a single URL.
+**Goal:** retrieve and return structured Markdown content of a single URL.
+
+The full inline script uses `page.evaluate()` to convert DOM → Markdown inside the browser (zero Node dependencies). Adapt the URL per query.

 ```bash
-# Full inline script — adapt URL and selector per query
-if which bun > /dev/null 2>&1; then RUNTIME=bun; else RUNTIME=node; fi
+# Full inline script — adapt URL per query
+if which node > /dev/null 2>&1; then RUNTIME=node; elif which bun > /dev/null 2>&1; then RUNTIME=bun; else echo "Error: neither node nor bun found" && exit 1; fi

 $RUNTIME -e "
 const { chromium } = require('playwright');
@@ -157,14 +246,83 @@ const { chromium } = require('playwright');
 await page.waitForTimeout(1500);

 const title = await page.title();
-const url = page.url();
+const url = page.url();

 if (/sign.?in|log.?in|auth/i.test(title) || url.includes('/login')) {
   console.log('[browse-fetch] Blocked by login wall at ' + url);
   return;
 }

-let text = await page.innerText('main, article, [role=\"main\"]').catch(() => '');
+let text = await page.evaluate(() => {
+  const noise = 'nav, footer, header, aside, script, style, noscript, svg, [role=\"navigation\"], [role=\"banner\"], [role=\"contentinfo\"], .cookie-banner, #cookie-consent';
+  document.querySelectorAll(noise).forEach(el => el.remove());
+  const root = document.querySelector('main, article, [role=\"main\"]') || document.body;
+
+  function md(node, listDepth) {
+    listDepth = listDepth || 0;
+    if (node.nodeType === 3) return node.textContent;
+    if (node.nodeType !== 1) return '';
+    var tag = node.tagName.toLowerCase();
+    var kids = function() { return Array.from(node.childNodes).map(function(c) { return md(c, listDepth); }).join(''); };
+    if (node.getAttribute('aria-hidden') === 'true' || node.hidden) return '';
+    switch (tag) {
+      case 'h1': case 'h2': case 'h3': case 'h4': case 'h5': case 'h6':
+        var level = '#'.repeat(parseInt(tag[1]));
+        var t = node.textContent.trim();
+        return t ? '\\n\\n' + level + ' ' + t + '\\n\\n' : '';
+      case 'p': return '\\n\\n' + kids().trim() + '\\n\\n';
+      case 'br': return '\\n';
+      case 'hr': return '\\n\\n---\\n\\n';
+      case 'strong': case 'b': var s = kids().trim(); return s ? '**' + s + '**' : '';
+      case 'em': case 'i': var e = kids().trim(); return e ? '*' + e + '*' : '';
+      case 'code':
+        var ct = node.textContent;
+        return node.parentElement && node.parentElement.tagName.toLowerCase() === 'pre' ? ct : '\`' + ct + '\`';
+      case 'pre':
+        var codeEl = node.querySelector('code');
+        var lang = codeEl ? ((codeEl.className.match(/language-(\\w+)/) || [])[1] || '') : '';
+        var pt = (codeEl || node).textContent.trim();
+        return '\\n\\n\`\`\`' + lang + '\\n' + pt + '\\n\`\`\`\\n\\n';
+      case 'a':
+        var href = node.getAttribute('href');
+        var at = kids().trim();
+        return (href && at && !href.startsWith('#')) ? '[' + at + '](' + href + ')' : at;
+      case 'img':
+        var alt = node.getAttribute('alt') || '';
+        return alt ? '[image: ' + alt + ']' : '';
+      case 'ul': case 'ol': return '\\n\\n' + kids() + '\\n';
+      case 'li':
+        var indent = ' '.repeat(listDepth);
+        var bullet = node.parentElement && node.parentElement.tagName.toLowerCase() === 'ol'
+          ? (Array.from(node.parentElement.children).indexOf(node) + 1) + '. ' : '- ';
+        var content = Array.from(node.childNodes).map(function(c) {
+          var tg = c.tagName && c.tagName.toLowerCase();
+          return (tg === 'ul' || tg === 'ol') ? md(c, listDepth + 1) : md(c, listDepth);
+        }).join('').trim();
+        return indent + bullet + content + '\\n';
+      case 'table':
+        var rows = Array.from(node.querySelectorAll('tr'));
+        if (!rows.length) return '';
+        var matrix = rows.map(function(r) { return Array.from(r.querySelectorAll('th, td')).map(function(c) { return c.textContent.trim(); }); });
+        var cols = Math.max.apply(null, matrix.map(function(r) { return r.length; }));
+        var widths = Array.from({length: cols}, function(_, i) { return Math.max.apply(null, matrix.map(function(r) { return (r[i]||'').length; }).concat([3])); });
+        var out = '\\n\\n';
+        matrix.forEach(function(row, ri) {
+          out += '| ' + Array.from({length: cols}, function(_, i) { return (row[i]||'').padEnd(widths[i]); }).join(' | ') + ' |\\n';
+          if (ri === 0) out += '| ' + widths.map(function(w) { return '-'.repeat(w); }).join(' | ') + ' |\\n';
+        });
+        return out + '\\n';
+      case 'blockquote': return '\\n\\n> ' + kids().trim().replace(/\\n/g, '\\n> ') + '\\n\\n';
+      case 'dt': return '**' + kids().trim() + '**\\n';
+      case 'dd': return ': ' + kids().trim() + '\\n';
+      default: return kids();
+    }
+  }
+
+  var result = md(root);
+  return result.replace(/\\n{3,}/g, '\\n\\n').trim();
+});
+
 if (!text || text.trim().length < 100) {
   text = await page.innerText('body').catch(() => '');
 }
182
340
  "
183
341
  ```
184
342
 
185
- Adapt the URL and selector per query. The agent inlines the full script via `node -e` or `bun -e` so no temp files are needed for extractions under ~4000 tokens.
343
+ The agent inlines the full script via `node -e` or `bun -e` so no temp files are needed for extractions under ~4000 tokens.
186
344
 
187
345
  ---
188
346
 
@@ -250,7 +408,7 @@ If WebFetch also fails, return the URL with an explanation and continue the task
 ## Rules

 - Always run the install check before the first browser launch in a session.
-- Detect runtime with `which bun` first; use `node` if bun is absent.
+- Detect runtime with `which node` first; fall back to `bun` if node is absent.
 - Never navigate to Google or DuckDuckGo with Playwright — use WebSearch tool or direct URLs.
 - Truncate output at ~4000 tokens (~16 000 chars) to protect context budget.
 - On login wall or CAPTCHA, log the block, skip, and continue — never retry infinitely.
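The ~16 000-character cap in the rules above can be applied with a plain `head` filter; `truncate_output` is a hypothetical helper name used only for this sketch:

```shell
# Cap extracted text at ~16,000 characters (~4000 tokens) to protect context budget.
truncate_output() {
  head -c 16000
}

# 20,000 input chars are capped to 16,000.
chars=$(printf '%.0sx' $(seq 1 20000) | truncate_output | wc -c)
echo "$chars"
```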