deepflow 0.1.87 → 0.1.89
- package/bin/install.js +73 -7
- package/hooks/df-dashboard-push.js +170 -0
- package/hooks/df-execution-history.js +120 -0
- package/hooks/df-invariant-check.js +126 -0
- package/hooks/df-spec-lint.js +78 -4
- package/hooks/df-statusline.js +77 -5
- package/hooks/df-tool-usage-spike.js +41 -0
- package/hooks/df-tool-usage.js +86 -0
- package/hooks/df-worktree-guard.js +101 -0
- package/package.json +1 -1
- package/src/commands/df/auto-cycle.md +75 -558
- package/src/commands/df/auto.md +9 -48
- package/src/commands/df/consolidate.md +14 -38
- package/src/commands/df/dashboard.md +35 -0
- package/src/commands/df/debate.md +27 -156
- package/src/commands/df/discover.md +35 -181
- package/src/commands/df/execute.md +283 -563
- package/src/commands/df/note.md +37 -176
- package/src/commands/df/plan.md +80 -210
- package/src/commands/df/report.md +29 -184
- package/src/commands/df/resume.md +18 -101
- package/src/commands/df/spec.md +49 -145
- package/src/commands/df/verify.md +59 -606
- package/src/skills/browse-fetch/SKILL.md +32 -257
- package/src/skills/browse-verify/SKILL.md +40 -174
- package/src/skills/code-completeness/SKILL.md +2 -9
- package/src/skills/gap-discovery/SKILL.md +19 -86
- package/templates/config-template.yaml +10 -0
- package/templates/spec-template.md +12 -1
package/src/commands/df/note.md
CHANGED
|
@@ -7,206 +7,67 @@ description: Capture decisions that emerged during free conversations outside of
|
|
|
7
7
|
|
|
8
8
|
## Orchestrator Role
|
|
9
9
|
|
|
10
|
-
|
|
10
|
+
Scan conversation for candidate decisions, present for user confirmation, persist to `.deepflow/decisions.md`.
|
|
11
11
|
|
|
12
|
-
**NEVER:** Spawn agents, use Task tool, use Glob/Grep on source code, run git, use TaskOutput,
|
|
12
|
+
**NEVER:** Spawn agents, use Task tool, use Glob/Grep on source code, run git, use TaskOutput, EnterPlanMode, ExitPlanMode
|
|
13
13
|
|
|
14
|
-
**ONLY:** Read `.deepflow/decisions.md
|
|
15
|
-
|
|
16
|
-
---
|
|
17
|
-
|
|
18
|
-
## Purpose
|
|
19
|
-
|
|
20
|
-
Capture decisions that emerged during free conversations outside of deepflow commands. Surfaces candidate decisions from the current conversation, lets the user confirm or discard each, and persists confirmed ones to the shared decisions log.
|
|
21
|
-
|
|
22
|
-
## Usage
|
|
23
|
-
|
|
24
|
-
```
|
|
25
|
-
/df:note
|
|
26
|
-
```
|
|
27
|
-
|
|
28
|
-
No arguments required. Operates on the current conversation context.
|
|
29
|
-
|
|
30
|
-
---
|
|
14
|
+
**ONLY:** Read `.deepflow/decisions.md`, present candidates via `AskUserQuestion`, append confirmed decisions
|
|
31
15
|
|
|
32
16
|
## Behavior
|
|
33
17
|
|
|
34
18
|
### 1. EXTRACT CANDIDATES
|
|
35
19
|
|
|
36
|
-
Scan
|
|
20
|
+
Scan prior messages for resolved choices, adopted approaches, or stated assumptions. Look for:
|
|
21
|
+
- **Approaches chosen**: "we'll use X instead of Y"
|
|
22
|
+
- **Provisional choices**: "for now we'll use X"
|
|
23
|
+
- **Stated assumptions**: "assuming X is true"
|
|
24
|
+
- **Constraints accepted**: "X is out of scope"
|
|
25
|
+
- **Naming/structural choices**: "we'll call it X", "X goes in the Y layer"
|
|
37
26
|
|
|
38
|
-
|
|
39
|
-
- **Provisional choices**: "for now we'll use X", "assuming X until we know more"
|
|
40
|
-
- **Stated assumptions**: "assuming X is true", "treating X as given"
|
|
41
|
-
- **Constraints accepted**: "we won't do X", "X is out of scope"
|
|
42
|
-
- **Naming or structural choices**: "we'll call it X", "X goes in the Y layer"
|
|
27
|
+
Extract **at most 4 candidates**. For each, determine:
|
|
43
28
|
|
|
44
|
-
|
|
29
|
+
| Field | Value |
|
|
30
|
+
|-------|-------|
|
|
31
|
+
| Tag | `[APPROACH]` (deliberate choice), `[PROVISIONAL]` (revisit later), or `[ASSUMPTION]` (unvalidated) |
|
|
32
|
+
| Decision | One concise line describing the choice |
|
|
33
|
+
| Rationale | One sentence explaining why |
|
|
45
34
|
|
|
46
|
-
|
|
47
|
-
- **Tag**: one of `[APPROACH]`, `[PROVISIONAL]`, or `[ASSUMPTION]`
|
|
48
|
-
- `[APPROACH]` — a deliberate design or implementation choice
|
|
49
|
-
- `[PROVISIONAL]` — works for now, expected to revisit
|
|
50
|
-
- `[ASSUMPTION]` — treating something as true without full validation
|
|
51
|
-
- **Decision text**: one concise line describing the choice
|
|
52
|
-
- **Rationale**: one sentence explaining why this was chosen
|
|
53
|
-
|
|
54
|
-
If fewer than 2 clear candidates are found, say so briefly and exit without calling `AskUserQuestion`.
|
|
35
|
+
If <2 clear candidates found, say so and exit.
|
|
55
36
|
|
|
56
37
|
### 2. CHECK FOR CONTRADICTIONS
|
|
57
38
|
|
|
58
|
-
Read `.deepflow/decisions.md` if it exists.
|
|
59
|
-
|
|
60
|
-
If a contradiction is found:
|
|
61
|
-
- Keep the prior entry — never delete or modify it
|
|
62
|
-
- Amend the candidate's rationale to reference the prior decision: `was "X", now "Y" because Z`
|
|
39
|
+
Read `.deepflow/decisions.md` if it exists. If a candidate contradicts a prior entry: keep prior entry unchanged, amend candidate rationale to `was "X", now "Y" because Z`.
|
|
63
40
|
|
|
64
41
|
### 3. PRESENT VIA AskUserQuestion
|
|
65
42
|
|
|
66
|
-
|
|
67
|
-
|
|
68
|
-
```json
|
|
69
|
-
{
|
|
70
|
-
"questions": [
|
|
71
|
-
{
|
|
72
|
-
"question": "These decisions were detected in your conversation. Which should be saved to .deepflow/decisions.md?",
|
|
73
|
-
"header": "Save notes?",
|
|
74
|
-
"multiSelect": true,
|
|
75
|
-
"options": [
|
|
76
|
-
{
|
|
77
|
-
"label": "[APPROACH] <decision text>",
|
|
78
|
-
"description": "<rationale>"
|
|
79
|
-
},
|
|
80
|
-
{
|
|
81
|
-
"label": "[PROVISIONAL] <decision text>",
|
|
82
|
-
"description": "<rationale>"
|
|
83
|
-
}
|
|
84
|
-
]
|
|
85
|
-
}
|
|
86
|
-
]
|
|
87
|
-
}
|
|
88
|
-
```
|
|
89
|
-
|
|
90
|
-
Each option's `label` is the tag + decision text. Each `description` is the rationale (one sentence).
|
|
43
|
+
Single multi-select call. Each option: `label` = tag + decision text, `description` = rationale.
|
|
91
44
|
|
|
92
45
|
### 4. APPEND CONFIRMED DECISIONS
|
|
93
46
|
|
|
94
|
-
For each option
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
```
|
|
100
|
-
|
|
101
|
-
2. Append a new dated section using today's date in `YYYY-MM-DD` format and source `note`:
|
|
102
|
-
|
|
103
|
-
```markdown
|
|
104
|
-
### 2026-02-22 — note
|
|
105
|
-
- [APPROACH] Use event sourcing over CRUD — append-only log matches audit requirements
|
|
106
|
-
- [PROVISIONAL] Batch size = 50 — works for 4-game dataset, revisit at scale
|
|
107
|
-
```
|
|
108
|
-
|
|
109
|
-
3. If multiple decisions are confirmed in one invocation, group them under a single dated section.
|
|
110
|
-
|
|
111
|
-
4. Never modify or delete any prior entries.
|
|
47
|
+
For each selected option:
|
|
48
|
+
1. Create `.deepflow/decisions.md` with `# Decisions` header if absent
|
|
49
|
+
2. Append a dated section: `### YYYY-MM-DD — note`
|
|
50
|
+
3. Group all confirmed decisions under one section: `- [TAG] Decision text — rationale`
|
|
51
|
+
4. Never modify or delete prior entries
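The four append steps above can be sketched in Python as follows. This is a minimal illustration, not deepflow's implementation: the function name `append_decisions`, the `(tag, text, rationale)` tuple layout, and the `today` parameter are assumptions introduced here. Only the file header, section heading, and entry format come from the text.

```python
from datetime import date
from pathlib import Path


def append_decisions(path, decisions, today=None):
    """Append confirmed decisions under one dated section (hypothetical helper).

    `decisions` is a list of (tag, text, rationale) tuples, e.g.
    ("[APPROACH]", "Use Redis for session state", "avoids DB round-trips").
    """
    if not decisions:
        return "No decisions saved."
    path = Path(path)
    today = today or date.today()
    # Step 1: create the file with its header if absent.
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("# Decisions\n", encoding="utf-8")
    # Steps 2-3: one dated section that groups every confirmed decision.
    lines = [f"\n### {today.isoformat()} — note"]
    lines += [f"- {tag} {text} — {why}" for tag, text, why in decisions]
    # Step 4: append mode never rewrites earlier entries.
    with path.open("a", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")
    return f"Saved {len(decisions)} decision(s) to {path.as_posix()}"
```

Opening the file in append mode is what enforces the "never modify prior entries" rule here; nothing already in the file is read back or rewritten.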
|
|
112
52
|
|
|
113
53
|
### 5. CONFIRM
|
|
114
54
|
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
```
|
|
118
|
-
Saved N decision(s) to .deepflow/decisions.md
|
|
119
|
-
```
|
|
120
|
-
|
|
121
|
-
If the user selected nothing, respond:
|
|
122
|
-
|
|
123
|
-
```
|
|
124
|
-
No decisions saved.
|
|
125
|
-
```
|
|
126
|
-
|
|
127
|
-
---
|
|
128
|
-
|
|
129
|
-
## Decision Format
|
|
130
|
-
|
|
131
|
-
```
|
|
132
|
-
### YYYY-MM-DD — note
|
|
133
|
-
- [TAG] Decision text — rationale
|
|
134
|
-
```
|
|
55
|
+
Report: `Saved N decision(s) to .deepflow/decisions.md` or `No decisions saved.`
|
|
135
56
|
|
|
136
|
-
|
|
137
|
-
- `[APPROACH]` — deliberate design or implementation choice
|
|
138
|
-
- `[PROVISIONAL]` — works for now, will revisit at scale or with more information
|
|
139
|
-
- `[ASSUMPTION]` — treating something as true without full confirmation
|
|
140
|
-
- `[DEBT]` — needs revisiting; produced only by `/df:consolidate`, never manually assigned
|
|
57
|
+
## Decision Tags
|
|
141
58
|
|
|
142
|
-
|
|
143
|
-
|
|
144
|
-
|
|
59
|
+
| Tag | Meaning | Source |
|
|
60
|
+
|-----|---------|--------|
|
|
61
|
+
| `[APPROACH]` | Firm decision | /df:note, auto-extraction |
|
|
62
|
+
| `[PROVISIONAL]` | Revisit later | /df:note, auto-extraction |
|
|
63
|
+
| `[ASSUMPTION]` | Unverified | /df:note, auto-extraction |
|
|
64
|
+
| `[DEBT]` | Needs revisiting | /df:consolidate only, never manually assigned |
|
|
145
65
|
|
|
146
66
|
## Rules
|
|
147
67
|
|
|
148
|
-
-
|
|
149
|
-
-
|
|
150
|
-
-
|
|
151
|
-
-
|
|
152
|
-
-
|
|
153
|
-
-
|
|
154
|
-
- **Date is today's date** — use the current date in `YYYY-MM-DD` format
|
|
155
|
-
- **One AskUserQuestion call** — present all candidates in a single call, not sequentially
|
|
156
|
-
|
|
157
|
-
---
|
|
158
|
-
|
|
159
|
-
## Example
|
|
160
|
-
|
|
161
|
-
```
|
|
162
|
-
USER: /df:note
|
|
163
|
-
|
|
164
|
-
CLAUDE: Let me scan our conversation for candidate decisions.
|
|
165
|
-
|
|
166
|
-
[Reads .deepflow/decisions.md — finds prior entry:
|
|
167
|
-
"### 2026-02-10 — spec
|
|
168
|
-
- [APPROACH] Use PostgreSQL for primary storage — team already knows it"]
|
|
169
|
-
|
|
170
|
-
[Extracts from conversation:
|
|
171
|
-
1. [APPROACH] Use Redis for session state — avoids DB round-trips on hot path
|
|
172
|
-
2. [PROVISIONAL] Skip auth for internal APIs — only 2 internal services for now
|
|
173
|
-
3. [ASSUMPTION] Peak load is 500 req/s — based on current traffic, unvalidated
|
|
174
|
-
]
|
|
175
|
-
|
|
176
|
-
[No contradictions found with prior entries]
|
|
177
|
-
|
|
178
|
-
[Uses AskUserQuestion]
|
|
179
|
-
{
|
|
180
|
-
"questions": [
|
|
181
|
-
{
|
|
182
|
-
"question": "These decisions were detected in your conversation. Which should be saved to .deepflow/decisions.md?",
|
|
183
|
-
"header": "Save notes?",
|
|
184
|
-
"multiSelect": true,
|
|
185
|
-
"options": [
|
|
186
|
-
{
|
|
187
|
-
"label": "[APPROACH] Use Redis for session state",
|
|
188
|
-
"description": "Avoids DB round-trips on hot path"
|
|
189
|
-
},
|
|
190
|
-
{
|
|
191
|
-
"label": "[PROVISIONAL] Skip auth for internal APIs",
|
|
192
|
-
"description": "Only 2 internal services for now, revisit when we open to more consumers"
|
|
193
|
-
},
|
|
194
|
-
{
|
|
195
|
-
"label": "[ASSUMPTION] Peak load is 500 req/s",
|
|
196
|
-
"description": "Based on current traffic patterns, not load-tested"
|
|
197
|
-
}
|
|
198
|
-
]
|
|
199
|
-
}
|
|
200
|
-
]
|
|
201
|
-
}
|
|
202
|
-
|
|
203
|
-
USER: [Selects: Use Redis for session state, Peak load is 500 req/s]
|
|
204
|
-
|
|
205
|
-
CLAUDE: [Appends to .deepflow/decisions.md:]
|
|
206
|
-
|
|
207
|
-
### 2026-02-22 — note
|
|
208
|
-
- [APPROACH] Use Redis for session state — avoids DB round-trips on hot path
|
|
209
|
-
- [ASSUMPTION] Peak load is 500 req/s — based on current traffic patterns, not load-tested
|
|
210
|
-
|
|
211
|
-
Saved 2 decision(s) to .deepflow/decisions.md
|
|
212
|
-
```
|
|
68
|
+
- Max 4 candidates per invocation (AskUserQuestion tool limit)
|
|
69
|
+
- multiSelect: true — user confirms any subset
|
|
70
|
+
- Never invent decisions — only extract what was discussed and resolved
|
|
71
|
+
- Never modify prior entries in `.deepflow/decisions.md`
|
|
72
|
+
- Source is always `note`; date is today (YYYY-MM-DD)
|
|
73
|
+
- One AskUserQuestion call — all candidates in a single call
|
package/src/commands/df/plan.md
CHANGED
|
@@ -5,7 +5,6 @@ description: Compare specs against codebase and past experiments, generate prior
|
|
|
5
5
|
|
|
6
6
|
# /df:plan — Generate Task Plan from Specs
|
|
7
7
|
|
|
8
|
-
## Purpose
|
|
9
8
|
Compare specs against codebase and past experiments. Generate prioritized tasks.
|
|
10
9
|
|
|
11
10
|
**NEVER:** use EnterPlanMode, use ExitPlanMode — this command IS the planning phase
|
|
@@ -37,22 +36,35 @@ Load: specs/*.md (exclude doing-*/done-*), PLAN.md (if exists), .deepflow/config
|
|
|
37
36
|
Determine source_dir from config or default to src/
|
|
38
37
|
```
|
|
39
38
|
|
|
40
|
-
Shell injection
|
|
39
|
+
Shell injection:
|
|
41
40
|
- `` !`ls specs/*.md 2>/dev/null || echo 'NOT_FOUND'` ``
|
|
42
41
|
- `` !`cat PLAN.md 2>/dev/null || echo 'NOT_FOUND'` ``
|
|
43
42
|
|
|
44
|
-
Run `validateSpec` on each spec. Hard failures → skip + error. Advisory → include
|
|
43
|
+
Run `validateSpec` on each spec. Hard failures → skip + error. Advisory → include.
|
|
44
|
+
Record each spec's computed layer (gates task generation per §1.5).
|
|
45
45
|
No new specs → report counts, suggest `/df:execute`.
|
|
46
46
|
|
|
47
|
-
###
|
|
47
|
+
### 1.5. LAYER-GATED TASK GENERATION
|
|
48
48
|
|
|
49
|
-
|
|
49
|
+
| Layer | Sections present | Allowed task types |
|
|
50
|
+
|-------|------------------|--------------------|
|
|
51
|
+
| L0 | Objective | Spikes only |
|
|
52
|
+
| L1 | + Requirements | Spikes only (better targeted) |
|
|
53
|
+
| L2 | + Acceptance Criteria | Spikes + Implementation |
|
|
54
|
+
| L3 | + Constraints, Out of Scope, Technical Notes | Spikes + Implementation + Impact analysis + Optimize |
|
|
50
55
|
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
56
|
+
**Rules:**
|
|
57
|
+
- L0–L1: ONLY spike tasks. Implementation blocked until spec deepens to L2+.
|
|
58
|
+
- L2: spikes + implementation, skip impact analysis.
|
|
59
|
+
- L3: full planning — spikes, implementation, impact analysis, optimize.
|
|
60
|
+
- Spike results deepen specs: findings incorporated back via user or `/df:spec`, raising layer.
|
|
61
|
+
- Report layer: `"Spec {name}: L{N} ({label}) — {task_types_generated}"`
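As a rough sketch of the gating table and rules above, the layer can be derived from which spec sections are present. How the package actually computes a layer is not shown in this diff, so everything below is an assumption: the function names, the exact section-name matching, the rule that a layer counts only when all lower layers are also satisfied, and the requirement that all three L3 sections be present.

```python
# Sections that each layer adds, read off the table above.
# Requiring every listed L3 section is an assumption.
LAYER_SECTIONS = [
    ["Objective"],
    ["Requirements"],
    ["Acceptance Criteria"],
    ["Constraints", "Out of Scope", "Technical Notes"],
]

# Task types per layer; the identifiers are illustrative labels.
ALLOWED_TASKS = {
    0: ["spike"],
    1: ["spike"],
    2: ["spike", "implementation"],
    3: ["spike", "implementation", "impact-analysis", "optimize"],
}


def compute_layer(sections):
    """Return the highest layer N such that layers 0..N are all satisfied, else None."""
    present = {s.strip().lower() for s in sections}
    layer = None
    for n, required in enumerate(LAYER_SECTIONS):
        if not all(r.lower() in present for r in required):
            break  # a missing lower layer caps the result
        layer = n
    return layer


def allowed_task_types(sections):
    """Task types the planner may generate for a spec with these sections."""
    layer = compute_layer(sections)
    return [] if layer is None else ALLOWED_TASKS[layer]
```

Under these assumptions, a spec with an Objective and Acceptance Criteria but no Requirements stays at L0, so only spikes are generated until the gap is filled.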
|
|
62
|
+
|
|
63
|
+
### 2. CHECK PAST EXPERIMENTS (SPIKE-FIRST)
|
|
64
|
+
|
|
65
|
+
**CRITICAL**: Check experiments BEFORE generating tasks.
|
|
54
66
|
|
|
55
|
-
File naming: `{topic}--{hypothesis}--{status}.md`
|
|
67
|
+
Glob `.deepflow/experiments/{topic}--*`. File naming: `{topic}--{hypothesis}--{status}.md`
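The naming convention can be parsed with a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, the status set (active, passed, failed) is taken from the hunk header for this section, and topic and hypothesis are assumed not to contain `--` themselves.

```python
from pathlib import PurePosixPath

STATUSES = {"active", "passed", "failed"}


def parse_experiment(path):
    """Split '{topic}--{hypothesis}--{status}.md' into its parts, or return None."""
    p = PurePosixPath(path)
    if p.suffix != ".md":
        return None
    parts = p.stem.split("--")
    # Exactly three non-empty parts, with a recognised status last.
    if len(parts) != 3 or not all(parts) or parts[2] not in STATUSES:
        return None
    topic, hypothesis, status = parts
    return {"topic": topic, "hypothesis": hypothesis, "status": status}


def experiments_for(topic, paths):
    """Mimic the '{topic}--*' glob over a list of paths and parse each hit."""
    hits = (parse_experiment(p) for p in paths)
    return [h for h in hits if h and h["topic"] == topic]
```

An empty result from `experiments_for` corresponds to the "No matches" row of the table that follows.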
|
|
56
68
|
|
|
57
69
|
| Result | Action |
|
|
58
70
|
|--------|--------|
|
|
@@ -61,140 +73,79 @@ File naming: `{topic}--{hypothesis}--{status}.md` (active/passed/failed)
|
|
|
61
73
|
| `--active.md` | Wait for completion |
|
|
62
74
|
| No matches | New topic, generate initial spike |
|
|
63
75
|
|
|
64
|
-
|
|
76
|
+
Implementation tasks BLOCKED until spike validates.
|
|
65
77
|
|
|
66
78
|
### 3. DETECT PROJECT CONTEXT
|
|
67
79
|
|
|
68
80
|
Identify code style, patterns (error handling, API structure), integration points. Include in task descriptions.
|
|
69
81
|
|
|
70
|
-
### 4. IMPACT ANALYSIS (
|
|
82
|
+
### 4. IMPACT ANALYSIS (L3 specs only)
|
|
71
83
|
|
|
72
|
-
For each file in a task's
|
|
84
|
+
Skip for L0–L2 specs. For each file in a task's `Files:` list, find blast radius.
|
|
73
85
|
|
|
74
|
-
**Search
|
|
86
|
+
**Search (prefer LSP, fallback grep):**
|
|
87
|
+
1. **Callers:** LSP `findReferences`/`incomingCalls` on exports being changed. Annotate WHY impacted. Fallback: grep.
|
|
88
|
+
2. **Duplicates:** Similar logic files. Classify: `[active]` → consolidate, `[dead]` → DELETE.
|
|
89
|
+
3. **Data flow:** LSP `outgoingCalls` to trace consumers.
|
|
75
90
|
|
|
76
|
-
|
|
77
|
-
2. **Duplicates:** Files with similar logic (same function name, same transformation). Classify:
|
|
78
|
-
- `[active]` — used in production → must consolidate
|
|
79
|
-
- `[dead]` — bypassed/unreachable → must delete
|
|
80
|
-
3. **Data flow:** If file produces/transforms data, use LSP `outgoingCalls` to trace consumers. Fallback: grep across languages
|
|
81
|
-
|
|
82
|
-
**Embed as `Impact:` block in each task:**
|
|
83
|
-
```markdown
|
|
84
|
-
- [ ] **T2**: Add new features to YAML export
|
|
85
|
-
- Files: src/utils/buildConfigData.ts
|
|
86
|
-
- Impact:
|
|
87
|
-
- Callers: src/routes/index.ts:12, src/api/handler.ts:45
|
|
88
|
-
- Duplicates:
|
|
89
|
-
- src/components/YamlViewer.tsx:19 (own generateYAML) [active — consolidate]
|
|
90
|
-
- backend/yaml_gen.go (generateYAMLFromConfig) [dead — DELETE]
|
|
91
|
-
- Data flow: buildConfigData → YamlViewer, SimControls, RoleplayPage
|
|
92
|
-
- Blocked by: T1
|
|
93
|
-
```
|
|
94
|
-
|
|
95
|
-
Files outside original "Files:" → add with `(impact — verify/update)`.
|
|
96
|
-
Skip for spike tasks.
|
|
91
|
+
Embed as `Impact:` block in each task. Files outside original `Files:` → add with `(impact — verify/update)`. Skip for spikes.
|
|
97
92
|
|
|
98
93
|
### 4.5. TARGETED EXPLORATION
|
|
99
94
|
|
|
100
|
-
Follow `templates/explore-agent.md` for spawn rules
|
|
95
|
+
Follow `templates/explore-agent.md` for spawn rules. 3-5 agents cover post-LSP gaps: conventions, dead code, implicit patterns.
|
|
101
96
|
|
|
102
|
-
|
|
103
|
-
|--------------|--------|
|
|
104
|
-
| Post-LSP gaps | 3-5 |
|
|
105
|
-
|
|
106
|
-
Use `code-completeness` skill to search for: implementations matching spec requirements, TODOs/FIXMEs/HACKs, stubs, skipped tests.
|
|
97
|
+
Use `code-completeness` skill: implementations matching spec, TODOs/FIXMEs/HACKs, stubs, skipped tests.
|
|
107
98
|
|
|
108
99
|
### 4.6. CROSS-TASK FILE CONFLICT DETECTION
|
|
109
100
|
|
|
110
|
-
After all tasks have
|
|
111
|
-
|
|
112
|
-
**Algorithm:**
|
|
113
|
-
1. Build a map: `file → [task IDs that list it]`
|
|
114
|
-
2. For each file with >1 task: add `Blocked by` edge from later task → earlier task (by task number)
|
|
115
|
-
3. If a dependency already exists (direct or transitive), skip (no redundant edges)
|
|
101
|
+
After all tasks have `Files:` lists, detect overlaps requiring sequential execution.
|
|
116
102
|
|
|
117
|
-
|
|
118
|
-
|
|
119
|
-
|
|
120
|
-
T3: Files: config.go — Blocked by: none
|
|
121
|
-
T5: Files: config.go — Blocked by: none
|
|
122
|
-
```
|
|
123
|
-
After conflict detection:
|
|
124
|
-
```
|
|
125
|
-
T1: Blocked by: none
|
|
126
|
-
T3: Blocked by: T1 (file conflict: config.go)
|
|
127
|
-
T5: Blocked by: T3 (file conflict: config.go)
|
|
128
|
-
```
|
|
103
|
+
1. Build map: `file → [task IDs]`
|
|
104
|
+
2. For files with >1 task: add `Blocked by` from later → earlier task
|
|
105
|
+
3. Skip if dependency already exists (direct or transitive)
|
|
129
106
|
|
|
130
|
-
**Rules:**
|
|
131
|
-
- Only add the minimum edges needed (chain, not full mesh — T5 blocks on T3, not T1+T3)
|
|
132
|
-
- Append `(file conflict: {filename})` to the Blocked by reason for traceability
|
|
133
|
-
- If a logical dependency already covers the ordering, don't add a redundant conflict edge
|
|
134
|
-
- Cross-spec conflicts: tasks from different specs sharing files get the same treatment
|
|
107
|
+
**Rules:** Chain only (T5→T3, not T5→T1+T3). Append `(file conflict: {filename})`. Logical deps override conflict edges. Cross-spec conflicts get same treatment.
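The three steps and the chain rule can be sketched as below. The data shapes (`files_by_task` as a dict of task ID to file list, `deps` as a dict of task ID to a set of blocker IDs) and all function names are assumptions made for illustration; task IDs are assumed to be `T` followed by digits.

```python
import re
from collections import defaultdict


def task_num(task_id):
    """'T12' -> 12, so tasks sort by number rather than as strings."""
    return int(re.sub(r"\D", "", task_id))


def depends_on(deps, task, target):
    """True if `task` already depends on `target`, directly or transitively."""
    stack, seen = list(deps.get(task, ())), set()
    while stack:
        cur = stack.pop()
        if cur == target:
            return True
        if cur not in seen:
            seen.add(cur)
            stack.extend(deps.get(cur, ()))
    return False


def add_conflict_edges(files_by_task, deps):
    """Add file-conflict edges to `deps` in place; return {task: [(blocker, reason)]}."""
    # Step 1: build the map file -> [task IDs].
    tasks_by_file = defaultdict(list)
    for task, files in files_by_task.items():
        for f in set(files):  # ignore a file listed twice by one task
            tasks_by_file[f].append(task)

    added = defaultdict(list)
    for f in sorted(tasks_by_file):
        chain = sorted(tasks_by_file[f], key=task_num)
        # Step 2, chain only: each task blocks on its immediate predecessor.
        for earlier, later in zip(chain, chain[1:]):
            # Step 3: an existing dependency already orders the pair.
            if depends_on(deps, later, earlier):
                continue
            deps.setdefault(later, set()).add(earlier)
            added[later].append((earlier, f"file conflict: {f}"))
    return dict(added)
```

With T1, T3, and T5 all listing `config.go` and no prior dependencies, this yields T3 blocked by T1 and T5 blocked by T3, not T5 blocked by both.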
|
|
135
108
|
|
|
136
109
|
### 5. COMPARE & PRIORITIZE
|
|
137
110
|
|
|
138
|
-
Spawn `Task(subagent_type="reasoner", model="opus")`. Map each requirement to DONE
|
|
111
|
+
Spawn `Task(subagent_type="reasoner", model="opus")`. Map each requirement to DONE/PARTIAL/MISSING/CONFLICT. Check REQ-AC alignment. Flag spec gaps.
|
|
139
112
|
|
|
140
113
|
Priority: Dependencies → Impact → Risk
|
|
141
114
|
|
|
142
115
|
#### Metric AC Detection
|
|
143
116
|
|
|
144
|
-
|
|
117
|
+
Scan ACs for pattern `{metric} {operator} {number}[unit]` (e.g., `coverage > 85%`, `latency < 200ms`). Operators: `>`, `<`, `>=`, `<=`, `==`.
|
|
145
118
|
|
|
146
|
-
- **
|
|
147
|
-
- **
|
|
148
|
-
- **
|
|
149
|
-
- **On match**: flag the AC as a **metric AC** and generate an `Optimize:` task (see section 6.5)
|
|
150
|
-
- **Non-match**: treat as standard functional AC → standard implementation task
|
|
151
|
-
- **Ambiguous ACs** (qualitative terms like "fast", "small", "improved"): flag as spec gap, request numeric threshold before planning
|
|
119
|
+
- **Match:** flag as metric AC → generate `Optimize:` task (§6.5)
|
|
120
|
+
- **Non-match:** standard implementation task
|
|
121
|
+
- **Ambiguous** ("fast", "small"): flag as spec gap, request numeric threshold
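One way to implement the pattern check is a single anchored regular expression. The sketch below is illustrative only: the function name, the result dictionary, the choice to require the whole AC to match the pattern, and the qualitative word list (seeded with the two examples from the text) are assumptions. The direction mapping follows the field rules given for optimize tasks in this same file.

```python
import re

# {metric} {operator} {number}[unit]; two-character operators come first in the
# alternation so ">=" is not read as ">" followed by a stray "=".
METRIC_AC = re.compile(
    r"^\s*(?P<metric>.+?)\s*(?P<op>>=|<=|==|>|<)\s*"
    r"(?P<number>-?\d+(?:\.\d+)?)\s*(?P<unit>[A-Za-z%]*)\s*$"
)

# higher for > and >=, lower for < and <=, higher by convention for ==.
DIRECTION = {">": "higher", ">=": "higher", "<": "lower", "<=": "lower", "==": "higher"}

# Example qualitative terms from the text; a real list would be longer.
QUALITATIVE = {"fast", "small"}


def classify_ac(text):
    """Classify an acceptance criterion as metric, ambiguous, or standard."""
    m = METRIC_AC.match(text)
    if m:
        op = m["op"]
        return {
            "kind": "metric",
            "metric": m["metric"],
            "operator": op,
            "target": float(m["number"]),  # unit is kept separately
            "unit": m["unit"],
            "direction": DIRECTION[op],
        }
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & QUALITATIVE:
        return {"kind": "ambiguous"}  # spec gap: ask for a numeric threshold
    return {"kind": "standard"}
```

Because the pattern is anchored at both ends, an AC with trailing words after the unit is not treated as a metric AC in this sketch; a looser match would be a reasonable alternative.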
|
|
152
122
|
|
|
153
123
|
### 5.5. CLASSIFY MODEL + EFFORT PER TASK
|
|
154
124
|
|
|
155
|
-
For each task, assign `Model:` and `Effort:` based on the routing matrix:
|
|
156
|
-
|
|
157
125
|
#### Routing matrix
|
|
158
126
|
|
|
159
|
-
| Task type | Model | Effort |
|
|
160
|
-
|
|
161
|
-
| Bootstrap (scaffold, config, rename) | `haiku` | `low` |
|
|
162
|
-
| browse-fetch (doc retrieval) | `haiku` | `low` |
|
|
163
|
-
| Single-file simple addition | `haiku` | `high` |
|
|
164
|
-
| Multi-file with clear specs | `sonnet` | `medium` |
|
|
165
|
-
| Bug fix (clear repro) | `sonnet` | `medium` |
|
|
166
|
-
| Bug fix (unclear cause) | `sonnet` | `high` |
|
|
167
|
-
| Spike / validation | `sonnet` | `high` |
|
|
168
|
-
| Optimize (metric AC) | `opus` | `high` |
|
|
169
|
-
| Feature work (well-specced) | `sonnet` | `medium` |
|
|
170
|
-
| Feature work (ambiguous ACs) | `opus` | `medium` |
|
|
171
|
-
| Refactor (>5 files, many callers) | `opus` | `medium` |
|
|
172
|
-
| Architecture change | `opus` | `high` |
|
|
173
|
-
| Unfamiliar API integration | `opus` | `high` |
|
|
174
|
-
| Retried after revert | _(raise one level)_ | `high` |
|
|
175
|
-
|
|
176
|
-
|
|
177
|
-
|
|
178
|
-
1. **File count** — 1 file → haiku/sonnet, 2-5 → sonnet, >5 → sonnet/opus
|
|
179
|
-
2. **Impact blast radius** — many callers/duplicates → raise model
|
|
180
|
-
3. **Spec clarity** — clear ACs → lower effort, ambiguous → raise effort
|
|
181
|
-
4. **Type** — spikes → `sonnet high`, bootstrap → `haiku low`
|
|
182
|
-
5. **Has prior failures** — raise model one level AND set effort to `high`
|
|
183
|
-
6. **Repetitiveness** — repetitive pattern across files → lower effort even at higher model
|
|
184
|
-
|
|
185
|
-
#### Effort economics
|
|
186
|
-
|
|
187
|
-
Effort controls ALL token spend (text, tool calls, thinking). Lower effort = fewer tool calls, less preamble, shorter reasoning.
|
|
188
|
-
|
|
189
|
-
- `low` → ~60-70% token reduction vs high. Use when task is mechanical.
|
|
190
|
-
- `medium` → ~30-40% token reduction. Use when specs are clear.
|
|
191
|
-
- `high` → full spend (default). Use when ambiguity or risk is high.
|
|
192
|
-
|
|
193
|
-
Add `Model: haiku|sonnet|opus` and `Effort: low|medium|high` to each task block. Defaults: `Model: sonnet`, `Effort: medium`.
|
|
127
|
+
| Task type | Model | Effort |
|
|
128
|
+
|-----------|-------|--------|
|
|
129
|
+
| Bootstrap (scaffold, config, rename) | `haiku` | `low` |
|
|
130
|
+
| browse-fetch (doc retrieval) | `haiku` | `low` |
|
|
131
|
+
| Single-file simple addition | `haiku` | `high` |
|
|
132
|
+
| Multi-file with clear specs | `sonnet` | `medium` |
|
|
133
|
+
| Bug fix (clear repro) | `sonnet` | `medium` |
|
|
134
|
+
| Bug fix (unclear cause) | `sonnet` | `high` |
|
|
135
|
+
| Spike / validation | `sonnet` | `high` |
|
|
136
|
+
| Optimize (metric AC) | `opus` | `high` |
|
|
137
|
+
| Feature work (well-specced) | `sonnet` | `medium` |
|
|
138
|
+
| Feature work (ambiguous ACs) | `opus` | `medium` |
|
|
139
|
+
| Refactor (>5 files, many callers) | `opus` | `medium` |
|
|
140
|
+
| Architecture change | `opus` | `high` |
|
|
141
|
+
| Unfamiliar API integration | `opus` | `high` |
|
|
142
|
+
| Retried after revert | _(raise one level)_ | `high` |
|
|
143
|
+
|
|
144
|
+
Add `Model:` and `Effort:` to each task. Defaults: `sonnet` / `medium`.
|
|
194
145
|
|
|
195
146
|
### 6. GENERATE SPIKE TASKS (IF NEEDED)
|
|
196
147
|
|
|
197
|
-
**
|
|
148
|
+
**Format:**
|
|
198
149
|
```markdown
|
|
199
150
|
- [ ] **T1** [SPIKE]: Validate {hypothesis}
|
|
200
151
|
- Type: spike
|
|
@@ -206,12 +157,10 @@ Add `Model: haiku|sonnet|opus` and `Effort: low|medium|high` to each task block.
|
|
|
206
157
|
- Blocked by: none
|
|
207
158
|
```
|
|
208
159
|
|
|
209
|
-
All implementation tasks MUST `Blocked by: T{spike}`. Spike fails → `--failed.md`, no implementation
|
|
160
|
+
All implementation tasks MUST `Blocked by: T{spike}`. Spike fails → `--failed.md`, no implementation.
|
|
210
161
|
|
|
211
162
|
#### Probe Diversity
|
|
212
163
|
|
|
213
|
-
When generating multiple spikes for the same problem:
|
|
214
|
-
|
|
215
164
|
| Requirement | Rule |
|
|
216
165
|
|-------------|------|
|
|
217
166
|
| Contradictory | ≥2 probes with opposing approaches |
|
|
@@ -221,38 +170,15 @@ When generating multiple spikes for the same problem:
|
|
|
221
170
|
|
|
222
171
|
Before output, verify: ≥2 opposing probes, ≥1 naive, all independent.
|
|
223
172
|
|
|
224
|
-
**Example — caching problem, 3 diverse probes:**
|
|
225
|
-
```markdown
|
|
226
|
-
- [ ] **T1** [SPIKE]: Validate in-memory LRU cache
|
|
227
|
-
- Role: Contradictory-A (in-process)
|
|
228
|
-
- Hypothesis: In-memory LRU reduces DB queries by ≥80%
|
|
229
|
-
- Method: LRU with 1000-item cap, load test
|
|
230
|
-
- Success criteria: DB queries drop ≥80% under 100 concurrent users
|
|
231
|
-
|
|
232
|
-
- [ ] **T2** [SPIKE]: Validate Redis distributed cache
|
|
233
|
-
- Role: Contradictory-B (external, opposing T1)
|
|
234
|
-
- Hypothesis: Redis scales across multiple instances
|
|
235
|
-
- Method: Redis client, cache top 10 queries, same load test
|
|
236
|
-
- Success criteria: DB queries drop ≥80%, works across 2 instances
|
|
237
|
-
|
|
238
|
-
- [ ] **T3** [SPIKE]: Validate query optimization without cache
|
|
239
|
-
- Role: Naive (no prior justification — tests if caching is even necessary)
|
|
240
|
-
- Hypothesis: Indexes + query batching alone may suffice
|
|
241
|
-
- Method: Add indexes, batch N+1 queries, same load test — no cache
|
|
242
|
-
- Success criteria: DB queries drop ≥80% with zero cache infrastructure
|
|
243
|
-
```
|
|
244
|
-
|
|
245
173
|
### 6.5. GENERATE OPTIMIZE TASKS (FROM METRIC ACs)
|
|
246
174
|
|
|
247
|
-
|
|
248
|
-
|
|
249
|
-
**Optimize Task Format:**
|
|
175
|
+
**Format:**
|
|
250
176
|
```markdown
|
|
251
177
|
- [ ] **T{n}** [OPTIMIZE]: Improve {metric_name} to {target}
|
|
252
178
|
- Type: optimize
|
|
253
|
-
- Files: {primary files
|
|
179
|
+
- Files: {primary files affecting metric}
|
|
254
180
|
- Optimize:
|
|
255
|
-
metric: "{shell command
|
|
181
|
+
metric: "{shell command outputting single number}"
|
|
256
182
|
target: {number}
|
|
257
183
|
direction: higher|lower
|
|
258
184
|
max_cycles: {number, default 20}
|
|
@@ -262,95 +188,39 @@ For each metric AC detected in section 5, generate an `Optimize:` task using thi
|
|
|
262
188
|
regression_threshold: 5%
|
|
263
189
|
- Model: opus
|
|
264
190
|
- Effort: high
|
|
265
|
-
- Blocked by: {spike
|
|
191
|
+
- Blocked by: {spike if applicable, else none}
|
|
266
192
|
```
|
|
267
193
|
|
|
268
|
-
**Field rules:**
|
|
269
|
-
- `metric`: a shell command returning a single scalar float/integer (e.g., `npx jest --coverage --json | jq '.coverageMap | .. | .pct? | numbers' | awk '{sum+=$1;n++} END{print sum/n}'`). Must be deterministic and side-effect free.
|
|
270
|
-
- `target`: the numeric threshold extracted from the AC (strip unit suffix for the value; note unit in task description)
|
|
271
|
-
- `direction`: `higher` if operator is `>` or `>=`; `lower` if `<` or `<=`; `higher` by convention for `==`
|
|
272
|
-
- `max_cycles`: from spec if stated; default 20
|
|
273
|
-
- `secondary_metrics`: other metrics from the same spec that could regress (e.g., build time, bundle size, test count). Omit if none.
|
|
274
|
-
|
|
275
|
-
**Model/Effort**: always `opus` / `high` (see routing matrix).
|
|
276
|
-
|
|
277
|
-
**Blocking**: if a spike exists for the same area, block the optimize task on the spike passing.
|
|
194
|
+
**Field rules:** `metric` must be deterministic, side-effect free, return single scalar. `direction`: higher for `>`/`>=`, lower for `<`/`<=`, higher for `==`. `max_cycles`: from spec or default 20. Always `opus`/`high`. Block on spike if one exists.
|
|
278
195
|
|
|
279
196
|
### 7. VALIDATE HYPOTHESES
|
|
280
197
|
|
|
281
|
-
Unfamiliar APIs or performance-critical → prototype in scratchpad. Fails →
|
|
198
|
+
Unfamiliar APIs or performance-critical → prototype in scratchpad. Fails → `--failed.md`. Skip for known patterns.
|
|
282
199
|
|
|
283
200
|
### 8. CLEANUP PLAN.md
|
|
284
201
|
|
|
285
|
-
Prune stale
|
|
202
|
+
Prune stale `done-*` sections and orphaned headers. Recalculate Summary. Empty → recreate fresh.
|
|
286
203
|
|
|
287
204
|
### 9. OUTPUT & RENAME
|
|
288
205
|
|
|
289
206
|
Append tasks grouped by `### doing-{spec-name}`. Rename `specs/feature.md` → `specs/doing-feature.md`.
|
|
290
207
|
|
|
291
|
-
Report:
|
|
208
|
+
Report:
|
|
209
|
+
```
|
|
210
|
+
✓ Plan generated — {n} specs, {n} tasks. Run /df:execute
|
|
211
|
+
|
|
212
|
+
Spec layers:
|
|
213
|
+
{name}: L{N} ({label}) — {n} spikes{, {n} impl tasks if L2+}
|
|
214
|
+
```
|
|
215
|
+
|
|
216
|
+
If any L0–L1 spec: `ℹ L0–L1 specs generate spikes only. Deepen with /df:spec {name} to unlock implementation.`
|
|
292
217
|
|
|
293
218
|
## Rules
|
|
219
|
+
- **Layer-gated** — L0–L1 → spikes only; L2+ → implementation; L3 → full planning
|
|
294
220
|
- **Spike-first** — No `--passed.md` → spike before implementation
|
|
295
|
-
- **Block on spike** — Implementation
|
|
221
|
+
- **Block on spike** — Implementation blocked until spike validates
|
|
296
222
|
- **Learn from failures** — Extract next hypothesis, never repeat approach
|
|
297
223
|
- **Plan only** — Do NOT implement (except quick validation prototypes)
|
|
298
224
|
- **One task = one logical unit** — Atomic, committable
|
|
299
225
|
- Prefer existing utilities over new code; flag spec gaps
|
|
300
|
-
|
|
301
|
-
## Agent Scaling
|
|
302
|
-
|
|
303
|
-
| Agent | Model | Base | Scale |
|
|
304
|
-
|-------|-------|------|-------|
|
|
305
|
-
| Explore | haiku | 3-5 | none |
|
|
306
|
-
| Reasoner | opus | 5 | +1 per 2 specs |
|
|
307
|
-
|
|
308
|
-
Always use `Task` tool with explicit `subagent_type` and `model`.
|
|
309
|
-
|
|
310
|
-
## Example
|
|
311
|
-
|
|
312
|
-
```markdown
|
|
313
|
-
### doing-upload
|
|
314
|
-
|
|
315
|
-
- [ ] **T1** [SPIKE]: Validate streaming upload approach
|
|
316
|
-
- Type: spike
|
|
317
|
-
- Hypothesis: Streaming uploads handle >1GB without memory issues
|
|
318
|
-
- Success criteria: Memory <500MB during 2GB upload
|
|
319
|
-
- Files: .deepflow/experiments/upload--streaming--active.md
|
|
320
|
-
- Blocked by: none
|
|
321
|
-
|
|
322
|
-
- [ ] **T2**: Create upload endpoint
|
|
323
|
-
- Files: src/api/upload.ts
|
|
324
|
-
- Model: sonnet
|
|
325
|
-
- Impact:
|
|
326
|
-
- Callers: src/routes/index.ts:5
|
|
327
|
-
- Duplicates: backend/legacy-upload.go [dead — DELETE]
|
|
328
|
-
- Blocked by: T1
|
|
329
|
-
|
|
330
|
-
- [ ] **T3**: Add S3 service with streaming
|
|
331
|
-
- Files: src/services/storage.ts
|
|
332
|
-
- Model: opus
|
|
333
|
-
- Blocked by: T1, T2
|
|
334
|
-
```
|
|
335
|
-
|
|
336
|
-
**Optimize task example** (from spec AC: `coverage > 85%`):
|
|
337
|
-
|
|
338
|
-
```markdown
|
|
339
|
-
### doing-quality
|
|
340
|
-
|
|
341
|
-
- [ ] **T1** [OPTIMIZE]: Improve test coverage to >85%
|
|
342
|
-
- Type: optimize
|
|
343
|
-
- Files: src/
|
|
344
|
-
- Optimize:
|
|
345
|
-
metric: "npx jest --coverage --json 2>/dev/null | jq '[.. | .pct? | numbers] | add / length'"
|
|
346
|
-
target: 85
|
|
347
|
-
direction: higher
|
|
348
|
-
max_cycles: 20
|
|
349
|
-
secondary_metrics:
|
|
350
|
-
- metric: "npx jest --json 2>/dev/null | jq '.testResults | length'"
|
|
351
|
-
name: test_count
|
|
352
|
-
regression_threshold: 5%
|
|
353
|
-
- Model: opus
|
|
354
|
-
- Effort: high
|
|
355
|
-
- Blocked by: none
|
|
356
|
-
```
|
|
226
|
+
- Always use `Task` tool with explicit `subagent_type` and `model`
|