deepflow 0.1.44 → 0.1.46
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/install.js +2 -2
- package/package.json +1 -1
- package/src/commands/df/debate.md +283 -0
- package/src/commands/df/discover.md +182 -0
- package/src/commands/df/execute.md +124 -16
- package/src/commands/df/spec.md +2 -0
- package/src/commands/df/verify.md +9 -2
package/bin/install.js
CHANGED

@@ -146,7 +146,7 @@ async function main() {
 console.log(`${c.green}Installation complete!${c.reset}`);
 console.log('');
 console.log(`Installed to ${c.cyan}${CLAUDE_DIR}${c.reset}:`);
-console.log(' commands/df/ — /df:spec, /df:plan, /df:execute, /df:verify');
+console.log(' commands/df/ — /df:discover, /df:debate, /df:spec, /df:plan, /df:execute, /df:verify');
 console.log(' skills/ — gap-discovery, atomic-commits, code-completeness');
 console.log(' agents/ — reasoner');
 if (level === 'global') {
@@ -165,7 +165,7 @@ async function main() {
 console.log(' 1. claude');
 }
 console.log(' 2. Describe what you want to build');
-console.log(' 3. /df:
+console.log(' 3. /df:discover feature-name');
 console.log('');
 }

package/package.json
CHANGED

-"version": "0.1.44",
+"version": "0.1.46",

package/src/commands/df/debate.md
ADDED

@@ -0,0 +1,283 @@
+# /df:debate — Multi-Perspective Analysis
+
+## Orchestrator Role
+
+You coordinate reasoner agents to debate a problem from multiple perspectives, then synthesize their arguments into a structured document.
+
+**NEVER:** Read source files, use Glob/Grep directly, run git, use TaskOutput, use `run_in_background`, use Explore agents
+
+**ONLY:** Spawn reasoner agents (non-background), write debate file, respond conversationally
+
+---
+
+## Purpose
+Generate a multi-perspective analysis of a problem before formalizing into a spec. Surfaces tensions, trade-offs, and blind spots that a single perspective would miss.
+
+## Usage
+```
+/df:debate <name>
+```
+
+## Skills & Agents
+
+**Use Task tool to spawn agents:**
+| Agent | subagent_type | model | Purpose |
+|-------|---------------|-------|---------|
+| User Advocate | `reasoner` | `opus` | UX, simplicity, real user needs |
+| Tech Skeptic | `reasoner` | `opus` | Technical risks, hidden complexity, feasibility |
+| Systems Thinker | `reasoner` | `opus` | Integration, scalability, long-term effects |
+| LLM Efficiency | `reasoner` | `opus` | Token density, minimal scaffolding, navigable structure |
+| Synthesizer | `reasoner` | `opus` | Merge perspectives into consensus + tensions |
+
+---
+
+## Behavior
+
+### 1. SUMMARIZE
+
+Summarize the conversation context (from prior discover/conversation) in ~200 words. This summary will be passed to each perspective agent.
+
+The summary should capture:
+- The core problem being solved
+- Key requirements mentioned
+- Constraints and boundaries
+- User's stated preferences and priorities
+
+### 2. SPAWN PERSPECTIVES
+
+**Spawn ALL 4 perspective agents in ONE message (non-background, parallel):**
+
+Each agent receives the same context summary but a different role. Each must:
+- Argue from their perspective
+- Identify risks the other perspectives might miss
+- Propose concrete alternatives where they disagree with the likely approach
+
+```python
+# All 4 in a single message — parallel, non-background:
+Task(subagent_type="reasoner", model="opus", prompt="""
+You are the USER ADVOCATE in a design debate.
+
+## Context
+{summary}
+
+## Your Role
+Argue from the perspective of the end user. Focus on:
+- Simplicity and ease of use
+- Real user needs vs assumed needs
+- Friction points and cognitive load
+- Whether the solution matches how users actually think
+
+Provide:
+1. Your key arguments (3-5 points)
+2. Risks you see from a user perspective
+3. Concrete alternatives if you disagree with the current direction
+
+Keep response under 400 words.
+""")
+
+Task(subagent_type="reasoner", model="opus", prompt="""
+You are the TECH SKEPTIC in a design debate.
+
+## Context
+{summary}
+
+## Your Role
+Challenge technical assumptions and surface hidden complexity. Focus on:
+- What could go wrong technically
+- Hidden dependencies or coupling
+- Complexity that seems simple but isn't
+- Maintenance burden over time
+
+Provide:
+1. Your key arguments (3-5 points)
+2. Technical risks others might overlook
+3. Simpler alternatives worth considering
+
+Keep response under 400 words.
+""")
+
+Task(subagent_type="reasoner", model="opus", prompt="""
+You are the SYSTEMS THINKER in a design debate.
+
+## Context
+{summary}
+
+## Your Role
+Analyze how this fits into the broader system. Focus on:
+- Integration with existing components
+- Scalability implications
+- Second-order effects and unintended consequences
+- Long-term evolution and extensibility
+
+Provide:
+1. Your key arguments (3-5 points)
+2. Systemic risks and ripple effects
+3. Architectural alternatives worth considering
+
+Keep response under 400 words.
+""")
+
+Task(subagent_type="reasoner", model="opus", prompt="""
+You are the LLM EFFICIENCY expert in a design debate.
+
+## Context
+{summary}
+
+## Your Role
+Evaluate from the perspective of LLM consumption and interaction. Focus on:
+- Token density: can the output be consumed efficiently by LLMs?
+- Minimal scaffolding: avoid ceremony that adds tokens without information
+- Navigable structure: can an LLM quickly find what it needs?
+- Attention budget: does the design respect limited context windows?
+
+Provide:
+1. Your key arguments (3-5 points)
+2. Efficiency risks others might not consider
+3. Alternatives that optimize for LLM consumption
+
+Keep response under 400 words.
+""")
+```
+
+### 3. SYNTHESIZE
+
+After all 4 perspectives return, spawn 1 additional reasoner to synthesize:
+
+```python
+Task(subagent_type="reasoner", model="opus", prompt="""
+You are the SYNTHESIZER. Four perspectives have debated a design problem.
+
+## Context
+{summary}
+
+## User Advocate's Arguments
+{user_advocate_response}
+
+## Tech Skeptic's Arguments
+{tech_skeptic_response}
+
+## Systems Thinker's Arguments
+{systems_thinker_response}
+
+## LLM Efficiency's Arguments
+{llm_efficiency_response}
+
+## Your Task
+Synthesize these perspectives into:
+
+1. **Consensus** — Points where all or most perspectives agree
+2. **Tensions** — Unresolved disagreements and genuine trade-offs
+3. **Open Decisions** — Questions that need human judgment to resolve
+4. **Recommendation** — Your balanced recommendation considering all perspectives
+
+Be specific. Name the tensions, don't smooth them over.
+
+Keep response under 500 words.
+""")
+```
+
+### 4. WRITE DEBATE FILE
+
+Create `specs/.debate-{name}.md`:
+
+```markdown
+# Debate: {Name}
+
+## Context
+[~200 word summary from step 1]
+
+## Perspectives
+
+### User Advocate
+[arguments from agent]
+
+### Tech Skeptic
+[arguments from agent]
+
+### Systems Thinker
+[arguments from agent]
+
+### LLM Efficiency
+[arguments from agent]
+
+## Synthesis
+
+### Consensus
+[from synthesizer]
+
+### Tensions
+[from synthesizer]
+
+### Open Decisions
+[from synthesizer]
+
+### Recommendation
+[from synthesizer]
+```
+
+### 5. CONFIRM
+
+After writing the file, present a brief summary to the user:
+
+```
+✓ Created specs/.debate-{name}.md
+
+Key tensions:
+- [tension 1]
+- [tension 2]
+
+Open decisions:
+- [decision 1]
+- [decision 2]
+
+Next: Run /df:spec {name} to formalize into a specification
+```
+
+---
+
+## Rules
+
+- **All 4 perspective agents MUST be spawned in ONE message** (parallel, non-background)
+- **NEVER use `run_in_background`** — causes late notifications that pollute output
+- **NEVER use TaskOutput** — returns full transcripts that explode context
+- **NEVER use Explore agents** — this command doesn't read code
+- **NEVER read source files directly** — agents receive context via prompt only
+- Reasoner agents receive context through their prompt, not by reading files
+- The debate file goes in `specs/` so `/df:spec` can reference it
+- File name MUST be `.debate-{name}.md` (dot prefix = auxiliary file)
+- Keep each perspective under 400 words, synthesis under 500 words
+
+## Example
+
+```
+USER: /df:debate auth
+
+CLAUDE: Let me summarize what we've discussed and get multiple perspectives
+on the authentication design.
+
+[Summarizes: ~200 words about auth requirements from conversation]
+
+[Spawns 4 reasoner agents in parallel — User Advocate, Tech Skeptic,
+Systems Thinker, LLM Efficiency]
+
+[All 4 return their arguments]
+
+[Spawns synthesizer agent with all 4 perspectives]
+
+[Synthesizer returns consensus, tensions, open decisions, recommendation]
+
+[Writes specs/.debate-auth.md]
+
+✓ Created specs/.debate-auth.md
+
+Key tensions:
+- OAuth complexity vs simpler API key approach
+- User convenience (social login) vs privacy concerns
+- Centralized auth service vs per-route middleware
+
+Open decisions:
+- Session storage strategy (JWT vs server-side)
+- Token expiration policy
+
+Next: Run /df:spec auth to formalize into a specification
+```

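The synthesis step above assembles one prompt from the context summary and the four perspective responses. A minimal sketch of that assembly, for illustration only — `build_synthesis_prompt` and the response variables are hypothetical helpers, not part of the package:

```python
# Sketch: build the synthesizer prompt from the four perspective
# responses, mirroring the template in the debate.md section above.
def build_synthesis_prompt(summary, responses):
    # responses: dict mapping perspective name -> agent response text
    sections = [f"## Context\n{summary}"]
    for name in ("User Advocate", "Tech Skeptic", "Systems Thinker", "LLM Efficiency"):
        sections.append(f"## {name}'s Arguments\n{responses[name]}")
    sections.append(
        "## Your Task\n"
        "Synthesize these perspectives into:\n"
        "1. Consensus\n2. Tensions\n3. Open Decisions\n4. Recommendation\n"
        "Keep response under 500 words."
    )
    return "\n\n".join(sections)
```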
package/src/commands/df/discover.md
ADDED

@@ -0,0 +1,182 @@
+# /df:discover — Deep Problem Exploration
+
+## Orchestrator Role
+
+You are a Socratic questioner. Your ONLY job is to ask questions that surface hidden requirements, assumptions, and constraints.
+
+**NEVER:** Read source files, use Glob/Grep, spawn agents, create files, run git, use TaskOutput, use Task tool
+
+**ONLY:** Ask questions using `AskUserQuestion` tool, respond conversationally
+
+---
+
+## Purpose
+Explore a problem space deeply before formalizing into specs. Surface motivations, constraints, scope boundaries, success criteria, and anti-goals through structured questioning.
+
+## Usage
+```
+/df:discover <name>
+```
+
+## Behavior
+
+Work through these phases organically. You don't need to announce phases — let the conversation flow naturally. Move to the next phase when the current one feels sufficiently explored.
+
+### Phase 1: MOTIVATION
+Why does this need to exist? What problem does it solve? Who suffers without it?
+
+Example questions:
+- What triggered the need for this?
+- Who will use this and what's their current workaround?
+- What happens if we don't build this?
+
+### Phase 2: CONTEXT
+What already exists? What has been tried? What's the current state?
+
+Example questions:
+- Is there existing code or infrastructure that relates to this?
+- Have you tried solving this before? What worked/didn't?
+- Are there external systems or APIs involved?
+
+### Phase 3: SCOPE
+What's in? What's out? What's the minimum viable version?
+
+Example questions:
+- What's the smallest version that would be useful?
+- What features feel essential vs nice-to-have?
+- Are there parts you explicitly want to exclude?
+
+### Phase 4: CONSTRAINTS
+Technical limits, time pressure, resource boundaries?
+
+Example questions:
+- Are there performance requirements or SLAs?
+- What technologies are non-negotiable?
+- Is there a deadline or timeline pressure?
+
+### Phase 5: SUCCESS
+How do we know it worked? What does "done" look like?
+
+Example questions:
+- How will you verify this works correctly?
+- What metrics would indicate success?
+- What would make you confident enough to ship?
+
+### Phase 6: ANTI-GOALS
+What should we explicitly NOT do? What traps to avoid?
+
+Example questions:
+- What's the most common way this kind of feature gets over-engineered?
+- Are there approaches you've seen fail elsewhere?
+- What should we explicitly avoid building?
+
+---
+
+## Rules
+
+### Questioning Rules
+- Use `AskUserQuestion` tool for structured questions with options
+- Maximum **4 questions per `AskUserQuestion` call** (tool limit)
+- Headers must be **≤12 characters**
+- Mix structured questions (AskUserQuestion) with conversational follow-ups
+- Ask follow-up questions based on answers — don't just march through phases mechanically
+- Go deeper on surprising or unclear answers
+
+### Behavioral Rules
+- **NEVER assume** — if something is ambiguous, ask
+- **NEVER suggest ending** — the user decides when they're done
+- **NEVER take action** — no code reading, no file creation, no agents
+- **NEVER skip phases** — but adapt depth based on the problem
+- Keep your responses short between questions — don't lecture
+- Acknowledge answers briefly before asking the next question
+
+### When the User Wants to Move On
+When the user signals they want to advance (e.g., "I think that's enough", "let's move on", "ready for next step"):
+
+```
+Great, we've covered a lot of ground. To analyze this from multiple perspectives, run:
+
+/df:debate {name}
+
+Or if you want to go straight to a spec:
+
+/df:spec {name}
+```
+
+## Example
+
+```
+USER: /df:discover auth
+
+CLAUDE: Let's explore the authentication problem space.
+
+[Uses AskUserQuestion]
+{
+  "questions": [
+    {
+      "question": "What's driving the need for auth right now?",
+      "header": "Motivation",
+      "multiSelect": false,
+      "options": [
+        {"label": "Security gap", "description": "Current system has no auth or weak auth"},
+        {"label": "New users", "description": "Opening up to users who need access control"},
+        {"label": "Compliance", "description": "Regulatory or policy requirement"}
+      ]
+    },
+    {
+      "question": "Who are the primary users that will authenticate?",
+      "header": "Users",
+      "multiSelect": true,
+      "options": [
+        {"label": "End users", "description": "Public-facing users of the product"},
+        {"label": "Internal team", "description": "Your team or organization members"},
+        {"label": "API clients", "description": "Other services or integrations"}
+      ]
+    }
+  ]
+}
+
+USER: [Selects: Security gap, End users + API clients]
+
+CLAUDE: So there's a real security gap and you need to cover both human users
+and machine clients. That's two different auth flows to think about.
+
+[Uses AskUserQuestion]
+{
+  "questions": [
+    {
+      "question": "Do end users currently have any way to identify themselves?",
+      "header": "Current state",
+      "multiSelect": false,
+      "options": [
+        {"label": "Nothing", "description": "Completely open, no identification"},
+        {"label": "Basic auth", "description": "Simple username/password exists"},
+        {"label": "Third-party", "description": "Using external auth provider already"}
+      ]
+    },
+    {
+      "question": "For API clients, what authentication exists today?",
+      "header": "API auth",
+      "multiSelect": false,
+      "options": [
+        {"label": "None", "description": "APIs are open"},
+        {"label": "API keys", "description": "Static keys in headers"},
+        {"label": "OAuth/JWT", "description": "Token-based auth already in place"}
+      ]
+    }
+  ]
+}
+
+[... continues through phases based on answers ...]
+
+USER: I think we've covered everything, let's move on.
+
+CLAUDE: Great, we've covered a lot of ground. To analyze this from multiple
+perspectives, run:
+
+/df:debate auth
+
+Or if you want to go straight to a spec:
+
+/df:spec auth
+```
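The questioning rules above impose hard limits (at most 4 questions per `AskUserQuestion` call, headers of 12 characters or fewer). A small validator sketch makes those limits concrete — `validate_ask_payload` is a hypothetical helper for illustration, not part of the package:

```python
# Sketch: check an AskUserQuestion payload against the limits stated
# in the Questioning Rules (max 4 questions, header <= 12 chars).
def validate_ask_payload(payload):
    errors = []
    questions = payload.get("questions", [])
    if len(questions) > 4:
        errors.append(f"too many questions: {len(questions)} > 4")
    for q in questions:
        header = q.get("header", "")
        if len(header) > 12:
            errors.append(f"header too long: {header!r} ({len(header)} > 12)")
    return errors
```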
package/src/commands/df/execute.md
CHANGED

@@ -137,8 +137,10 @@ experiment_file: ".deepflow/experiments/upload--streaming--failed.md"
 }
 ```
 
+Note: `completed_tasks` is kept for backward compatibility but is now derivable from PLAN.md `[x]` entries. The native task system (TaskList) is the primary source for runtime task status.
+
 **On checkpoint:** Complete wave → update PLAN.md → save to worktree → exit.
-**Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks.
+**Resume:** `--continue` loads checkpoint, verifies worktree, skips completed tasks. Native tasks are re-registered for remaining `[ ]` items only.
 
 ## Behavior
 
@@ -188,6 +190,30 @@ Load: PLAN.md (required), specs/doing-*.md, .deepflow/config.yaml
 If missing: "No PLAN.md found. Run /df:plan first."
 ```
 
+### 2.5. REGISTER NATIVE TASKS
+
+Parse PLAN.md and create native tasks for tracking, dependency management, and UI spinners.
+
+**For each uncompleted task (`[ ]`) in PLAN.md:**
+
+```
+1. TaskCreate:
+   - subject: "{task_id}: {description}" (e.g. "T1: Create upload endpoint")
+   - description: Full task block from PLAN.md (files, blocked by, type, etc.)
+   - activeForm: "{gerund form of description}" (e.g. "Creating upload endpoint")
+
+2. Store mapping: PLAN.md task_id (T1) → native task ID
+```
+
+**After all tasks created, set up dependencies:**
+
+```
+For each task with "Blocked by: T{n}, T{m}":
+  TaskUpdate(taskId: native_id, addBlockedBy: [native_id_of_Tn, native_id_of_Tm])
+```
+
+**On `--continue`:** Only create tasks for remaining `[ ]` items (skip `[x]` completed).
+
 ### 3. CHECK FOR UNPLANNED SPECS
 
 Warn if `specs/*.md` (excluding doing-/done-) exist. Non-blocking.
@@ -244,12 +270,30 @@ Topic extraction:
 
 ### 5. IDENTIFY READY TASKS
 
-
+Use TaskList to find ready tasks (replaces manual PLAN.md parsing):
+
+```
+Ready = TaskList results where:
+- status: "pending"
+- blockedBy: empty (auto-unblocked by native dependency system)
+```
+
+**Cross-check with experiment validation** (for spike-blocked tasks):
+- If task depends on spike AND experiment not `--passed.md` → still blocked
+- TaskUpdate to add spike as blocker if not already set
+
+Ready = TaskList pending + empty blockedBy + experiment validated (if applicable).
 
 ### 6. SPAWN AGENTS
 
 Context ≥50%: checkpoint and exit.
 
+**Before spawning each agent**, mark its native task as in_progress:
+```
+TaskUpdate(taskId: native_id, status: "in_progress")
+```
+This activates the UI spinner showing the task's activeForm (e.g. "Creating upload endpoint").
+
 **CRITICAL: Spawn ALL ready tasks in a SINGLE response with MULTIPLE Task tool calls.**
 
 DO NOT spawn one task, wait, then spawn another. Instead, call Task tool multiple times in the SAME message block. This enables true parallelism.
@@ -319,8 +363,15 @@ Then rename experiment:
 
 **Gate:**
 ```
-VERIFIED_PASS →
-
+VERIFIED_PASS →
+  TaskUpdate(taskId: spike_native_id, status: "completed")
+  # Native system auto-unblocks dependent tasks
+  Log "✓ Spike {task_id} verified"
+
+VERIFIED_FAIL →
+  # Spike task stays as pending, dependents remain blocked
+  # No TaskUpdate needed — native system keeps them blocked
+  Log "✗ Spike {task_id} failed verification"
 If override: log "⚠ Agent incorrectly marked as passed"
 ```
 
@@ -390,6 +441,12 @@ Rules:
 
 When a task fails and cannot be auto-fixed:
 
+**Native task update:**
+```
+TaskUpdate(taskId: native_id, status: "pending") # Reset to pending, not deleted
+```
+This keeps the task visible for retry. Dependent tasks remain blocked.
+
 **Behavior:**
 1. Leave worktree intact at `{worktree_path}`
 2. Keep checkpoint.json for potential resume
@@ -434,9 +491,11 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l
 
 **Per notification:**
 1. Read result file for the completed agent
-2.
-3.
-4.
+2. TaskUpdate(taskId: native_id, status: "completed") — auto-unblocks dependent tasks
+3. Update PLAN.md: `[ ]` → `[x]` + commit hash (as before)
+4. Report ONE line: "✓ Tx: status (commit)"
+5. If NOT all wave agents done → end turn, wait
+6. If ALL wave agents done → use TaskList to find newly unblocked tasks, check context, spawn next wave or finish
 
 **Between waves:** Check context %. If ≥50%, checkpoint and exit.
 
@@ -456,18 +515,41 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l
 
 ```
 /df:execute (context: 12%)
-
+
+Loading PLAN.md...
+T1: Create upload endpoint (ready)
+T2: Add S3 service (blocked by T1)
+T3: Add auth guard (blocked by T1)
+
+Registering native tasks...
+TaskCreate → T1 (native: task-001)
+TaskCreate → T2 (native: task-002)
+TaskCreate → T3 (native: task-003)
+TaskUpdate(task-002, addBlockedBy: [task-001])
+TaskUpdate(task-003, addBlockedBy: [task-001])
+
+Spawning Wave 1: T1
+TaskUpdate(task-001, status: "in_progress") ← spinner: "Creating upload endpoint"
 
 [Agent "T1" completed]
-
+TaskUpdate(task-001, status: "completed") ← auto-unblocks task-002, task-003
+✓ T1: success (abc1234)
+
+TaskList → task-002, task-003 now ready (blockedBy empty)
+
+Spawning Wave 2: T2, T3 parallel
+TaskUpdate(task-002, status: "in_progress")
+TaskUpdate(task-003, status: "in_progress")
 
 [Agent "T2" completed]
-
+TaskUpdate(task-002, status: "completed")
+✓ T2: success (def5678)
 
 [Agent "T3" completed]
-
+TaskUpdate(task-003, status: "completed")
+✓ T3: success (ghi9012)
 
-Wave
+Wave 2 complete (2/2). Context: 35%
 
 ✓ doing-upload → done-upload
 ✓ Complete: 3/3 tasks
@@ -480,27 +562,43 @@ Next: Run /df:verify to verify specs and merge to main
 ```
 /df:execute (context: 10%)
 
+Loading PLAN.md...
+Registering native tasks...
+TaskCreate → T1 [SPIKE] (native: task-001)
+TaskCreate → T2 (native: task-002)
+TaskCreate → T3 (native: task-003)
+TaskUpdate(task-002, addBlockedBy: [task-001])
+TaskUpdate(task-003, addBlockedBy: [task-001])
+
 Checking experiment status...
 T1 [SPIKE]: No experiment yet, spike executable
 T2: Blocked by T1 (spike not validated)
 T3: Blocked by T1 (spike not validated)
 
-Spawning Wave 1: T1 [SPIKE]
+Spawning Wave 1: T1 [SPIKE]
+TaskUpdate(task-001, status: "in_progress")
 
 [Agent "T1 SPIKE" completed]
 ✓ T1: complete, verifying...
 
 Verifying T1...
 ✓ Spike T1 verified (throughput 8500 >= 7000)
+TaskUpdate(task-001, status: "completed") ← auto-unblocks task-002, task-003
 → upload--streaming--passed.md
 
-
+TaskList → task-002, task-003 now ready
+
+Spawning Wave 2: T2, T3 parallel
+TaskUpdate(task-002, status: "in_progress")
+TaskUpdate(task-003, status: "in_progress")
 
 [Agent "T2" completed]
-
+TaskUpdate(task-002, status: "completed")
+✓ T2: success (def5678)
 
 [Agent "T3" completed]
-
+TaskUpdate(task-003, status: "completed")
+✓ T3: success (ghi9012)
 
 Wave 2 complete (2/2). Context: 40%
 
@@ -515,11 +613,16 @@ Next: Run /df:verify to verify specs and merge to main
 ```
 /df:execute (context: 10%)
 
+Registering native tasks...
+TaskCreate → T1 [SPIKE], T2, T3 (with dependencies)
+
 Wave 1: T1 [SPIKE] (context: 15%)
+TaskUpdate(task-001, status: "in_progress")
 T1: complete, verifying...
 
 Verifying T1...
 ✗ Spike T1 failed verification (throughput 1500 < 7000)
+# Spike stays pending — dependents remain blocked
 → upload--streaming--failed.md
 
 ⚠ Spike T1 invalidated hypothesis
@@ -533,12 +636,17 @@ Next: Run /df:plan to generate new hypothesis spike
 ```
 /df:execute (context: 10%)
 
+Registering native tasks...
+TaskCreate → T1 [SPIKE], T2, T3 (with dependencies)
+
 Wave 1: T1 [SPIKE] (context: 15%)
+TaskUpdate(task-001, status: "in_progress")
 T1: complete (agent said: success), verifying...
 
 Verifying T1...
 ✗ Spike T1 failed verification (throughput 1500 < 7000)
 ⚠ Agent incorrectly marked as passed — overriding to FAILED
+TaskUpdate(task-001, status: "pending") ← reset, dependents stay blocked
 → upload--streaming--failed.md
 
 ⚠ Spike T1 invalidated hypothesis
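The task-registration step above keys everything off PLAN.md checkbox state (`[ ]` pending vs `[x]` completed). A sketch of that parse — the regex, the function, and the exact line format are illustrative assumptions, not the package's implementation:

```python
import re

# Sketch: extract remaining (unchecked) tasks from PLAN.md lines of the
# assumed form "- [ ] **T1**: Create upload endpoint". Checked "[x]"
# tasks are skipped, matching the --continue behavior described above.
TASK_RE = re.compile(r"^- \[( |x)\] \*\*(T\d+)\*\*: (.+)$")

def remaining_tasks(plan_text):
    tasks = []
    for line in plan_text.splitlines():
        m = TASK_RE.match(line)
        if m and m.group(1) == " ":  # only uncompleted items
            tasks.append((m.group(2), m.group(3)))
    return tasks
```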
package/src/commands/df/spec.md
CHANGED

@@ -31,6 +31,8 @@ Transform conversation context into a structured specification file.
 
 ### 1. GATHER CODEBASE CONTEXT
 
+**Check for debate file first:** If `specs/.debate-{name}.md` exists, read it using the Read tool. Pass its content (especially the Synthesis section) to the reasoner agent in step 3 as additional context. The debate file contains multi-perspective analysis that should inform requirements and constraints.
+
 **NEVER use `run_in_background` for Explore agents** — causes late "Agent completed" notifications that pollute output after work is done.
 
 **NEVER use TaskOutput** — returns full agent transcripts (100KB+) that explode context.
package/src/commands/df/verify.md
CHANGED

@@ -51,13 +51,20 @@ Report per spec: requirements count, acceptance count, quality issues.
 
 **If all pass:** Proceed to Post-Verification merge.
 
-**If issues found:** Add fix tasks to PLAN.md in the worktree and loop back to execute:
+**If issues found:** Add fix tasks to PLAN.md in the worktree and register as native tasks, then loop back to execute:
 
 1. Discover worktree (same logic as Post-Verification step 1)
 2. Write new fix tasks to `{worktree_path}/PLAN.md` under the existing spec section
    - Task IDs continue from last (e.g. if T9 was last, fixes start at T10)
    - Format: `- [ ] **T10**: Fix {description}` with `Files:` and details
-3.
+3. Register fix tasks as native tasks for immediate tracking:
+   ```
+   For each fix task added:
+     TaskCreate(subject: "T10: Fix {description}", description: "...", activeForm: "Fixing {description}")
+     TaskUpdate(addBlockedBy: [...]) if dependencies exist
+   ```
+   This allows `/df:execute --continue` to find fix tasks via TaskList immediately.
+4. Output report + next step:
 
 ```
 done-upload.md: 4/4 reqs ✓, 3/5 acceptance ✗, 1 quality issue