deepflow 0.1.72 → 0.1.74
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +82 -201
- package/bin/install.js +95 -12
- package/package.json +7 -3
- package/src/commands/df/execute.md +7 -2
- package/src/skills/context-hub/SKILL.md +87 -0
package/README.md
CHANGED

````diff
@@ -8,25 +8,36 @@
 ```
 
 <p align="center">
-  <strong>
+  <strong>Doing reveals what thinking can't predict</strong>
 </p>
 
 <p align="center">
   <a href="#quick-start">Quick Start</a> •
   <a href="#two-modes">Two Modes</a> •
-  <a href="#commands">Commands</a>
+  <a href="#commands">Commands</a> •
+  <a href="#what-deepflow-rejects">What It Rejects</a> •
+  <a href="#principles">Principles</a>
 </p>
 
 ---
 
-##
+## Why Deepflow
 
-
-
--
-
-- **
-- **
+**You can't foresee what you don't know to ask.** Doing reveals — at every layer.
+
+Most spec-driven frameworks start from a finished spec and execute a static plan. Deepflow treats the entire process as discovery: asking reveals hidden requirements, debating reveals blind spots, spiking reveals technical risks, implementing reveals edge cases. Each step makes the next one sharper.
+
+- **Asking reveals what assuming hides** — Before any code, Socratic questioning surfaces the requirements you didn't know you had. Four AI perspectives collide to expose tensions in your approach. The spec isn't written from what you think you know — it's written from what the conversation uncovered.
+- **Spec as living hypothesis** — Core intent stays fixed, details refine through implementation. "The spec becomes bulletproof because you built it, not before."
+- **Parallel probes reveal the best path** — Uncertain approaches spawn parallel spikes in isolated worktrees. The machine selects the winner (fewer regressions > better coverage > fewer files changed). Failed approaches stay recorded and never repeat.
+- **Metrics decide, not opinions** — No LLM judges another LLM. Build, tests, typecheck, lint are the only judges. After an agent commits, the orchestrator runs health checks. Pass = keep. Fail = revert + new hypothesis.
+- **The loop is the product** — Not "execute a plan" — "evolve the codebase toward the spec's goals through iterative cycles." Each cycle reveals what the previous one couldn't see.
+
+## What We Learned by Doing
+
+Deepflow started with adversarial selection: one AI evaluated another AI's code in a fresh context. The "doing reveals" philosophy applied to the system itself — we discovered that **LLM judging LLM produces gaming**: agents that estimated instead of measuring, simulated instead of implementing, presented shortcuts as deliverables.
+
+The fix: eliminate subjective judgment. Only objective metrics decide. Tests created by the agent itself are excluded from the baseline to prevent self-validation. We call this a **ratchet** — inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch): a mechanism where the metric can only improve, never regress. Each cycle ratchets quality forward.
 
 ## Quick Start
 
@@ -38,212 +49,87 @@ npx deepflow
 npx deepflow --uninstall
 ```
 
-
+The installer configures granular permissions so background agents can read, write, run git, and execute health checks (build/test/typecheck/lint) without blocking on approval prompts. All permissions are scoped and cleaned up on uninstall.
 
-
+## Two Modes
 
-### Interactive
+### Interactive (human-in-the-loop)
 
-You
+You explore the problem, shape the spec, and trigger execution — all inside a Claude Code session.
 
 ```bash
 claude
 
-# 1.
+# 1. Discover — understand the problem before solving it
 /df:discover image-upload
+# "Why do you need image upload? What exists today?
+#  What file sizes? What formats? Where are images stored?
+#  What does 'done' look like? What should this NOT do?"
 
-# 2. Debate
+# 2. Debate — stress-test the approach (optional)
 /df:debate upload-strategy
+# User Advocate: "Drag-and-drop is table stakes, not a feature"
+# Tech Skeptic: "Client-side resize before upload, or you'll hit memory limits"
+# Systems Thinker: "What happens when storage goes down mid-upload?"
+# LLM Efficiency: "Split this into two specs: upload + processing"
 
-# 3.
+# 3. Spec — now the conversation is rich enough to produce a solid spec
 /df:spec image-upload
 
-# 4
-/df:plan
-
-#
-/df:execute
-
-# 6. Verify and merge to main
-/df:verify
+# 4-6: the AI takes over
+/df:plan     # Compare spec to code, create tasks
+/df:execute  # Parallel agents in worktree, ratchet validates
+/df:verify   # Check spec satisfied, merge to main
 ```
 
 **What requires you:** Steps 1-3 (defining the problem and approving the spec). Steps 4-6 run autonomously but you trigger each one and can intervene.
 
-### Autonomous
+### Autonomous (unattended)
 
-
-
-```bash
-# You define WHAT (the specs), the AI figures out HOW, overnight
+The human loop comes first — discover and debate are where intent gets shaped. You refine the problem, stress-test ideas, and produce a spec that captures what you actually need. That's the living contract. Then you hand it off.
 
-# Inside Claude Code (requires Agent Teams)
-/df:auto # process all specs in specs/
-```
-
-**What the AI does alone:**
-1. Pre-checks if spec is already satisfied (skips if so)
-2. Discovers specs, respects `depends_on` ordering
-3. Generates N hypotheses for how to implement each spec
-4. Runs parallel spikes in isolated worktrees (one per hypothesis)
-5. Implements the passing approaches
-6. Adversarial selection: a fresh AI context compares approaches by artifacts only (never reads code), picks the best or rejects all
-7. If rejected: generates new hypotheses, retries (up to max-cycles)
-8. On convergence: verifies (L0-L4 gates), creates PR, merges to main
-
-**What you do:** Write specs (via interactive mode or manually) in `specs/`, run `/df:auto` inside Claude Code, read the report at `.deepflow/auto-report.md`. No need to run `/df:plan` first — auto mode promotes plain specs to `doing-*` automatically.
-
-**How to use:**
 ```bash
-#
+# First: the human loop — discover, debate, refine until the spec is solid
 $ claude
 > /df:discover auth
-> /df:
+> /df:debate auth-strategy
+> /df:spec auth   # specs/auth.md — the handoff point
 > /exit
 
-#
+# Then: the AI loop — plan, execute, validate, merge
+$ claude
 > /df:auto
 
-# Next morning
+# Next morning
 $ cat .deepflow/auto-report.md
 $ git log --oneline
 ```
 
-**
+**What the AI does alone:**
+1. Runs `/df:plan` if no PLAN.md exists
+2. Snapshots pre-existing tests (ratchet baseline)
+3. Starts a loop (`/loop 1m /df:auto-cycle`) — fresh context each cycle
+4. Each cycle: picks next task → executes in worktree → runs health checks (build/tests/typecheck/lint)
+5. Pass = commit stands. Fail = revert + retry next cycle
+6. Circuit breaker: halts after N consecutive reverts on same task
+7. When all tasks done: runs `/df:verify`, merges to main
+
+**Safety:** Never pushes to remote. Failed approaches recorded in `.deepflow/experiments/` and never repeated. Specs validated before processing.
 
-###
+### Two Loops, One Handoff
 
 ```
-
+HUMAN LOOP                           AI LOOP
 ───────────────────────────────── ──────────────────────────────────
-
-
-
-
-Read morning report
+/df:discover — ask, surface gaps     /df:plan — compare spec to code
+/df:debate — stress-test approach    /df:execute — spike, implement
+/df:spec — produce living contract   /df:verify — health checks, merge
+↻ refine until solid                 ↻ retry until converged
 ───────────────────────────────── ──────────────────────────────────
 specs/*.md is the handoff point
 ```
 
-
-
-```
-/df:discover <name>
-  | Socratic questioning (motivation, scope, constraints...)
-  v
-/df:debate <topic>   <- optional
-  | 4 perspectives: User Advocate, Tech Skeptic,
-  |   Systems Thinker, LLM Efficiency
-  | Creates specs/.debate-{topic}.md
-  v
-/df:spec <name>
-  | Creates specs/{name}.md from conversation
-  | Validates structure before writing
-  v
-/df:plan
-  | Checks past experiments (learn from failures)
-  | Risky work? -> generates spike task first
-  | Creates PLAN.md with prioritized tasks
-  | Renames: feature.md -> doing-feature.md
-  v
-/df:execute
-  | Creates isolated worktree (main stays clean)
-  | Spike tasks run first, verified before continuing
-  | Parallel agents, file conflicts serialize
-  | Context-aware (>=50% -> checkpoint)
-  v
-/df:verify
-  | Checks requirements met
-  | Merges worktree to main, cleans up
-  | Extracts decisions -> .deepflow/decisions.md
-  | Deletes done-* spec after extraction
-```
-
-## The Flow (Autonomous)
-
-```
-/df:auto
-  | Discover specs (auto-promote, topological sort by depends_on)
-  | For each doing-* spec:
-  |
-  |   Pre-check (Haiku: already satisfied? skip)
-  |   v
-  |   Validate spec (malformed? skip)
-  |   v
-  |   Generate N hypotheses
-  |   v
-  |   Parallel spikes (one worktree per hypothesis)
-  |   | Pass? -> implement in same worktree
-  |   | Fail? -> record experiment, discard
-  |   v
-  |   Adversarial selection (fresh context, artifacts only)
-  |   | Winner? -> verify (L0-L4) -> PR -> merge
-  |   | Reject all? -> new hypotheses, retry
-  |   v
-  | Morning report -> .deepflow/auto-report.md
-```
-
-## Spec Lifecycle
-
-```
-specs/
-  feature.md       -> new, needs /df:plan
-  doing-feature.md -> in progress (active contract between you and the AI)
-  done-feature.md  -> transient (decisions extracted, then deleted)
-```
-
-## Works With Any Project
-
-**Greenfield:** Everything is new, agents create from scratch.
-
-**Ongoing:** Detects existing patterns, follows conventions, integrates with current code.
-
-## Spike-First Planning
-
-For risky or uncertain work, `/df:plan` generates a **spike task** first:
-
-```
-Spike: Validate streaming upload handles 10MB+ files
-  | Run minimal experiment
-  | Pass? -> Unblock implementation tasks
-  | Fail? -> Record learning, generate new hypothesis
-```
-
-Experiments are tracked in `.deepflow/experiments/`. Failed approaches won't be repeated.
-
-## Worktree Isolation
-
-Execution happens in an isolated git worktree:
-- Main branch stays clean during execution
-- On failure, worktree preserved for debugging
-- Resume with `/df:execute --continue`
-- On success, `/df:verify` merges to main and cleans up
-
-## LSP Integration
-
-/df:automatically enables Claude Code's LSP tools during install, giving agents access to `goToDefinition`, `findReferences`, and `workspaceSymbol` for precise code navigation instead of grep-based searching.
-
-- **Global install:** sets `ENABLE_LSP_TOOL=1` in `~/.claude/settings.json`
-- **Project install:** sets it in `.claude/settings.local.json`
-- **Uninstall:** cleans up automatically
-
-Agents prefer LSP tools when available and fall back to Grep/Glob silently. You'll need a language server installed for your language (e.g. `typescript-language-server`, `pyright`, `rust-analyzer`, `gopls`).
-
-## Spec Validation
-
-Specs are validated before downstream consumption by `/df:spec`, `/df:plan`, and `/df:auto`:
-
-- **Hard invariants** (block on failure): required sections present, REQ-N prefixes, checkbox ACs, no duplicate IDs
-- **Advisory warnings** (warn interactively, block in auto mode): long specs, orphaned requirements, excessive technical notes
-
-Run manually: `node hooks/df-spec-lint.js specs/my-spec.md`
-
-## Context-Aware Execution
-
-Statusline shows context usage. At >=50%:
-- Waits for running agents
-- Checkpoints state
-- Resume with `/df:execute --continue`
+**Spec lifecycle:** `feature.md` (new) → `doing-feature.md` (in progress) → `done-feature.md` (decisions extracted, then deleted)
 
 ## Commands
 
@@ -259,7 +145,7 @@ Statusline shows context usage. At >=50%:
 | `/df:consolidate` | Deduplicate and clean up decisions.md |
 | `/df:resume` | Session continuity briefing |
 | `/df:update` | Update deepflow to latest |
-| `/df:auto` | Autonomous
+| `/df:auto` | Autonomous mode (plan → loop → verify, no human needed) |
 
 ## File Structure
 
@@ -273,39 +159,34 @@ your-project/
   +-- config.yaml            # project settings
   +-- decisions.md           # auto-extracted + ad-hoc decisions
   +-- auto-report.md         # morning report (autonomous mode)
-  +-- auto-
-  +-- last-consolidated.json # consolidation timestamp
-  +-- context.json           # context % tracking
+  +-- auto-memory.yaml       # cross-cycle learning
   +-- experiments/           # spike results (pass/fail)
   +-- worktrees/             # isolated execution
     +-- upload/              # one worktree per spec
 ```
 
-##
-
-Create `.deepflow/config.yaml`:
+## What Deepflow Rejects
 
-
-
-
-
+- **Predicting everything before doing** — You discover what you need by building it. TDD assumes you already know the correct behavior before coding. Deepflow assumes that **execution reveals** what planning can't anticipate.
+- **LLM judging LLM** — We started with adversarial selection (AI evaluating AI). We discovered gaming. We replaced it with objective metrics. Deepflow's own evolution proved the principle.
+- **Agents role-playing job titles** — Flat orchestrator + model routing. No PM agent, no QA agent, no Scrum Master agent.
+- **Automated research before understanding** — Conversation with you first. AI research comes after you've defined the problem.
+- **Ceremony** — 6 commands, one flow. Markdown, not schemas. No sprint planning, no story points, no retrospectives.
 
-
-execute:
-  max: 5   # max parallel agents
+## Principles
 
-
-
-
-
+1. **Discover before specifying, spike before implementing** — Ask, debate, probe — then commit
+2. **You define WHAT, AI figures out HOW** — Specs are the contract
+3. **Metrics decide, not opinions** — Build/test/typecheck/lint are the only judges
+4. **Confirm before assume** — Search the code before marking "missing"
+5. **Complete implementations** — No stubs, no placeholders
+6. **Atomic commits** — One task = one commit
+7. **Context-aware** — Checkpoint before limits, resume seamlessly
 
-##
+## More
 
-
-
-3. **Complete implementations** — No stubs, no placeholders
-4. **Atomic commits** — One task = one commit
-5. **Context-aware** — Checkpoint before limits
+- [Concepts](docs/concepts.md) — Philosophy and flow in depth
+- [Configuration](docs/configuration.md) — All options, models, parallelism
 
 ## License
 
````
package/bin/install.js
CHANGED

````diff
@@ -184,13 +184,14 @@ async function main() {
   console.log('');
   console.log(`Installed to ${c.cyan}${CLAUDE_DIR}${c.reset}:`);
   console.log('  commands/df/ — /df:discover, /df:debate, /df:spec, /df:plan, /df:execute, /df:verify, /df:auto, /df:note, /df:resume, /df:update');
-  console.log('  skills/ — gap-discovery, atomic-commits, code-completeness');
+  console.log('  skills/ — gap-discovery, atomic-commits, code-completeness, context-hub');
   console.log('  agents/ — reasoner (/df:auto — autonomous execution via /loop)');
   if (level === 'global') {
     console.log('  hooks/ — statusline, update checker');
   }
   console.log('  hooks/df-spec-* — spec validation (auto-enforced by /df:spec and /df:plan)');
   console.log('  env/ — ENABLE_LSP_TOOL (code navigation via goToDefinition, findReferences, workspaceSymbol)');
+  console.log('  permissions/ — granular allow-list for background agents (git, build, test, read/write)');
   console.log('');
   if (level === 'project') {
     console.log(`${c.dim}Note: Statusline is only available with global install.${c.reset}`);
@@ -252,6 +253,10 @@ async function configureHooks(claudeDir) {
   settings.env.ENABLE_LSP_TOOL = "1";
   log('LSP tool enabled');
 
+  // Configure permissions for background agents
+  configurePermissions(settings);
+  log('Agent permissions configured');
+
   // Configure statusline
   if (settings.statusLine) {
     const answer = await ask(
@@ -319,8 +324,72 @@ function configureProjectSettings(claudeDir) {
   if (!settings.env) settings.env = {};
   settings.env.ENABLE_LSP_TOOL = "1";
 
+  // Configure permissions for background agents
+  configurePermissions(settings);
+
   fs.writeFileSync(settingsPath, JSON.stringify(settings, null, 2));
-  log('LSP tool enabled (project)');
+  log('LSP tool enabled + agent permissions configured (project)');
+}
+
+// Permissions required for background agents to work without blocking
+const DEEPFLOW_PERMISSIONS = [
+  // Agents need to read/write code
+  "Edit",
+  "Write",
+  "Read",
+  // Agents need to search codebase
+  "Glob",
+  "Grep",
+  // Git operations (orchestrator handles worktrees, agents read status)
+  "Bash(git status:*)",
+  "Bash(git diff:*)",
+  "Bash(git add:*)",
+  "Bash(git commit:*)",
+  "Bash(git log:*)",
+  "Bash(git stash:*)",
+  "Bash(git checkout:*)",
+  "Bash(git branch:*)",
+  "Bash(git revert:*)",
+  "Bash(git worktree:*)",
+  "Bash(git ls-files:*)",
+  "Bash(git merge:*)",
+  // Build & test (ratchet health checks)
+  "Bash(npm run build:*)",
+  "Bash(npm test:*)",
+  "Bash(npm run lint:*)",
+  "Bash(npx tsc:*)",
+  "Bash(cargo build:*)",
+  "Bash(cargo test:*)",
+  "Bash(go build:*)",
+  "Bash(go test:*)",
+  "Bash(pytest:*)",
+  "Bash(python -m pytest:*)",
+  "Bash(ruff:*)",
+  "Bash(mypy:*)",
+  // Utility
+  "Bash(node:*)",
+  "Bash(ls:*)",
+  "Bash(cat:*)",
+  "Bash(mkdir:*)",
+  "Bash(date:*)",
+  "Bash(wc:*)",
+  "Bash(head:*)",
+  "Bash(tail:*)",
+];
+
+function configurePermissions(settings) {
+  if (!settings.permissions) settings.permissions = {};
+  if (!settings.permissions.allow) settings.permissions.allow = [];
+
+  const existing = new Set(settings.permissions.allow);
+  let added = 0;
+
+  for (const perm of DEEPFLOW_PERMISSIONS) {
+    if (!existing.has(perm)) {
+      settings.permissions.allow.push(perm);
+      added++;
+    }
+  }
 }
 
 function ask(question) {
@@ -400,6 +469,7 @@ async function uninstall() {
     'skills/atomic-commits',
     'skills/code-completeness',
     'skills/gap-discovery',
+    'skills/context-hub',
     'agents/reasoner.md'
   ];
 
@@ -449,23 +519,30 @@ async function uninstall() {
     }
   }
 
-  // Remove ENABLE_LSP_TOOL from global settings
+  // Remove ENABLE_LSP_TOOL and deepflow permissions from global settings
  if (fs.existsSync(settingsPath)) {
    try {
      const settings = JSON.parse(fs.readFileSync(settingsPath, 'utf8'));
      if (settings.env?.ENABLE_LSP_TOOL) {
        delete settings.env.ENABLE_LSP_TOOL;
        if (settings.env && Object.keys(settings.env).length === 0) delete settings.env;
-        fs.writeFileSync(settingsPath, JSON.stringify(settings, null, 2));
        console.log(`  ${c.green}✓${c.reset} Removed ENABLE_LSP_TOOL from settings`);
      }
+      if (settings.permissions?.allow) {
+        const dfPerms = new Set(DEEPFLOW_PERMISSIONS);
+        settings.permissions.allow = settings.permissions.allow.filter(p => !dfPerms.has(p));
+        if (settings.permissions.allow.length === 0) delete settings.permissions.allow;
+        if (settings.permissions && Object.keys(settings.permissions).length === 0) delete settings.permissions;
+        console.log(`  ${c.green}✓${c.reset} Removed deepflow permissions from settings`);
+      }
+      fs.writeFileSync(settingsPath, JSON.stringify(settings, null, 2));
    } catch (e) {
      // Fail silently
    }
  }
 }
 
-  // Remove ENABLE_LSP_TOOL from project settings.local.json
+  // Remove ENABLE_LSP_TOOL and deepflow permissions from project settings.local.json
  if (level === 'project') {
    const localSettingsPath = path.join(PROJECT_DIR, 'settings.local.json');
    if (fs.existsSync(localSettingsPath)) {
@@ -474,13 +551,19 @@ async function uninstall() {
      if (localSettings.env?.ENABLE_LSP_TOOL) {
        delete localSettings.env.ENABLE_LSP_TOOL;
        if (localSettings.env && Object.keys(localSettings.env).length === 0) delete localSettings.env;
-
-
-
-
-
-
-
+      }
+      if (localSettings.permissions?.allow) {
+        const dfPerms = new Set(DEEPFLOW_PERMISSIONS);
+        localSettings.permissions.allow = localSettings.permissions.allow.filter(p => !dfPerms.has(p));
+        if (localSettings.permissions.allow.length === 0) delete localSettings.permissions.allow;
+        if (localSettings.permissions && Object.keys(localSettings.permissions).length === 0) delete localSettings.permissions;
+      }
+      if (Object.keys(localSettings).length === 0) {
+        fs.unlinkSync(localSettingsPath);
+        console.log(`  ${c.green}✓${c.reset} Removed settings.local.json (empty after cleanup)`);
+      } else {
+        fs.writeFileSync(localSettingsPath, JSON.stringify(localSettings, null, 2));
+        console.log(`  ${c.green}✓${c.reset} Removed deepflow settings from settings.local.json`);
      }
    } catch (e) {
      // Fail silently
````
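The core of the new installer logic is an idempotent allow-list merge, with the uninstaller filtering exactly the same entries back out so user-added permissions survive. A standalone sketch of that round trip (function names mirror the diff; the permission list is shortened for brevity, and `removePermissions` is a hypothetical name for the inline uninstall logic):

```javascript
// Standalone sketch of the settings.permissions.allow merge/remove logic
// from bin/install.js. List shortened; removePermissions is illustrative.
const DEEPFLOW_PERMISSIONS = ["Edit", "Read", "Bash(git status:*)"];

function configurePermissions(settings) {
  if (!settings.permissions) settings.permissions = {};
  if (!settings.permissions.allow) settings.permissions.allow = [];
  // Set-based dedupe: re-running the installer never duplicates entries.
  const existing = new Set(settings.permissions.allow);
  for (const perm of DEEPFLOW_PERMISSIONS) {
    if (!existing.has(perm)) settings.permissions.allow.push(perm);
  }
  return settings;
}

function removePermissions(settings) {
  if (!settings.permissions?.allow) return settings;
  // Filter out only deepflow's entries; anything the user added stays.
  const dfPerms = new Set(DEEPFLOW_PERMISSIONS);
  settings.permissions.allow = settings.permissions.allow.filter(p => !dfPerms.has(p));
  if (settings.permissions.allow.length === 0) delete settings.permissions.allow;
  if (Object.keys(settings.permissions).length === 0) delete settings.permissions;
  return settings;
}

// A pre-existing user entry survives configure → configure → remove.
const s = configurePermissions({ permissions: { allow: ["Bash(docker:*)"] } });
configurePermissions(s); // no duplicates added
removePermissions(s);    // only the user's entry remains
```

The Set-based membership check is what makes install safe to re-run, and filtering against the same constant is what makes uninstall a clean inverse.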
package/package.json
CHANGED

````diff
@@ -1,7 +1,7 @@
 {
   "name": "deepflow",
-  "version": "0.1.
-  "description": "
+  "version": "0.1.74",
+  "description": "Doing reveals what thinking can't predict — spec-driven iterative development for Claude Code",
   "keywords": [
     "claude",
     "claude-code",
@@ -12,7 +12,11 @@
     "specs",
     "tasks",
     "automation",
-    "productivity"
+    "productivity",
+    "ratchet",
+    "autonomous",
+    "spikes",
+    "evolutionary"
   ],
   "author": "saidwafiq",
   "license": "MIT",
````
package/src/commands/df/execute.md
CHANGED

````diff
@@ -24,6 +24,7 @@ Implement tasks from PLAN.md with parallel agents, atomic commits, ratchet-drive
 
 ## Skills & Agents
 - Skill: `atomic-commits` — Clean commit protocol
+- Skill: `context-hub` — Fetch external API docs before coding (when task involves external libraries)
 
 **Use Task tool to spawn agents:**
 | Agent | subagent_type | Purpose |
@@ -453,8 +454,11 @@ Files: {target files}
 Spec: {spec_name}
 
 Steps:
-1.
-
+1. If the task involves external APIs/SDKs, run: chub search "<library>" --json → chub get <id> --lang <lang>
+   Use fetched docs as ground truth for API signatures. Annotate any gaps: chub annotate <id> "note"
+   Skip this step if chub is not installed or the task only touches internal code.
+2. Implement the task
+3. Commit as feat({spec}): {description}
 
 Your ONLY job is to write code and commit. The orchestrator will run health checks after you finish.
 ```
@@ -546,6 +550,7 @@ After spawning wave agents, your turn ENDS. Completion notifications drive the l
 | Machine-selected winner | Fewer regressions > better coverage > fewer files changed; no LLM judge |
 | Failed probe insights logged | `.deepflow/auto-memory.yaml` in main tree; persists across cycles |
 | Winner cherry-picked to shared worktree | Downstream tasks see winning approach via shared worktree |
+| External APIs → chub first | Agents fetch curated docs before implementing external API calls; skip if chub unavailable |
 
 ## Example
 
````
package/src/skills/context-hub/SKILL.md
ADDED

````diff
@@ -0,0 +1,87 @@
+---
+name: context-hub
+description: Fetches curated API docs for external libraries before coding. Use when implementing code that uses external APIs/SDKs (Stripe, OpenAI, MongoDB, etc.) to avoid hallucinating APIs and reduce token usage.
+---
+
+# Context Hub
+
+Fetch curated, versioned docs for external libraries instead of guessing APIs.
+
+## When to Use
+
+Before writing code that calls an external API or SDK:
+- New library integration (e.g., Stripe payments, AWS S3)
+- Unfamiliar API version or method
+- Complex API with many options (e.g., MongoDB aggregation)
+
+**Skip when:** Working with internal code (use LSP instead) or well-known stdlib APIs.
+
+## Prerequisites
+
+Requires `chub` CLI: `npm install -g @aisuite/chub`
+
+If `chub` is not installed, tell the user and skip — don't block implementation.
+
+## Workflow
+
+### 1. Search for docs
+
+```bash
+chub search "<library or API>" --json
+```
+
+Example:
+```bash
+chub search "stripe payments" --json
+chub search "mongodb aggregation" --json
+```
+
+### 2. Fetch relevant docs
+
+```bash
+chub get <id> --lang <py|js|ts>
+```
+
+Use `--lang` matching the project language. Use `--full` only if the summary lacks what you need.
+
+### 3. Write code using fetched docs
+
+Use the retrieved documentation as ground truth for API signatures, parameter names, and patterns.
+
+### 4. Annotate discoveries
+
+When you find something the docs missed or got wrong:
+
+```bash
+chub annotate <id> "Note: method X requires param Y since v2.0"
+```
+
+This persists locally and appears on future `chub get` calls — the agent learns across sessions.
+
+### 5. Rate docs (optional)
+
+```bash
+chub feedback <id> up --label accurate
+chub feedback <id> down --label outdated
+```
+
+Labels: `accurate`, `outdated`, `incomplete`, `wrong-version`, `helpful`
+
+## Integration with LSP
+
+| Need | Tool |
+|------|------|
+| Internal code navigation | LSP (`goToDefinition`, `findReferences`) |
+| External API signatures | Context Hub (`chub get`) |
+| Symbol search in project | LSP (`workspaceSymbol`) |
+| Library usage patterns | Context Hub (`chub search`) |
+
+**Combined approach:** Use LSP to understand how the project currently uses a library, then use Context Hub to verify correct API usage and discover better patterns.
+
+## Rules
+
+- Always search before implementing external API calls
+- Trust chub docs over training data for API specifics
+- Annotate gaps so future sessions benefit
+- Don't block on chub failures — fall back to best knowledge
+- Prefer `--json` flag for programmatic parsing in automated workflows
````