buildflow-dev 4.0.1 → 4.0.2

package/README.md CHANGED
@@ -95,10 +95,10 @@ These are installed into your AI tool and triggered by typing `/` (or `@` / `$`
95
95
  | Command | Agent | Purpose | Token Cost |
96
96
  |---------|-------|---------|-----------|
97
97
  | `/buildflow-start` | Strategist | Begin project: vision questions, pruning of stale context, saves to `core/vision.md` | ~8K |
98
- | `/buildflow-think [topic]` | Researcher × 3 + Synthesizer | Parallel web research on a topic, synthesized into a recommendation | ~30K |
99
- | `/buildflow-spec` | Strategist | **NEW** — Generate formal PRD + Technical Design + Acceptance Criteria. Required before planning | ~18K |
100
- | `/buildflow-plan [phase]` | Architect | Reads specs, maps tasks to ACs, groups into dependency waves, checks full AC coverage | ~20K |
101
- | `/buildflow-build [wave]` | Builder × N + Reviewer | Execute waves with context-isolated Builders; each wave auto-tests, auto-fixes, and only advances when green | ~50K/wave |
98
+ | `/buildflow-think [topic]` | Researcher × 3 + Synthesizer | Research + `--arch` (architecture review) + `--build-vs-buy` + `--debt` + `--complexity` modes | ~30K |
99
+ | `/buildflow-spec` | Strategist | Generate user-story-backed PRD + TDD + ACs with Spec Critic self-review pass. Required before planning | ~20K |
100
+ | `/buildflow-plan [phase]` | Architect | AC-traced tasks, HARD/SOFT/EXTERNAL dependency reasoning, effort estimates, risk sequencing, Engineering Review | ~22K |
101
+ | `/buildflow-build [wave]` | Builder × N + Reviewer | Context packets with closest-example + before/after contracts. Auto-test, auto-fix, PR-ready commits per wave | ~50K/wave |
102
102
  | `/buildflow-test [wave]` | Reviewer | Standalone test + fix loop — re-verify a wave or test a manual change | ~25K |
103
103
  | `/buildflow-check` | Reviewer × 4 | Spec compliance + correctness + quality + security in parallel | ~22K |
104
104
  | `/buildflow-ship` | Strategist + Security Auditor | Spec gate + security gate + context pruning + git tag | ~22K |
@@ -107,8 +107,8 @@ These are installed into your AI tool and triggered by typing `/` (or `@` / `$`
107
107
 
108
108
  | Command | Agent | Purpose | Token Cost |
109
109
  |---------|-------|---------|-----------|
110
- | `/buildflow-onboard` | Cartographer | One-time analysis: writes `MAP.md`, `PATTERNS.md`, `DEPENDENCIES.md`, `HOTSPOTS.md` | ~35K |
111
- | `/buildflow-modify "description"` | Surgeon | Surgical change with blast-radius analysis and restore point; use for features **and bugfixes** | ~30K |
110
+ | `/buildflow-onboard` | Cartographer | Deep analysis: import graph, module boundaries, load-bearing files, risk scores → MAP/GRAPH/PATTERNS/DEPENDENCIES/HOTSPOTS | ~40K |
111
+ | `/buildflow-modify "description"` | Surgeon | Full transitive impact chain + risk scores + test coverage map + API contract check + surgical change | ~30K |
112
112
  | `/buildflow-refactor [scope]` | Surgeon + Reviewer | Improve code quality without changing behavior | ~40K |
113
113
 
114
114
  **`/buildflow-modify` works for both features and bugs.** Pass a plain-English description either way:
@@ -460,7 +460,7 @@ buildflow-dev/
460
460
  │ │ followed by numbered steps the agent follows.
461
461
  │ │
462
462
  │ ├── start.md Vision gathering, mode detection, light.md pruning on session start
463
- │ ├── think.md Parallel research with up to 3 Researcher agents
463
+ │ ├── think.md Parallel research + architecture review + build-vs-buy + debt + complexity modes
464
464
  │ ├── spec.md Generate PRD + TDD + Acceptance Criteria (required before plan)
465
465
  │ ├── plan.md AC-traced dependency mapping → wave-based execution plan
466
466
  │ ├── build.md Wave-by-wave parallel Builder execution
@@ -468,8 +468,8 @@ buildflow-dev/
468
468
  │ ├── check.md 3-reviewer parallel quality check
469
469
  │ ├── ship.md Spec gate + security gate + context pruning → retro → git tag
470
470
  │ ├── hotfix.md Fast-path fix — no spec, no plan, restore point → fix → test → commit
471
- │ ├── onboard.md One-time codebase analysis → MAP/PATTERNS/DEPENDENCIES/HOTSPOTS
472
- │ ├── modify.md Surgical code change with blast-radius analysis
471
+ │ ├── onboard.md Deep codebase analysis → MAP/GRAPH/PATTERNS/DEPENDENCIES/HOTSPOTS with risk scores
472
+ │ ├── modify.md Transitive impact chain + risk scoring + test coverage map + surgical change
473
473
  │ ├── refactor.md Quality improvement without behavior change
474
474
  │ ├── audit.md OWASP Top 10 AI-powered scan
475
475
  │ ├── debug.md Root-cause analysis for failing tests or broken behavior
@@ -536,10 +536,14 @@ their-project/
536
536
  │ with sources, trust scores, and the synthesized recommendation.
537
537
 
538
538
  ├── codebase/ Generated by /buildflow-onboard (existing projects only).
539
- │ ├── MAP.md Architecture overview, folder structure, entry points
540
- │ ├── PATTERNS.md Code conventions: naming, imports, error handling, testing
541
- ├── DEPENDENCIES.md Top dependencies with purpose and security status
542
- └── HOTSPOTS.md High-complexity files to handle carefully
539
+ │ ├── MAP.md Architecture overview, module boundaries, load-bearing files
540
+ │ ├── GRAPH.md Import dependency graph: fan-in/fan-out per file. Used by
541
+ │ /buildflow-modify for transitive impact analysis.
542
+ ├── PATTERNS.md Code conventions: naming, imports, error handling, testing.
543
+ │ │ Used by Builders as the "closest example" source.
544
+ │ ├── DEPENDENCIES.md Top dependencies with purpose, criticality, security status
545
+ │ └── HOTSPOTS.md Files with risk scores ≥ 3.5 — high fan-in, low test coverage,
546
+ │ large size. Surgeon always checks this before modifying.
543
547
 
544
548
  ├── phases/ One subfolder per phase (01/, 02/, etc.)
545
549
  │ └── 01/
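The "transitive impact analysis" that `/buildflow-modify` runs over `GRAPH.md` can be pictured as a reverse traversal of the import graph. The sketch below is illustrative only: the on-disk format of `GRAPH.md` is not shown in this diff, so the graph is assumed to be already parsed into a dict mapping each file to the files it imports, and `transitive_dependents` is a hypothetical helper name.

```python
from collections import deque


def transitive_dependents(imports: dict[str, list[str]], changed: str) -> list[str]:
    """Return every file that directly or indirectly imports `changed`."""
    # Invert the edges: for each file, record who imports it (its fan-in).
    dependents: dict[str, set[str]] = {}
    for module, deps in imports.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    # Breadth-first walk up the inverted graph, starting at the changed file.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)

    seen.discard(changed)
    return sorted(seen)


# Assumed example graph: file -> files it imports.
graph = {
    "routes.py": ["auth.py"],
    "auth.py": ["db.py"],
    "cli.py": ["db.py"],
    "db.py": [],
}
print(transitive_dependents(graph, "db.py"))
```

A file with many entries in this result has a wide blast radius, which is the kind of signal the HOTSPOTS description above attributes to high fan-in.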
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "buildflow-dev",
3
- "version": "4.0.1",
3
+ "version": "4.0.2",
4
4
  "description": "Spec-driven, multi-agent AI development orchestration with automatic token pruning. Works with Claude Code, Gemini CLI, Codex CLI, Cursor, and more.",
5
5
  "keywords": [
6
6
  "ai",
@@ -1,70 +1,202 @@
1
1
  ---
2
2
  name: buildflow-build
3
- description: Execute the spec-traced plan wave-by-wave with auto-test and auto-fix per wave
3
+ description: Spec-traced wave execution with pattern-matched Builders, auto-test, auto-fix, and PR-ready commits
4
4
  allowed-tools: Read, Write, Bash, Grep, Glob
5
5
  agents: builder, reviewer
6
6
  ---
7
7
 
8
8
  # /buildflow-build
9
9
 
10
- Execute the current phase plan. Spawns parallel Builder agents per wave. Each wave auto-tests and auto-fixes until green. The next wave does not start until the current wave fully passes.
11
-
12
- Every task is traced to an acceptance criterion. Builders reference specs — not opinions.
10
+ Execute the current phase plan. Each Builder receives a precise context packet — task spec, AC refs, before/after contract, and the closest existing example to follow. Every wave auto-tests, auto-fixes until green, and produces a PR-ready commit. The next wave never starts until the current wave is fully passing.
13
11
 
14
12
  ## Usage
15
- - `/buildflow-build` — execute all waves in the current plan
13
+ - `/buildflow-build` — execute all waves
16
14
  - `/buildflow-build wave-2` — execute a specific wave
17
- - `/buildflow-build <task>` — build and test a single task
15
+ - `/buildflow-build <task>` — build a single task
18
16
 
19
17
  ## Context Packet for this command (load only these)
20
18
  - `.buildflow/phases/[N]/PLAN.md`
21
- - `.buildflow/memory/light.md` (app_name, framework, style_fingerprint fields only)
22
19
  - `.buildflow/codebase/PATTERNS.md` (if exists)
23
- - Do NOT load: full codebase, specs, research, retros, old phases
20
+ - `.buildflow/memory/light.md` (app_name, framework, style_fingerprint only)
21
+
22
+ Do NOT load: full specs, full codebase, research, retros, old phases.
23
+
24
+ ---
24
25
 
25
- ## Step 1: Load Plan
26
+ ## Step 1: Load & Confirm Plan
26
27
  Read `.buildflow/phases/[N]/PLAN.md`.
27
- Confirm: "Phase [N] — [N] waves, [N] tasks, [N] ACs covered. Starting Wave 1."
28
+ Report: "Phase [N] — [N] waves, [N] tasks, [N] ACs. Est: [total]. Starting Wave [N]."
29
+
30
+ Check the external dependency checklist if present. If any items are unchecked, report: "Verify these before building: [list]"
31
+
32
+ ---
33
+
34
+ ## Step 2: Detect Test Framework (runs once before any wave)
35
+
36
+ Before writing a single test line, identify what testing infrastructure exists.
37
+
38
+ ### Detection checklist:
39
+
40
+ **JavaScript / TypeScript:**
41
+ ```bash
42
+ # Check package.json for test deps
43
+ cat package.json | grep -E "jest|vitest|mocha|jasmine|@testing-library|supertest|cypress|playwright"
44
+ # Check for config files
45
+ ls jest.config.* vitest.config.* .mocharc.* 2>/dev/null
46
+ # Check for existing test files
47
+ find . -name "*.test.ts" -o -name "*.test.js" -o -name "*.spec.ts" -o -name "*.spec.js" | head -5
48
+ find . -type d -name "__tests__" | head -3
49
+ ```
50
+
51
+ **Python:**
52
+ ```bash
53
+ cat requirements.txt pyproject.toml setup.cfg 2>/dev/null | grep -E "pytest|unittest|nose"
54
+ find . -name "test_*.py" -o -name "*_test.py" | head -5
55
+ ```
56
+
57
+ **Go:**
58
+ ```bash
59
+ find . -name "*_test.go" | head -5
60
+ ```
61
+
62
+ **Rust:**
63
+ ```bash
64
+ # Recursive search; does not depend on the shell's globstar setting
+ grep -rnE --include="*.rs" '#\[(test|cfg\(test\))\]' src | head -5
65
+ ```
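The JavaScript / TypeScript dependency check above can also be expressed as a small script. This is a minimal sketch, not how buildflow itself performs detection: `detect_js_test_frameworks` and `KNOWN_FRAMEWORKS` are hypothetical names, and only the `dependencies` and `devDependencies` sections of a parsed `package.json` are inspected.

```python
import json

# Hypothetical list mirroring the grep pattern in the checklist above.
KNOWN_FRAMEWORKS = (
    "jest", "vitest", "mocha", "jasmine",
    "@testing-library", "supertest", "cypress", "playwright",
)


def detect_js_test_frameworks(package_json_text: str) -> list[str]:
    """Return the known test-related names found among declared dependencies."""
    manifest = json.loads(package_json_text)
    declared: dict[str, str] = {}
    for section in ("dependencies", "devDependencies"):
        declared.update(manifest.get(section, {}))

    found = []
    for name in KNOWN_FRAMEWORKS:
        # Substring match, like grep: "@testing-library/react" counts.
        if any(name in dep for dep in declared):
            found.append(name)
    return found


sample = '{"devDependencies": {"vitest": "^1.0.0", "@testing-library/react": "^14.0.0"}}'
print(detect_js_test_frameworks(sample))  # ['vitest', '@testing-library']
```

Like the grep it mirrors, the substring match is deliberately loose, so a package such as `ts-jest` would also count as evidence of Jest.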
66
+
67
+ ### Framework Resolution:
68
+
69
+ | Result | Action |
70
+ |--------|--------|
71
+ | Framework found + config exists + test files exist | Use it. Infer conventions from existing test files. |
72
+ | Framework in package.json but no test files yet | Use it. Write tests following framework docs conventions. |
73
+ | No framework found, greenfield project | Ask: "No test framework detected. Recommend installing [Jest/Vitest for TS, pytest for Python, built-in for Go/Rust]. Set it up now? (yes / skip / I'll do it later)" |
74
+ | No framework, existing project with no tests | Warn: "⚠ No test framework found. Tests cannot be written until one is installed. Proceeding without tests — recommend adding [framework] before shipping." Log to `security/DEBT.md`: "No test framework — zero coverage." |
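The resolution table reads as a four-way decision. The sketch below encodes it under simplifying assumptions: the three boolean inputs and the returned action labels are hypothetical, and the "config exists" condition from the first row is folded into `framework_found`.

```python
def resolve_framework_action(
    framework_found: bool, has_test_files: bool, greenfield: bool
) -> str:
    """Map detection results to one of the four actions in the table above."""
    if framework_found and has_test_files:
        # Row 1: use it and infer conventions from existing tests.
        return "use-and-infer-conventions"
    if framework_found:
        # Row 2: use it, following the framework's documented conventions.
        return "use-with-framework-docs-conventions"
    if greenfield:
        # Row 3: ask the user whether to install a recommended framework.
        return "ask-to-install"
    # Row 4: existing project without tests; warn and log to security/DEBT.md.
    return "warn-and-log-debt"


print(resolve_framework_action(framework_found=False, has_test_files=False, greenfield=True))
```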
75
+
76
+ ### If framework found — capture test profile:
77
+ ```
78
+ Test Framework Profile
79
+ ──────────────────────
80
+ Framework: Jest 29 / Vitest 1.x / pytest 7.x / go test / cargo test
81
+ Config file: jest.config.ts / vitest.config.ts / pytest.ini / N/A
82
+ Test location: co-located (*.test.ts) / __tests__/ / tests/
83
+ Naming: describe/it / test() / def test_ / #[test]
84
+ Mocking: jest.mock / vi.mock / pytest fixtures / mockall
85
+ Coverage tool: --coverage / --cov / go test -cover / cargo tarpaulin
86
+ Existing tests: [N] files, [N] total cases
87
+ ```
88
+
89
+ This profile is passed to every Builder as part of their context packet.
90
+
91
+ ---
92
+
93
+ ## Step 3: Establish Style Fingerprint
94
+ If `PATTERNS.md` exists: extract the 5 most important conventions and hold them in scope.
95
+ If not: read 2 existing source files and infer:
96
+ - Naming convention
97
+ - Import order
98
+ - Error handling pattern
99
+ - Async style
100
+ - Test naming pattern (from test profile above)
28
101
 
29
- ## Step 2: Style Fingerprint
30
- Before writing any code:
31
- - Naming conventions (camelCase, PascalCase, snake_case)
32
- - Import organization pattern
33
- - Error handling style
34
- - Test file location and naming
35
- - Comment style
102
+ This fingerprint applies to every Builder in every wave.
36
103
 
37
104
  ---
38
105
 
39
106
  ## Step 3: Wave Execution Loop
40
107
 
41
- Repeat this block for each wave:
108
+ Repeat for each wave:
109
+
110
+ ### 3a — Build Context Packets
111
+ For each task in this wave, assemble a minimal context packet:
42
112
 
43
- ### 3a — Prepare Builder Context Packets
44
- For each task in this wave, prepare a minimal context packet:
45
113
  ```
46
- Task spec (from PLAN.md)
47
- AC refs: [which ACs this task satisfies]
48
- Relevant files: [max 5 files this task touches — not full codebase]
49
- Style rules: [3-5 key conventions from PATTERNS.md]
114
+ Task: [name]
115
+ Goal: [one sentence — what this task makes true]
116
+ AC refs: [AC-001, AC-003]
117
+ Before: [what currently exists: "file doesn't exist" or "function X does Y"]
118
+ After: [what must be true when this task is done]
119
+
120
+ Files to create/modify: [explicit list — max 5]
121
+ Closest existing example: [path/to/similar/file.ts — "follow this structure"]
122
+ Key pattern to follow: [specific convention from PATTERNS.md]
123
+ Definition of done: [linked ACs that must pass]
50
124
  ```
51
- Builders receive ONLY this packet — not full project state.
52
- This is what keeps token usage low and context clean.
53
125
 
54
- ### 3b Build (parallel)
55
- Spawn Builder agents in parallel, one per task.
126
+ The "closest existing example" is the most important field. Builders replicate proven patterns; they don't invent new ones unless the task explicitly requires it. Find the nearest analog in the codebase.
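One way to picture the packet is as a small validated record. This is a sketch under stated assumptions, not buildflow's data model: the `ContextPacket` class and its field names are hypothetical renderings of the template fields, the five-file cap comes from the template, and the requirement of at least one AC reference is an assumption based on tasks being AC-traced.

```python
from dataclasses import dataclass

MAX_FILES = 5  # "max 5" from the packet template


@dataclass
class ContextPacket:
    task: str
    goal: str
    ac_refs: list[str]
    before: str
    after: str
    files: list[str]
    closest_example: str
    key_pattern: str

    def __post_init__(self) -> None:
        # Keep the packet minimal: reject oversized file lists up front.
        if len(self.files) > MAX_FILES:
            raise ValueError(f"packet lists {len(self.files)} files; max is {MAX_FILES}")
        # Assumption: every task must trace to at least one acceptance criterion.
        if not self.ac_refs:
            raise ValueError("a task must reference at least one AC")


packet = ContextPacket(
    task="Add login route",
    goal="Valid credentials return a token",
    ac_refs=["AC-001", "AC-003"],
    before="file doesn't exist",
    after="POST /login returns a token for valid credentials",
    files=["src/auth/login.ts"],
    closest_example="path/to/similar/file.ts",
    key_pattern="error handling pattern from PATTERNS.md",
)
print(packet.task, packet.ac_refs)
```

Validating at construction time means an oversized or untraced task fails before a Builder is spawned rather than partway through a wave.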
127
+
128
+ ### 3b — Parallel Build
129
+ Spawn one Builder per task. Each Builder receives ONLY its context packet.
130
+
56
131
  Each Builder:
57
- - Receives its context packet only
58
- - Writes code that satisfies the referenced ACs
59
- - Adds LEARN: comment for non-obvious patterns
60
- - Reports back: files created/modified, AC coverage confirmed
61
-
62
- ### 3c — Review
63
- Reviewer checks each output:
64
- - Does it satisfy the referenced ACs?
65
- - Does it match PATTERNS.md style?
132
+ - Writes code that satisfies the Before → After contract
133
+ - Follows the closest existing example's structure
134
+ - Covers the referenced ACs
135
+ - **Writes tests as part of the same task — not after, not later, not optional**
136
+ - Adds `LEARN:` comment only for patterns not present elsewhere in the codebase
137
+
138
+ #### Mandatory Test Writing Rules (enforced per Builder)
139
+
140
+ **Prerequisite:** Test Framework Profile from Step 2 must exist. If no framework was found and user chose to skip, mark this task's test output as SKIPPED and log to `security/DEBT.md`.
141
+
142
+ **For every new source file created:**
143
+ - Create a corresponding test file using the detected framework and location convention:
144
+ - Jest/Vitest co-located: `auth.service.ts` → `auth.service.test.ts`
145
+ - `__tests__` folder: `src/auth/auth.service.ts` → `src/auth/__tests__/auth.service.test.ts`
146
+ - pytest: `src/auth/service.py` → `tests/auth/test_service.py`
147
+ - Go: `auth/service.go` → `auth/service_test.go` (same package)
148
+ - Rust: add `#[cfg(test)] mod tests { }` block inside same file
149
+ - Test file must cover: each exported function/method, each AC referenced by this task
150
+ - Minimum: 1 happy path + 1 error/edge case per exported function
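The location conventions listed above are mechanical enough to compute. The helper below is a hedged sketch: `expected_test_path` and its convention keys are hypothetical, the pytest branch assumes sources live under a top-level `src/` directory as in the example, and Rust is omitted because its tests live inside the source file.

```python
from pathlib import PurePosixPath


def expected_test_path(source: str, convention: str) -> str:
    """Return where the test file for `source` should live under a convention."""
    src = PurePosixPath(source)
    if convention == "co-located":
        # auth.service.ts -> auth.service.test.ts
        return str(src.with_name(f"{src.stem}.test{src.suffix}"))
    if convention == "__tests__":
        # src/auth/auth.service.ts -> src/auth/__tests__/auth.service.test.ts
        return str(src.parent / "__tests__" / f"{src.stem}.test{src.suffix}")
    if convention == "pytest":
        # src/auth/service.py -> tests/auth/test_service.py
        # relative_to raises ValueError if the file is not under src/.
        relative = src.relative_to("src")
        return str(PurePosixPath("tests") / relative.parent / f"test_{src.name}")
    if convention == "go":
        # auth/service.go -> auth/service_test.go (same package)
        return str(src.with_name(f"{src.stem}_test{src.suffix}"))
    raise ValueError(f"unknown convention: {convention}")


print(expected_test_path("src/auth/service.py", "pytest"))  # tests/auth/test_service.py
```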
151
+
152
+ **For every modified source file:**
153
+ - Locate the existing test file using the detected convention
154
+ - Add new test cases for every function whose behavior changed
155
+ - Update existing test cases if the function's contract or signature changed
156
+ - Do NOT delete passing test cases unless the behavior they test was explicitly removed
157
+
158
+ **Test structure — follow detected framework exactly:**
159
+
160
+ Jest / Vitest:
161
+ ```typescript
162
+ describe('AuthService', () => {
163
+ describe('login', () => {
164
+ it('returns token when credentials are valid', async () => { ... })
165
+ it('throws UnauthorizedError when password is wrong', async () => { ... })
166
+ })
167
+ })
168
+ ```
169
+ pytest:
170
+ ```python
171
+ def test_login_returns_token_with_valid_credentials(): ...
172
+ def test_login_raises_unauthorized_with_wrong_password(): ...
173
+ ```
174
+ Go:
175
+ ```go
176
+ func TestLogin_ReturnsToken_WithValidCredentials(t *testing.T) { ... }
177
+ func TestLogin_ReturnsError_WithWrongPassword(t *testing.T) { ... }
178
+ ```
179
+
180
+ Builder reports back:
181
+ ```
182
+ Task: [name] — COMPLETE
183
+ Files created: [list]
184
+ Files modified: [list]
185
+ Test files written/updated: [list with case count]
186
+ auth.service.test.ts — 6 cases (4 new, 2 updated)
187
+ ACs addressed: [AC-001 ✓, AC-003 ✓]
188
+ Pattern followed: [example file used]
189
+ ```
190
+
191
+ ### 3c — Reviewer Check
192
+ Reviewer reads each Builder's output:
193
+ - Does the implementation satisfy the referenced ACs?
194
+ - Does it match the style fingerprint and closest example?
195
+ - Are tests present for non-trivial logic?
66
196
  - Any security concerns?
67
- - Tests written for new logic?
197
+ - Did the Builder follow the Before → After contract?
198
+
199
+ Flag any deviation from existing patterns — Builders should blend in, not stand out.
68
200
 
69
201
  ### 3d — Test + Fix Loop
70
202
  Run the full test suite:
@@ -75,42 +207,70 @@ go test ./... # Go
75
207
  cargo test # Rust
76
208
  ```
77
209
 
78
- If frontend code changed: verify dev server renders without errors, core UI flow works.
210
+ If frontend changed: verify dev server renders without errors, core flow works.
211
+ Check: no regressions in previously passing tests.
79
212
 
80
- **If tests fail:**
81
- 1. Identify root cause (error, file, line → why)
82
- 2. Apply minimal fix (change only what broke)
83
- 3. Re-run full test suite
84
- 4. Repeat until green
213
+ **On test failure:**
214
+ 1. Read the exact error: file, line, message
215
+ 2. Trace root cause (not just symptom)
216
+ 3. Apply minimal fix
217
+ 4. Re-run tests
218
+ 5. Repeat until green
85
219
 
86
- Max 5 fix attempts per wave.
87
- If still failing after 5: stop, report unresolved failures, ask user how to proceed.
220
+ Max 5 fix attempts. After 5: stop, report what's unresolved, ask how to proceed.
88
221
 
89
- Fix log:
222
+ Fix log per attempt:
90
223
  ```
91
- Wave [N] Fix [X]/5: [error] → [root cause] → [fix applied] → [result]
224
+ Fix [X]/5 Wave [N]
225
+ Error: [message at file:line]
226
+ Root cause: [why it's failing]
227
+ Fix: [exactly what changed]
228
+ Result: PASS / still failing
229
+ ```
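The bounded test-and-fix loop can be sketched as follows. This is an illustration of the control flow only: `run_fix_loop` is a hypothetical name, and `run_tests` and `apply_fix` are stand-in callables for whatever the agent actually does to run the suite and patch the code.

```python
from typing import Callable

MAX_FIX_ATTEMPTS = 5  # "Max 5 fix attempts" from the step above


def run_fix_loop(
    run_tests: Callable[[], bool], apply_fix: Callable[[int], None]
) -> tuple[bool, int]:
    """Return (suite_is_green, number_of_fixes_applied)."""
    if run_tests():
        return True, 0
    for attempt in range(1, MAX_FIX_ATTEMPTS + 1):
        apply_fix(attempt)  # minimal fix for the current failure
        if run_tests():     # re-run the suite after every fix
            return True, attempt
    # Still failing after the cap: stop and hand control back to the user.
    return False, MAX_FIX_ATTEMPTS


# Simulated suite that goes green after two fixes.
state = {"fixes": 0}


def fake_run_tests() -> bool:
    return state["fixes"] >= 2


def fake_apply_fix(attempt: int) -> None:
    state["fixes"] += 1


print(run_fix_loop(fake_run_tests, fake_apply_fix))  # (True, 2)
```

Returning the attempt count alongside the result gives the caller what it needs to write the per-attempt fix log and to report unresolved failures when the cap is reached.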
230
+
231
+ ### 3e — Wave Commit
232
+ When all tests pass, commit this wave atomically:
233
+ ```bash
234
+ git add [changed files — explicit list, not -A]
235
+ git commit -m "[type](scope): [what changed]
236
+
237
+ [Body: why this change, which ACs it satisfies]
238
+ [AC refs: AC-001, AC-003]
239
+ [Wave: N of M]"
92
240
  ```
93
241
 
94
- ### 3e Wave Complete
95
- Only after all tests pass:
96
- - Mark wave as complete in `phases/[N]/PLAN.md`
97
- - Continue to next wave
242
+ Commit types: `feat` / `fix` / `test` / `refactor` / `chore`
243
+
244
+ Example:
245
+ ```
246
+ feat(auth): add JWT middleware and login route
247
+
248
+ Implements token validation for all protected routes.
249
+ Satisfies: AC-001 (valid login), AC-002 (invalid password rejection), AC-003 (expired token)
250
+ Wave: 2 of 4
251
+ ```
252
+
253
+ Mark wave complete in `phases/[N]/PLAN.md`. Proceed to next wave.
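The commit template can be assembled programmatically. The function below is a sketch with assumed names (`build_wave_commit_message` and its parameters are hypothetical); it follows the template's `AC refs:` and `Wave: N of M` lines and rejects commit types outside the listed set.

```python
COMMIT_TYPES = {"feat", "fix", "test", "refactor", "chore"}


def build_wave_commit_message(
    commit_type: str,
    scope: str,
    summary: str,
    body: str,
    ac_refs: list[str],
    wave: int,
    total_waves: int,
) -> str:
    """Format a wave commit message following the template above."""
    if commit_type not in COMMIT_TYPES:
        raise ValueError(f"unsupported commit type: {commit_type}")
    lines = [
        f"{commit_type}({scope}): {summary}",
        "",  # blank line separates subject from body
        body,
        f"AC refs: {', '.join(ac_refs)}",
        f"Wave: {wave} of {total_waves}",
    ]
    return "\n".join(lines)


message = build_wave_commit_message(
    "feat", "auth", "add JWT middleware and login route",
    "Implements token validation for all protected routes.",
    ["AC-001", "AC-002", "AC-003"], 2, 4,
)
print(message)
```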
98
254
 
99
255
  ---
100
256
 
101
- ## Step 4: Integration Check
102
- After all waves pass:
103
- - Run full test suite one final time
104
- - Verify pieces connect correctly across wave boundaries
105
- - Check for import errors or missing dependencies
257
+ ## Step 4: Final Integration Check
258
+ After all waves:
259
+ - Run full test suite one last time
260
+ - Verify all AC-referenced behaviors work end-to-end
261
+ - Check imports across wave boundaries (no dangling references)
106
262
 
107
- ## Step 5: Update Memory (minimal — prune stale fields)
263
+ ---
264
+
265
+ ## Step 5: Update Memory (lean — prune old build fields)
108
266
  ```yaml
109
267
  last_build_date: [today]
110
- current_phase: [N]
111
268
  plan_status: built
112
269
  test_status: passing
270
+ waves_completed: [N]
113
271
  ```
114
- Remove from light.md: any per-wave task details from previous builds (keep it lean).
272
+ Remove from `light.md`: per-task details from previous builds.
273
+
274
+ ---
115
275
 
116
- ## Token Budget: ~50K per wave (build + context packets + test-fix loop)
276
+ ## Token Budget: ~50K per wave (context packets keep individual Builder costs low)
@@ -55,16 +55,41 @@ Make the minimal change:
55
55
  - Do not refactor, rename, or clean up surrounding code
56
56
  - Match existing code style
57
57
 
58
+ ## Step 4b: Write Regression Test (always — even in hotfix mode)
59
+
60
+ ### First: check if a test framework exists
61
+ ```bash
62
+ grep -E "jest|vitest|mocha" package.json 2>/dev/null
63
+ find . -name "*.test.ts" -o -name "*.test.js" -o -name "test_*.py" | head -3
64
+ ```
65
+
66
+ - **Framework found:** write the regression test using it
67
+ - **No framework found:** warn — "No test framework detected. Regression test skipped. This bug may recur." Log to `security/DEBT.md`: "Hotfix [description] shipped without regression test — no framework available."
68
+ - Do not block the hotfix for a missing framework, but always log the gap.
69
+
70
+ For the specific behavior being fixed:
71
+ 1. Write a test that reproduces the bug before applying the fix
72
+ 2. Run it — confirm it fails
73
+ 3. Apply the fix (Step 4)
74
+ 4. Run it again — confirm it passes
75
+
76
+ Name it after the exact bug:
77
+ ```
78
+ it('should not crash when user has no profile photo')
79
+ it('should return 401 when session token is expired')
80
+ ```
81
+
82
+ If a test file already exists for the changed file: add the case there.
83
+ If not: create a minimal test file covering this function only. When a framework is available, do not skip this step: a hotfix without a regression test is likely to regress again.
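As an illustration of a regression test named after the exact bug, here is a pytest-style sketch. Everything in it is hypothetical: `avatar_url`, `DEFAULT_AVATAR`, and the user record shape are invented to mirror the "no profile photo" example above, and the function shown is the already-fixed version.

```python
DEFAULT_AVATAR = "/static/default-avatar.png"  # hypothetical fallback path


def avatar_url(user: dict) -> str:
    """Fixed version: tolerate users that have no profile photo."""
    photo = user.get("profile_photo")
    if not photo:
        # Before the fix, indexing a missing photo raised an exception here.
        return DEFAULT_AVATAR
    return photo["url"]


def test_avatar_url_does_not_crash_when_user_has_no_profile_photo():
    # Reproduces the reported bug: a user record without a photo.
    assert avatar_url({"name": "Ada"}) == DEFAULT_AVATAR


def test_avatar_url_returns_photo_url_when_present():
    # Guards the existing behavior so the fix does not change it.
    user = {"name": "Ada", "profile_photo": {"url": "/photos/ada.png"}}
    assert avatar_url(user) == "/photos/ada.png"
```

Written against the unfixed function, the first test would fail, which is the confirmation the numbered steps above ask for before the fix is applied.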
84
+
58
85
  ## Step 5: Test
59
- Run the test suite:
86
+ Run the full test suite:
60
87
  ```bash
61
88
  npm test # or pytest / go test etc.
62
89
  ```
63
90
 
64
91
  If tests fail: fix and re-test. Max 3 attempts before stopping and asking the user.
65
92
 
66
- If no tests exist for the changed area: flag it after shipping.
67
-
68
93
  ## Step 6: Ship
69
94
  ```bash
70
95
  git add [changed files only]