npm - buildflow-dev - Versions diffs - 4.0.1 → 4.0.2 - Mend

buildflow-dev 4.0.1 → 4.0.2


Files changed (9)

package/README.md +17 -13
package/package.json +1 -1
package/templates/commands/build.md +221 -61
package/templates/commands/hotfix.md +28 -3
package/templates/commands/modify.md +212 -38
package/templates/commands/onboard.md +246 -52
package/templates/commands/plan.md +182 -56
package/templates/commands/spec.md +178 -74
package/templates/commands/think.md +186 -38

package/README.md CHANGED

@@ -95,10 +95,10 @@ These are installed into your AI tool and triggered by typing `/` (or `@` / `$`
 | Command | Agent | Purpose | Token Cost |
 |---------|-------|---------|-----------|
 | `/buildflow-start` | Strategist | Begin project: vision questions, pruning of stale context, saves to `core/vision.md` | ~8K |
-| `/buildflow-think [topic]` | Researcher × 3 + Synthesizer | Parallel web research on a topic, synthesized into a recommendation | ~30K |
-| `/buildflow-spec` | Strategist | **NEW** — Generate formal PRD + Technical Design + Acceptance Criteria. Required before planning | ~18K |
-| `/buildflow-plan [phase]` | Architect | Reads specs, maps tasks to ACs, groups into dependency waves, checks full AC coverage | ~20K |
-| `/buildflow-build [wave]` | Builder × N + Reviewer | Execute waves with context-isolated Builders — each wave auto-tests, auto-fixes, only advances when green | ~50K/wave |
+| `/buildflow-think [topic]` | Researcher × 3 + Synthesizer | Research + `--arch` (architecture review) + `--build-vs-buy` + `--debt` + `--complexity` modes | ~30K |
+| `/buildflow-spec` | Strategist | Generate user-story-backed PRD + TDD + ACs with Spec Critic self-review pass. Required before planning | ~20K |
+| `/buildflow-plan [phase]` | Architect | AC-traced tasks, HARD/SOFT/EXTERNAL dependency reasoning, effort estimates, risk sequencing, Engineering Review | ~22K |
+| `/buildflow-build [wave]` | Builder × N + Reviewer | Context packets with closest-example + before/after contracts. Auto-test, auto-fix, PR-ready commits per wave | ~50K/wave |
 | `/buildflow-test [wave]` | Reviewer | Standalone test + fix loop — re-verify a wave or test a manual change | ~25K |
 | `/buildflow-check` | Reviewer × 4 | Spec compliance + correctness + quality + security in parallel | ~22K |
 | `/buildflow-ship` | Strategist + Security Auditor | Spec gate + security gate + context pruning + git tag | ~22K |
@@ -107,8 +107,8 @@ These are installed into your AI tool and triggered by typing `/` (or `@` / `$`
 | Command | Agent | Purpose | Token Cost |
 |---------|-------|---------|-----------|
-| `/buildflow-onboard` | Cartographer | One-time analysis: writes `MAP.md`, `PATTERNS.md`, `DEPENDENCIES.md`, `HOTSPOTS.md` | ~35K |
-| `/buildflow-modify "description"` | Surgeon | Surgical change with blast-radius analysis and restore point — use for features **and bugfixes** | ~30K |
+| `/buildflow-onboard` | Cartographer | Deep analysis: import graph, module boundaries, load-bearing files, risk scores → MAP/GRAPH/PATTERNS/DEPENDENCIES/HOTSPOTS | ~40K |
+| `/buildflow-modify "description"` | Surgeon | Full transitive impact chain + risk scores + test coverage map + API contract check + surgical change | ~30K |
 | `/buildflow-refactor [scope]` | Surgeon + Reviewer | Improve code quality without changing behavior | ~40K |
 **`/buildflow-modify` works for both features and bugs.** Pass a plain-English description either way:
@@ -460,7 +460,7 @@ buildflow-dev/
 │       │                     followed by numbered steps the agent follows.
 │       │
 │       ├── start.md          Vision gathering, mode detection, light.md pruning on session start
-│       ├── think.md          Parallel research with up to 3 Researcher agents
+│       ├── think.md          Parallel research + architecture review + build-vs-buy + debt + complexity modes
 │       ├── spec.md           Generate PRD + TDD + Acceptance Criteria (required before plan)
 │       ├── plan.md           AC-traced dependency mapping → wave-based execution plan
 │       ├── build.md          Wave-by-wave parallel Builder execution
@@ -468,8 +468,8 @@ buildflow-dev/
 │       ├── check.md          3-reviewer parallel quality check
 │       ├── ship.md           Spec gate + security gate + context pruning → retro → git tag
 │       ├── hotfix.md         Fast-path fix — no spec, no plan, restore point → fix → test → commit
-│       ├── onboard.md        One-time codebase analysis → MAP/PATTERNS/DEPENDENCIES/HOTSPOTS
-│       ├── modify.md         Surgical code change with blast-radius analysis
+│       ├── onboard.md        Deep codebase analysis → MAP/GRAPH/PATTERNS/DEPENDENCIES/HOTSPOTS with risk scores
+│       ├── modify.md         Transitive impact chain + risk scoring + test coverage map + surgical change
 │       ├── refactor.md       Quality improvement without behavior change
 │       ├── audit.md          OWASP Top 10 AI-powered scan
 │       ├── debug.md          Root-cause analysis for failing tests or broken behavior
@@ -536,10 +536,14 @@ their-project/
     │                         with sources, trust scores, and the synthesized recommendation.
     │
     ├── codebase/             Generated by /buildflow-onboard (existing projects only).
-    │   ├── MAP.md            Architecture overview, folder structure, entry points
-    │   ├── PATTERNS.md       Code conventions: naming, imports, error handling, testing
-    │   ├── DEPENDENCIES.md   Top dependencies with purpose and security status
-    │   └── HOTSPOTS.md       High-complexity files to handle carefully
+    │   ├── MAP.md            Architecture overview, module boundaries, load-bearing files
+    │   ├── GRAPH.md          Import dependency graph — fan-in/fan-out per file. Used by
+    │   │                     /buildflow-modify for transitive impact analysis.
+    │   ├── PATTERNS.md       Code conventions: naming, imports, error handling, testing.
+    │   │                     Used by Builders as the "closest example" source.
+    │   ├── DEPENDENCIES.md   Top dependencies with purpose, criticality, security status
+    │   └── HOTSPOTS.md       Files with risk scores ≥ 3.5 — high fan-in, low test coverage,
+    │                         large size. Surgeon always checks this before modifying.
     │
     ├── phases/               One subfolder per phase (01/, 02/, etc.)
     │   └── 01/

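The README changes above describe `GRAPH.md` as recording fan-in/fan-out per file and feeding transitive impact analysis in `/buildflow-modify`. The diff does not show how either is computed, so the following is only a minimal sketch of the general idea. The `imports` map, the file names in it, and both function names are hypothetical, and the risk-score formula behind the `≥ 3.5` threshold is not reproduced because the diff does not state it.

```python
from collections import defaultdict, deque

# Hypothetical import map: each file -> files it imports (illustrative only).
imports = {
    "routes/login.ts": ["auth/service.ts", "db/client.ts"],
    "routes/profile.ts": ["auth/service.ts"],
    "auth/service.ts": ["db/client.ts"],
    "db/client.ts": [],
}


def fan_counts(graph):
    """Return {file: (fan_in, fan_out)} for every file in the graph."""
    fan_in = defaultdict(int)
    for targets in graph.values():
        for dst in targets:
            fan_in[dst] += 1
    files = set(graph) | set(fan_in)
    return {f: (fan_in[f], len(graph.get(f, []))) for f in files}


def transitive_dependents(graph, changed):
    """Return every file that directly or indirectly imports `changed`."""
    # Reverse the edges so we can walk from a file to its importers.
    reverse = defaultdict(set)
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].add(src)
    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for importer in reverse[current]:
            if importer not in seen:
                seen.add(importer)
                queue.append(importer)
    return seen
```

In this toy graph, `db/client.ts` has the highest blast radius: changing it reaches every other file, which is the kind of "load-bearing" signal the new `MAP.md` and `HOTSPOTS.md` descriptions refer to.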
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "buildflow-dev",
-  "version": "4.0.1",
+  "version": "4.0.2",
   "description": "Spec-driven, multi-agent AI development orchestration with automatic token pruning. Works with Claude Code, Gemini CLI, Codex CLI, Cursor, and more.",
   "keywords": [
     "ai",

package/templates/commands/build.md CHANGED

@@ -1,70 +1,202 @@
 ---
 name: buildflow-build
-description: Execute the spec-traced plan wave-by-wave with auto-test and auto-fix per wave
+description: Spec-traced wave execution with pattern-matched Builders, auto-test, auto-fix, and PR-ready commits
 allowed-tools: Read, Write, Bash, Grep, Glob
 agents: builder, reviewer
 ---
 # /buildflow-build
-Execute the current phase plan. Spawns parallel Builder agents per wave. Each wave auto-tests and auto-fixes until green. The next wave does not start until the current wave fully passes.
-Every task is traced to an acceptance criterion. Builders reference specs — not opinions.
+Execute the current phase plan. Each Builder receives a precise context packet — task spec, AC refs, before/after contract, and the closest existing example to follow. Every wave auto-tests, auto-fixes until green, and produces a PR-ready commit. The next wave never starts until the current wave is fully passing.
 ## Usage
-- `/buildflow-build` — execute all waves in the current plan
+- `/buildflow-build` — execute all waves
 - `/buildflow-build wave-2` — execute a specific wave
-- `/buildflow-build <task>` — build and test a single task
+- `/buildflow-build <task>` — build a single task
 ## Context Packet for this command (load only these)
 - `.buildflow/phases/[N]/PLAN.md`
-- `.buildflow/memory/light.md` (app_name, framework, style_fingerprint fields only)
 - `.buildflow/codebase/PATTERNS.md` (if exists)
-- Do NOT load: full codebase, specs, research, retros, old phases
+- `.buildflow/memory/light.md` (app_name, framework, style_fingerprint only)
+Do NOT load: full specs, full codebase, research, retros, old phases.
+---
-## Step 1: Load Plan
+## Step 1: Load & Confirm Plan
 Read `.buildflow/phases/[N]/PLAN.md`.
-Confirm: "Phase [N] — [N] waves, [N] tasks, [N] ACs covered. Starting Wave 1."
+Report: "Phase [N] — [N] waves, [N] tasks, [N] ACs. Est: [total]. Starting Wave [N]."
+Check external dependency checklist if present. If unchecked items: "Verify these before building: [list]"
+---
+## Step 2: Detect Test Framework (runs once before any wave)
+Before writing a single test line, identify what testing infrastructure exists.
+### Detection checklist:
+**JavaScript / TypeScript:**
+```bash
+# Check package.json for test deps
+grep -E "jest|vitest|mocha|jasmine|@testing-library|supertest|cypress|playwright" package.json
+# Check for config files
+ls jest.config.* vitest.config.* .mocharc.* 2>/dev/null
+# Check for existing test files (node_modules is pruned so dependency tests are not counted)
+find . -path ./node_modules -prune -o \( -name "*.test.ts" -o -name "*.test.js" -o -name "*.spec.ts" -o -name "*.spec.js" \) -print | head -5
+find . -path ./node_modules -prune -o -type d -name "__tests__" -print | head -3
+```
+**Python:**
+```bash
+cat requirements.txt pyproject.toml setup.cfg 2>/dev/null | grep -E "pytest|unittest|nose"
+find . -name "test_*.py" -o -name "*_test.py" | head -5
+```
+**Go:**
+```bash
+find . -name "*_test.go" | head -5
+```
+**Rust:**
+```bash
+grep -rnE --include="*.rs" "#\[(test|cfg\(test\))\]" src | head -5
+```
+### Framework Resolution:
+| Result | Action |
+|--------|--------|
+| Framework found + config exists + test files exist | Use it. Infer conventions from existing test files. |
+| Framework in package.json but no test files yet | Use it. Write tests following framework docs conventions. |
+| No framework found, greenfield project | Ask: "No test framework detected. Recommend installing [Jest/Vitest for TS, pytest for Python, built-in for Go/Rust]. Set it up now? (yes / skip / I'll do it later)" |
+| No framework, existing project with no tests | Warn: "⚠ No test framework found. Tests cannot be written until one is installed. Proceeding without tests — recommend adding [framework] before shipping." Log to `security/DEBT.md`: "No test framework — zero coverage." |
+### If framework found — capture test profile:
+```
+Test Framework Profile
+──────────────────────
+Framework:     Jest 29 / Vitest 1.x / pytest 7.x / go test / cargo test
+Config file:   jest.config.ts / vitest.config.ts / pytest.ini / N/A
+Test location: co-located (*.test.ts) / __tests__/ / tests/
+Naming:        describe/it / test() / def test_ / #[test]
+Mocking:       jest.mock / vi.mock / pytest fixtures / mockall
+Coverage tool: --coverage / --cov / go test -cover / cargo tarpaulin
+Existing tests: [N] files, [N] total cases
+```
+This profile is passed to every Builder as part of their context packet.
+---
+## Step 3: Establish Style Fingerprint
+If `PATTERNS.md` exists: extract the 5 most important conventions and hold them in scope.
+If not: read 2 existing source files and infer:
+- Naming convention
+- Import order
+- Error handling pattern
+- Async style
+- Test naming pattern (from test profile above)
-## Step 2: Style Fingerprint
-Before writing any code:
-- Naming conventions (camelCase, PascalCase, snake_case)
-- Import organization pattern
-- Error handling style
-- Test file location and naming
-- Comment style
+This fingerprint applies to every Builder in every wave.
 ---
 ## Step 3: Wave Execution Loop
-Repeat this block for each wave:
+Repeat for each wave:
+### 3a — Build Context Packets
+For each task in this wave, assemble a minimal context packet:
-### 3a — Prepare Builder Context Packets
-For each task in this wave, prepare a minimal context packet:
 ```
-Task spec (from PLAN.md)
-AC refs: [which ACs this task satisfies]
-Relevant files: [max 5 files this task touches — not full codebase]
-Style rules: [3-5 key conventions from PATTERNS.md]
+Task: [name]
+Goal: [one sentence — what this task makes true]
+AC refs: [AC-001, AC-003]
+Before: [what currently exists — "file doesn't exist" or "function X does Y"]
+After:  [what must be true when this task is done]
+Files to create/modify: [explicit list — max 5]
+Closest existing example: [path/to/similar/file.ts — "follow this structure"]
+Key pattern to follow: [specific convention from PATTERNS.md]
+Definition of done: [linked ACs that must pass]
 ```
-Builders receive ONLY this packet — not full project state.
-This is what keeps token usage low and context clean.
-### 3b — Build (parallel)
-Spawn Builder agents in parallel, one per task.
+The "closest existing example" is the most important field. Builders replicate proven patterns — they don't invent new ones unless the task explicitly requires it. Find the nearest analog in the codebase.
+### 3b — Parallel Build
+Spawn one Builder per task. Each Builder receives ONLY its context packet.
 Each Builder:
-- Receives its context packet only
-- Writes code that satisfies the referenced ACs
-- Adds LEARN: comment for non-obvious patterns
-- Reports back: files created/modified, AC coverage confirmed
-### 3c — Review
-Reviewer checks each output:
-- Does it satisfy the referenced ACs?
-- Does it match PATTERNS.md style?
+- Writes code that satisfies the Before → After contract
+- Follows the closest existing example's structure
+- Covers the referenced ACs
+- **Writes tests as part of the same task — not after, not later, not optional**
+- Adds `LEARN:` comment only for patterns not present elsewhere in the codebase
+#### Mandatory Test Writing Rules (enforced per Builder)
+**Prerequisite:** Test Framework Profile from Step 2 must exist. If no framework was found and user chose to skip, mark this task's test output as SKIPPED and log to `security/DEBT.md`.
+**For every new source file created:**
+- Create a corresponding test file using the detected framework and location convention:
+  - Jest/Vitest co-located: `auth.service.ts` → `auth.service.test.ts`
+  - `__tests__` folder: `src/auth/auth.service.ts` → `src/auth/__tests__/auth.service.test.ts`
+  - pytest: `src/auth/service.py` → `tests/auth/test_service.py`
+  - Go: `auth/service.go` → `auth/service_test.go` (same package)
+  - Rust: add `#[cfg(test)] mod tests { }` block inside same file
+- Test file must cover: each exported function/method, each AC referenced by this task
+- Minimum: 1 happy path + 1 error/edge case per exported function
+**For every modified source file:**
+- Locate the existing test file using the detected convention
+- Add new test cases for every function whose behavior changed
+- Update existing test cases if the function's contract or signature changed
+- Do NOT delete passing test cases unless the behavior they test was explicitly removed
+**Test structure — follow detected framework exactly:**
+Jest / Vitest:
+```typescript
+describe('AuthService', () => {
+  describe('login', () => {
+    it('returns token when credentials are valid', async () => { ... })
+    it('throws UnauthorizedError when password is wrong', async () => { ... })
+  })
+})
+```
+pytest:
+```python
+def test_login_returns_token_with_valid_credentials():  ...
+def test_login_raises_unauthorized_with_wrong_password(): ...
+```
+Go:
+```go
+func TestLogin_ReturnsToken_WithValidCredentials(t *testing.T) { ... }
+func TestLogin_ReturnsError_WithWrongPassword(t *testing.T) { ... }
+```
+Builder reports back:
+```
+Task: [name] — COMPLETE
+Files created:  [list]
+Files modified: [list]
+Test files written/updated: [list with case count]
+  auth.service.test.ts — 6 cases (4 new, 2 updated)
+ACs addressed: [AC-001 ✓, AC-003 ✓]
+Pattern followed: [example file used]
+```
+### 3c — Reviewer Check
+Reviewer reads each Builder's output:
+- Does the implementation satisfy the referenced ACs?
+- Does it match the style fingerprint and closest example?
+- Are tests present for non-trivial logic?
 - Any security concerns?
-- Tests written for new logic?
+- Did the Builder follow the Before → After contract?
+Flag any deviation from existing patterns — Builders should blend in, not stand out.
 ### 3d — Test + Fix Loop
 Run the full test suite:
@@ -75,42 +207,70 @@ go test ./...   # Go
 cargo test      # Rust
 ```
-If frontend code changed: verify dev server renders without errors, core UI flow works.
+If frontend changed: verify dev server renders without errors, core flow works.
+Check: no regressions in previously passing tests.
-**If tests fail:**
-1. Identify root cause (error → file → line → why)
-2. Apply minimal fix (change only what broke)
-3. Re-run full test suite
-4. Repeat until green
+**On test failure:**
+1. Read the exact error — file, line, message
+2. Trace root cause (not just symptom)
+3. Apply minimal fix
+4. Re-run tests
+5. Repeat until green
-Max 5 fix attempts per wave.
-If still failing after 5: stop, report unresolved failures, ask user how to proceed.
+Max 5 fix attempts. After 5: stop, report what's unresolved, ask how to proceed.
-Fix log:
+Fix log per attempt:
 ```
-Wave [N] Fix [X]/5: [error] → [root cause] → [fix applied] → [result]
+Fix [X]/5  Wave [N]
+Error:      [message at file:line]
+Root cause: [why it's failing]
+Fix:        [exactly what changed]
+Result:     PASS / still failing
+```
+### 3e — Wave Commit
+When all tests pass, commit this wave atomically:
+```bash
+git add [changed files — explicit list, not -A]
+git commit -m "[type](scope): [what changed]
+[Body: why this change, which ACs it satisfies]
+[AC refs: AC-001, AC-003]
+[Wave: N of M]"
 ```
-### 3e — Wave Complete
-Only after all tests pass:
-- Mark wave as complete in `phases/[N]/PLAN.md`
-- Continue to next wave
+Commit types: `feat` / `fix` / `test` / `refactor` / `chore`
+Example:
+```
+feat(auth): add JWT middleware and login route
+Implements token validation for all protected routes.
+Satisfies: AC-001 (valid login), AC-002 (invalid password rejection), AC-003 (expired token)
+Wave: 2 of 4
+```
+Mark wave complete in `phases/[N]/PLAN.md`. Proceed to next wave.
 ---
-## Step 4: Integration Check
-After all waves pass:
-- Run full test suite one final time
-- Verify pieces connect correctly across wave boundaries
-- Check for import errors or missing dependencies
+## Step 4: Final Integration Check
+After all waves:
+- Run full test suite one last time
+- Verify all AC-referenced behaviors work end-to-end
+- Check imports across wave boundaries (no dangling references)
-## Step 5: Update Memory (minimal — prune stale fields)
+---
+## Step 5: Update Memory (lean — prune old build fields)
 ```yaml
 last_build_date: [today]
-current_phase: [N]
 plan_status: built
 test_status: passing
+waves_completed: [N]
 ```
-Remove from light.md: any per-wave task details from previous builds (keep it lean).
+Remove from `light.md`: per-task details from previous builds.
+---
-## Token Budget: ~50K per wave (build + context packets + test-fix loop)
+## Token Budget: ~50K per wave (context packets keep individual Builder costs low)
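
The "Mandatory Test Writing Rules" in this diff list one source-to-test path mapping per detected convention. As a rough illustration of those mappings only, here is a small Python sketch. The function name `expected_test_path` and the convention labels are hypothetical, and the pytest branch assumes sources live under a top-level `src/` directory mirrored by `tests/`, as in the diff's own example.

```python
from pathlib import PurePosixPath


def expected_test_path(source, convention):
    """Map a source file to its test file under one of the listed conventions."""
    src = PurePosixPath(source)
    if convention == "co-located":
        # auth.service.ts -> auth.service.test.ts
        return str(src.with_name(f"{src.stem}.test{src.suffix}"))
    if convention == "__tests__":
        # src/auth/auth.service.ts -> src/auth/__tests__/auth.service.test.ts
        return str(src.parent / "__tests__" / f"{src.stem}.test{src.suffix}")
    if convention == "pytest":
        # src/auth/service.py -> tests/auth/test_service.py (assumes a src/ root)
        relative = src.relative_to("src")
        return str(PurePosixPath("tests") / relative.parent / f"test_{src.name}")
    if convention == "go":
        # auth/service.go -> auth/service_test.go (same package directory)
        return str(src.with_name(f"{src.stem}_test{src.suffix}"))
    raise ValueError(f"unknown convention: {convention}")
```

Rust is left out on purpose: the template asks for a `#[cfg(test)] mod tests { }` block inside the same file, so there is no separate path to compute.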

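The revised step 3d in build.md caps the test-and-fix loop at five attempts and asks for a structured log entry per attempt. The control flow can be sketched as below. This is a sketch under stated assumptions, not the package's implementation: `run_tests` and `apply_fix` are hypothetical callables standing in for whatever the agent actually does, and the log fields loosely mirror the "Fix log per attempt" template.

```python
from dataclasses import dataclass, field

MAX_FIX_ATTEMPTS = 5  # limit stated in the build.md template


@dataclass
class FixLoopResult:
    passed: bool
    attempts: int
    log: list = field(default_factory=list)


def run_fix_loop(run_tests, apply_fix, max_attempts=MAX_FIX_ATTEMPTS):
    """Run tests; on failure apply a fix and re-run, up to max_attempts fixes.

    run_tests() returns an error message, or None when the suite is green.
    apply_fix(error) returns a short description of what was changed.
    Both callables are hypothetical stand-ins for the agent's real actions.
    """
    log = []
    attempts = 0
    error = run_tests()
    while error is not None and attempts < max_attempts:
        attempts += 1
        failing = error
        fix = apply_fix(failing)
        error = run_tests()
        log.append({
            "attempt": f"{attempts}/{max_attempts}",
            "error": failing,
            "fix": fix,
            "result": "PASS" if error is None else "still failing",
        })
    # passed is False when the cap was hit with the suite still red;
    # the template says to stop and ask the user at that point.
    return FixLoopResult(passed=error is None, attempts=attempts, log=log)
```

A caller would treat `passed=False` as the signal to report the unresolved failures instead of committing the wave.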
package/templates/commands/hotfix.md CHANGED

@@ -55,16 +55,41 @@ Make the minimal change:
 - Do not refactor, rename, or clean up surrounding code
 - Match existing code style
+## Step 4b: Write Regression Test (always — even in hotfix mode)
+### First: check if a test framework exists
+```bash
+grep -E "jest|vitest|mocha" package.json 2>/dev/null
+find . -name "*.test.ts" -o -name "*.test.js" -o -name "test_*.py" | head -3
+```
+- **Framework found:** write the regression test using it
+- **No framework found:** warn — "No test framework detected. Regression test skipped. This bug may recur." Log to `security/DEBT.md`: "Hotfix [description] shipped without regression test — no framework available."
+- Do not block the hotfix for a missing framework, but always log the gap.
+For the specific behavior being fixed:
+1. Write a test that reproduces the bug before applying the fix
+2. Run it — confirm it fails
+3. Apply the fix (Step 4)
+4. Run it again — confirm it passes
+Name it after the exact bug:
+```
+it('should not crash when user has no profile photo')
+it('should return 401 when session token is expired')
+```
+If a test file already exists for the changed file: add the case there.
+If not: create a minimal test file covering this function only. Do not skip — a hotfix without a regression test will regress again.
 ## Step 5: Test
-Run the test suite:
+Run the full test suite:
 ```bash
 npm test        # or pytest / go test etc.
 ```
 If tests fail: fix and re-test. Max 3 attempts before stopping and asking the user.
-If no tests exist for the changed area: flag it after shipping.
 ## Step 6: Ship
 ```bash
 git add [changed files only]