buildflow-dev 1.0.6 → 1.0.7

package/README.md CHANGED
@@ -15,6 +15,7 @@
15
15
  - [Supported AI Tools](#supported-ai-tools)
16
16
  - [AI Slash Commands](#ai-slash-commands)
17
17
  - [CLI Commands](#cli-commands)
18
+ - [Example: Full Greenfield Flow](#example-full-greenfield-flow-phases--waves)
18
19
  - [How It Works](#how-it-works)
19
20
  - [Package Source Structure](#package-source-structure)
20
21
  - [The .buildflow/ Scaffold](#the-buildflow-scaffold)
@@ -91,7 +92,8 @@ These are installed into your AI tool and triggered by typing `/` (or `@` / `$`
91
92
  | `/buildflow-start` | Strategist | Begin project: asks vision questions, detects mode, saves to `core/vision.md` | ~8K |
92
93
  | `/buildflow-think [topic]` | Researcher × 3 + Synthesizer | Parallel web research on a topic, synthesized into a recommendation | ~30K |
93
94
  | `/buildflow-plan [phase]` | Architect | Maps task dependencies, groups into parallel waves, writes `phases/N/PLAN.md` | ~20K |
94
- | `/buildflow-build [wave]` | Builder × N + Reviewer | Executes the plan wave-by-wave with parallel Builders, style-matched to your codebase | ~50K/wave |
95
+ | `/buildflow-build [wave]` | Builder × N + Reviewer | Executes the plan wave-by-wave; each wave auto-tests, auto-fixes failures, and only advances when fully green | ~50K/wave |
96
+ | `/buildflow-test [wave]` | Reviewer | Standalone test + fix loop — re-verify a wave or test a manual change outside of `/buildflow-build` | ~25K |
95
97
  | `/buildflow-check` | Reviewer × 3 | Three parallel reviewers check correctness, quality, and security | ~20K |
96
98
  | `/buildflow-ship` | Strategist + Security Auditor | Pre-ship security gate → retrospective → git tag | ~22K |
97
99
 
@@ -100,9 +102,37 @@ These are installed into your AI tool and triggered by typing `/` (or `@` / `$`
100
102
  | Command | Agent | Purpose | Token Cost |
101
103
  |---------|-------|---------|-----------|
102
104
  | `/buildflow-onboard` | Cartographer | One-time analysis: writes `MAP.md`, `PATTERNS.md`, `DEPENDENCIES.md`, `HOTSPOTS.md` | ~35K |
103
- | `/buildflow-modify "description"` | Surgeon | Surgical change with blast-radius analysis and restore point | ~30K |
105
+ | `/buildflow-modify "description"` | Surgeon | Surgical change with blast-radius analysis and restore point — use for features **and bugfixes** | ~30K |
104
106
  | `/buildflow-refactor [scope]` | Surgeon + Reviewer | Improve code quality without changing behavior | ~40K |
105
107
 
108
+ **`/buildflow-modify` works for both features and bugs.** Pass a plain-English description either way:
109
+
110
+ ```
111
+ # Feature
112
+ /buildflow-modify "Add pagination to the GET /users endpoint"
113
+
114
+ # Bugfix
115
+ /buildflow-modify "Fix null pointer crash when user has no profile photo"
116
+ /buildflow-modify "Fix login redirect loop when session expires"
117
+ ```
118
+
119
+ The Surgeon always runs a blast-radius analysis first (what files are affected, what calls them) and creates a git restore point before touching anything — making it especially safe for bugfixes where a wrong change can cause regressions.
120
+
121
+ If you're not sure where the bug is yet, use `/buildflow-help` first — it's a diagnostic mode that helps you locate the problem before you try to fix it.
122
+
123
+ | Situation | Command |
124
+ |-----------|---------|
125
+ | Know what needs to change | `/buildflow-modify "fix description"` |
126
+ | Don't know where the bug is | `/buildflow-help` first, then `/buildflow-modify` |
127
+ | Tests failing after a change | `/buildflow-debug` |
128
+
129
+ ### Debugging & Deployment
130
+
131
+ | Command | Agent | Purpose | Token Cost |
132
+ |---------|-------|---------|-----------|
133
+ | `/buildflow-debug ["error"]` | Surgeon | Root-cause analysis for failing tests or broken behavior — traces error to source, applies minimal fix | ~20K |
134
+ | `/buildflow-deploy [env]` | Strategist | Pre-flight checks then deploy to staging or production | ~15K |
135
+
106
136
  ### Security
107
137
 
108
138
  | Command | Agent | Purpose | Token Cost |
@@ -149,6 +179,126 @@ buildflow update --check # Check current version without updating
149
179
 
150
180
  ---
151
181
 
182
+ ## Example: Full Greenfield Flow (Phases & Waves)
183
+
184
+ Here's what a complete new project looks like end-to-end, showing how phases and waves are **auto-generated** by BuildFlow — you never define them manually.
185
+
186
+ ### 1. Init and start
187
+
188
+ ```bash
189
+ mkdir my-app && cd my-app
190
+ npx buildflow-dev init
191
+ ```
192
+
193
+ ```
194
+ /buildflow-start
195
+ ```
196
+ > Strategist asks 4–5 questions. Writes answers to `.buildflow/core/vision.md`.
197
+
198
+ ---
199
+
200
+ ### 2. Research (optional)
201
+
202
+ ```
203
+ /buildflow-think auth-strategy
204
+ ```
205
+ > 3 Researcher agents run in parallel. Synthesizer combines results.
206
+ > Output → `.buildflow/research/auth-strategy.md`
207
+
208
+ ---
209
+
210
+ ### 3. Plan — Architect auto-generates phases and waves
211
+
212
+ ```
213
+ /buildflow-plan
214
+ ```
215
+
216
+ The Architect reads `vision.md` and produces `.buildflow/phases/01/PLAN.md`:
217
+
218
+ ```
219
+ Phase 1 — Foundation
220
+
221
+ Wave 1 (parallel — no dependencies):
222
+ • Create database schema
223
+ • Create project config files
224
+ • Set up folder structure
225
+
226
+ Wave 2 (depends on Wave 1):
227
+ • Create data models
228
+ • Create auth middleware
229
+
230
+ Wave 3 (depends on Wave 2):
231
+ • Create API routes
232
+ • Create service layer
233
+
234
+ Wave 4 (depends on Wave 3):
235
+ • Create UI components
236
+ • Write integration tests
237
+ ```
238
+
239
+ You didn't write any of this — the Architect derived it from your vision.
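
To make the wave idea concrete: a wave is a set of tasks whose dependencies all sit in earlier waves. The sketch below shows that grouping rule in plain JavaScript. It is an illustration only. The Architect is an AI agent following `plan.md`, and nothing here claims to be how it actually derives the plan. The function name `groupIntoWaves`, the task names, and the dependency edges are all assumptions made for the example.

```javascript
// Illustrative only: groups tasks into waves so that every task's
// dependencies land in an earlier wave. The input shape
// (task name -> array of dependency names) is an assumption for this sketch.
function groupIntoWaves(tasks) {
  const remaining = new Map(Object.entries(tasks));
  const done = new Set();
  const waves = [];

  while (remaining.size > 0) {
    // A task is ready once all of its dependencies are in earlier waves.
    const ready = [...remaining.keys()].filter((name) =>
      remaining.get(name).every((dep) => done.has(dep))
    );
    if (ready.length === 0) {
      throw new Error("Dependency cycle or unknown dependency; cannot form waves");
    }
    for (const name of ready) {
      remaining.delete(name);
      done.add(name);
    }
    waves.push(ready);
  }
  return waves;
}

// Hypothetical dependency edges, loosely modelled on the plan above.
const waves = groupIntoWaves({
  "database schema": [],
  "project config": [],
  "data models": ["database schema"],
  "auth middleware": ["project config"],
  "API routes": ["data models", "auth middleware"],
});
console.log(waves);
```

Tasks inside one wave have no dependencies on each other, which is why they can be handed to parallel Builders.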
240
+
241
+ ---
242
+
243
+ ### 4. Build — testing is automatic inside every wave
244
+
245
+ ```
246
+ /buildflow-build
247
+ ```
248
+
249
+ Testing is **built into every wave** — you don't run `/buildflow-test` manually. For each wave, the cycle is:
250
+
251
+ ```
252
+ Build wave tasks (parallel Builders)
253
+
254
+ Review output (Reviewer)
255
+
256
+ Run tests automatically
257
+
258
+ ┌─ Tests pass? ──────────────────────── Move to next wave
259
+ └─ Tests fail? → Fix → Re-test → loop until green (max 5 attempts)
260
+ ```
261
+
262
+ So `Wave 1` is fully green before `Wave 2` starts. `Wave 2` is fully green before `Wave 3` starts. And so on.
263
+
264
+ If a wave can't be fixed within 5 attempts, the build stops and reports exactly what failed — then you can use `/buildflow-debug` for deeper investigation.
265
+
266
+ ```
267
+ /buildflow-debug "auth middleware not rejecting expired tokens"
268
+ ```
269
+
270
+ **`/buildflow-test` standalone** is available if you want to re-verify a wave you already built, or test after a manual code change outside of `/buildflow-build`.
271
+
272
+ ---
273
+
274
+ ### 5. Check, ship, and deploy
275
+
276
+ ```
277
+ /buildflow-check
278
+ ```
279
+ > 3 Reviewers in parallel: correctness / quality / security
280
+
281
+ ```
282
+ /buildflow-ship
283
+ ```
284
+ > Security gate → retrospective written to `phases/01/retro.md` → git tag
285
+
286
+ ```
287
+ /buildflow-deploy staging
288
+ ```
289
+ > Pre-flight checks → deploy to staging → smoke test
290
+
291
+ ```
292
+ /buildflow-deploy production
293
+ ```
294
+ > Stricter gate (all tests + audit must pass) → deploy to production
295
+
296
+ ---
297
+
298
+ **Key point:** `[phase]` and `[wave]` arguments are optional escape hatches for resuming or re-running specific parts. In a normal flow you just type `/buildflow-plan` and `/buildflow-build` with no arguments.
299
+
300
+ ---
301
+
152
302
  ## How It Works
153
303
 
154
304
  ### The install flow
@@ -266,7 +416,7 @@ buildflow-dev/
266
416
  │ │ all available /buildflow-* commands.
267
417
  │ │ {{APP_NAME}} is replaced with the detected project name.
268
418
  │ │
269
- │ └── commands/ 14 markdown files — one per slash command.
419
+ │ └── commands/ 17 markdown files — one per slash command.
270
420
  │ │ Each file is the full instruction set for that command.
271
421
  │ │ The AI reads and executes these when you trigger the command.
272
422
  │ │ Format: YAML frontmatter (name, description, agent, tools)
@@ -276,12 +426,15 @@ buildflow-dev/
276
426
  │ ├── think.md Parallel research with up to 3 Researcher agents
277
427
  │ ├── plan.md Dependency mapping → wave-based execution plan
278
428
  │ ├── build.md Wave-by-wave parallel Builder execution
429
+ │ ├── test.md Run tests + UI verification after each wave
279
430
  │ ├── check.md 3-reviewer parallel quality check
280
431
  │ ├── ship.md Pre-ship security gate → retro → git tag
281
432
  │ ├── onboard.md One-time codebase analysis → MAP/PATTERNS/DEPENDENCIES/HOTSPOTS
282
433
  │ ├── modify.md Surgical code change with blast-radius analysis
283
434
  │ ├── refactor.md Quality improvement without behavior change
284
435
  │ ├── audit.md OWASP Top 10 AI-powered scan
436
+ │ ├── debug.md Root-cause analysis for failing tests or broken behavior
437
+ │ ├── deploy.md Pre-flight checks → deploy to staging or production
285
438
  │ ├── status.md Current phase and recommended next action
286
439
  │ ├── explain.md Plain-language explanation of code, concepts, errors
287
440
  │ ├── back.md Undo to git restore point, update state
@@ -576,11 +729,23 @@ Everything else (`.claude/`, `node_modules/`, `.gitignore`, etc.) is excluded.
576
729
 
577
730
  ## Roadmap
578
731
 
732
+ ### New AI Tools
579
733
  - [ ] `buildflow install --tool windsurf` — Windsurf IDE support
580
734
  - [ ] `buildflow install --tool aider` — Aider CLI support
581
735
  - [ ] `buildflow install --tool zed` — Zed editor support
582
- - [ ] GitHub Actions workflow: `buildflow audit` in CI
736
+
737
+ ### New Slash Commands
738
+ - [ ] `/buildflow-perf` — performance profiling: detect slow queries, bundle size issues, render bottlenecks
739
+ - [ ] `/buildflow-docs` — auto-generate or update README, API docs, and inline comments from code
740
+ - [ ] `/buildflow-migrate` — guided database migration: generate migration files, verify rollback safety
741
+ - [ ] `/buildflow-seed` — generate realistic test data for the current schema
742
+
743
+ ### CLI Improvements
744
+ - [ ] `buildflow audit` in GitHub Actions — CI-friendly exit codes already work, needs workflow template
583
745
  - [ ] `buildflow fix --auto` — non-interactive mode for CI
746
+ - [ ] `buildflow test` — terminal wrapper that runs the project's test suite with BuildFlow context
747
+
748
+ ### Platform
584
749
  - [ ] Web dashboard for project status visualization
585
750
  - [ ] Custom agent creation: `buildflow agent create`
586
751
  - [ ] Team sync: shared `.buildflow/` across teammates
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "buildflow-dev",
3
- "version": "1.0.6",
3
+ "version": "1.0.7",
4
4
  "description": "Adaptive AI-powered development orchestration. Works with Claude Code, Gemini CLI, Codex CLI, Cursor, and more.",
5
5
  "keywords": [
6
6
  "ai",
@@ -620,8 +620,9 @@ function loadCommandTemplates() {
620
620
  const templatesDir = join(__dirname, '../../templates/commands')
621
621
  const commands = {}
622
622
  const commandNames = [
623
- 'start', 'think', 'plan', 'build', 'check', 'ship',
623
+ 'start', 'think', 'plan', 'build', 'test', 'check', 'ship',
624
624
  'onboard', 'modify', 'refactor', 'audit',
625
+ 'debug', 'deploy',
625
626
  'status', 'explain', 'back', 'help',
626
627
  ]
627
628
  for (const name of commandNames) {
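
A quick way to cross-check this hunk against the README change ("14 markdown files" to "17 markdown files") is to count the entries. The list below is copied from the 1.0.7 side of the hunk; the `templateFileFor` helper is hypothetical and only mirrors the `<name>.md` naming shown in the README's source tree.

```javascript
// Command list copied from the 1.0.7 side of the hunk above.
const commandNames = [
  'start', 'think', 'plan', 'build', 'test', 'check', 'ship',
  'onboard', 'modify', 'refactor', 'audit',
  'debug', 'deploy',
  'status', 'explain', 'back', 'help',
]

// Hypothetical helper: maps a command name to its template file name.
const templateFileFor = (name) => `${name}.md`

const files = commandNames.map(templateFileFor)
// Any name that appears more than once would show up here.
const duplicates = commandNames.filter((n, i) => commandNames.indexOf(n) !== i)

console.log(files.length, duplicates.length) // 17 0
```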
@@ -29,9 +29,13 @@ Type `/` in Claude Code to see available commands:
29
29
  - `/buildflow-think` — research and discuss
30
30
  - `/buildflow-plan` — create execution plan
31
31
  - `/buildflow-build` — implement the plan
32
- - `/buildflow-check` — verify quality
32
+ - `/buildflow-test` — run tests and verify UI/functionality after each wave
33
+ - `/buildflow-check` — verify quality with 3 parallel reviewers
34
+ - `/buildflow-debug` — root-cause analysis when tests fail or something breaks
33
35
  - `/buildflow-ship` — finalize with security gate
36
+ - `/buildflow-deploy` — pre-flight checks then deploy to staging or production
34
37
  - `/buildflow-audit` — run security scan
38
+ - `/buildflow-modify` — surgical change or bugfix to existing code
35
39
  - `/buildflow-status` — see where you are
36
40
  - `/buildflow-help` — get help or recover from issues
37
41
 
@@ -1,18 +1,18 @@
1
1
  ---
2
2
  name: buildflow-build
3
- description: Execute the plan with parallel Builder agents
4
- allowed-tools: Read, Write, Bash
3
+ description: Execute the plan with parallel Builder agents, auto-test and auto-fix each wave
4
+ allowed-tools: Read, Write, Bash, Grep, Glob
5
5
  agents: builder, reviewer
6
6
  ---
7
7
 
8
8
  # /buildflow-build
9
9
 
10
- Execute the current phase plan. Spawns parallel Builder agents per wave, then Reviewer checks quality.
10
+ Execute the current phase plan. Spawns parallel Builder agents per wave. After every wave, automatically runs tests and fixes failures — the next wave does not start until the current wave passes all tests.
11
11
 
12
12
  ## Usage
13
- - `/buildflow-build` — execute current phase plan
14
- - `/buildflow-build wave-2` — execute a specific wave
15
- - `/buildflow-build <task>` — build a single task
13
+ - `/buildflow-build` — execute current phase plan (all waves, auto-test each)
14
+ - `/buildflow-build wave-2` — execute and test a specific wave
15
+ - `/buildflow-build <task>` — build and test a single task
16
16
 
17
17
  ## Step 1: Load Plan
18
18
  Read `.buildflow/phases/[N]/PLAN.md`.
@@ -20,42 +20,86 @@ Load `.buildflow/memory/light.md` for style preferences.
20
20
  If existing project: load `.buildflow/codebase/PATTERNS.md`.
21
21
 
22
22
  ## Step 2: Style Fingerprint
23
- Before writing code, confirm:
23
+ Before writing any code, confirm:
24
24
  - Naming conventions (camelCase, PascalCase, snake_case)
25
25
  - Import organization
26
26
  - Error handling style
27
- - Comment style
28
27
  - Test file location and naming
29
28
 
30
- ## Step 3: Execute Wave 1
31
- Spawn Builder agents in parallel for Wave 1 tasks.
32
- Each Builder agent:
29
+ ## Step 3: Execute Wave
30
+
31
+ Repeat this block for each wave in the plan:
32
+
33
+ ### 3a — Build
34
+ Spawn Builder agents in parallel for all tasks in this wave.
35
+ Each Builder:
33
36
  - Gets the task spec and relevant context files
34
- - Writes code matching detected style
37
+ - Writes code matching the detected style
35
38
  - Adds LEARN: comments for non-obvious patterns
36
39
  - Reports back: files created/modified, decisions made
37
40
 
38
- ## Step 4: Review Wave 1
39
- Reviewer agent checks each output:
41
+ ### 3b — Review
42
+ Reviewer checks each output:
40
43
  - Does it meet the task spec?
41
- - Does it match the codebase style?
44
+ - Does it match codebase style?
42
45
  - Any security concerns?
43
- - Tests present if needed?
46
+ - Are tests written for new logic?
47
+
48
+ ### 3c — Test (automatic, runs after every wave)
49
+ Detect and run the test suite:
50
+ ```bash
51
+ npm test # Node / JS / TS projects
52
+ pytest # Python
53
+ go test ./... # Go
54
+ cargo test # Rust
55
+ # etc. based on detected framework
56
+ ```
57
+
58
+ Also check:
59
+ - If frontend code changed: start dev server and verify UI renders, flows work, no console errors
60
+ - No import errors, missing modules, or broken references
61
+ - All previously passing tests still pass (no regressions)
62
+
63
+ ### 3d — Fix loop (runs only if tests fail)
64
+ If any test fails:
65
+ 1. Identify root cause (trace error → file → line → why)
66
+ 2. Apply minimal fix — change only what broke, do not refactor surrounding code
67
+ 3. Re-run the full test suite
68
+ 4. Repeat until all tests pass
69
+
70
+ **Do not move to the next wave until this wave is fully green.**
71
+
72
+ Maximum fix attempts per wave: 5.
73
+ If still failing after 5 attempts: stop, report the unresolved failure, and ask the user how to proceed.
74
+
75
+ Fix attempt log format:
76
+ ```
77
+ Wave [N] — Fix attempt [X]/5
78
+ Error: [error message]
79
+ Root cause: [explanation]
80
+ Fix applied: [what changed]
81
+ Result: [pass / still failing]
82
+ ```
44
83
 
45
- ## Step 5: Continue Waves
46
- Repeat for Wave 2, Wave 3, etc.
47
- Each wave waits for the previous to complete and pass review.
84
+ ## Step 4: Wave Complete
85
+ Only after a wave is fully tested and passing:
86
+ - Log the wave as complete in `.buildflow/phases/[N]/PLAN.md`
87
+ - Continue to the next wave (back to Step 3)
48
88
 
49
- ## Step 6: Integration Check
50
- After all waves: verify the pieces connect correctly.
51
- Run existing tests if available.
89
+ ## Step 5: Integration Check
90
+ After all waves pass:
91
+ - Run the full test suite one final time
92
+ - Verify all pieces connect correctly end-to-end
93
+ - Check for any import/dependency issues across wave boundaries
52
94
 
53
- ## Step 7: Update Memory
95
+ ## Step 6: Update Memory
54
96
  ```yaml
55
97
  last_build_date: [today]
56
98
  phase: [N]
57
99
  tasks_completed: [list]
58
100
  files_changed: [list]
101
+ waves_completed: [N]
102
+ test_status: all passing
59
103
  ```
60
104
 
61
- ## Token Budget: ~50K per wave (parallel)
105
+ ## Token Budget: ~50K per wave (build + test + fix loop)
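
The control flow described in Steps 3 and 4 of this template (build, review, test, then a bounded fix loop before the next wave may start) can be sketched as ordinary code. This is a minimal sketch, not BuildFlow's implementation: the template is a set of instructions for an AI agent, and the `steps` callbacks (`build`, `review`, `runTests`, `applyFix`) are hypothetical stand-ins for those agent actions.

```javascript
// Sketch of the per-wave cycle. Only the limit of 5 fix attempts and the
// "no next wave until green" rule come from the template; the callback
// names and return shapes are assumptions.
const MAX_FIX_ATTEMPTS = 5;

function runWaves(waves, steps) {
  const completed = [];
  for (const wave of waves) {
    steps.build(wave);
    steps.review(wave);

    let result = steps.runTests(wave);
    let attempt = 0;
    while (!result.passed && attempt < MAX_FIX_ATTEMPTS) {
      attempt += 1;
      steps.applyFix(wave, result.error);
      result = steps.runTests(wave); // re-run the full suite after each fix
    }
    if (!result.passed) {
      // Stop here: later waves must not start on top of a failing wave.
      return { status: "stopped", failedWave: wave, completed, error: result.error };
    }
    completed.push(wave);
  }
  return { status: "all passing", completed };
}

// Usage with stub steps: wave 2 fails twice, then passes.
let failuresLeft = 2;
const outcome = runWaves([1, 2, 3], {
  build: () => {},
  review: () => {},
  runTests: (wave) =>
    wave === 2 && failuresLeft > 0
      ? { passed: false, error: "expired token accepted" }
      : { passed: true },
  applyFix: () => { failuresLeft -= 1; },
});
console.log(outcome.status, outcome.completed);
```

Returning early on an unresolved wave corresponds to the "stop, report the unresolved failure, and ask the user how to proceed" rule in Step 3d.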
@@ -0,0 +1,68 @@
1
+ ---
2
+ name: buildflow-debug
3
+ description: Systematic debugging when a test fails or something breaks
4
+ allowed-tools: Read, Write, Bash, Grep, Glob
5
+ agent: surgeon
6
+ ---
7
+
8
+ # /buildflow-debug
9
+
10
+ Systematic root-cause analysis for failing tests, broken builds, or unexpected behavior. The Surgeon reads the error, traces it to the source, and fixes it with minimal footprint.
11
+
12
+ ## Usage
13
+ - `/buildflow-debug` — debug the most recent failure
14
+ - `/buildflow-debug "error message or description"`
15
+ - `/buildflow-debug src/auth/login.ts` — debug a specific file
16
+ - `/buildflow-debug --trace` — full stack trace analysis
17
+
18
+ ## Step 1: Collect the Error
19
+ If a description was passed, use it.
20
+ Otherwise check for recent failure context:
21
+ - Last test run output
22
+ - Browser console errors
23
+ - Terminal error logs
24
+ - `.buildflow/phases/[N]/PLAN.md` for what was expected
25
+
26
+ ## Step 2: Reproduce the Failure
27
+ - Run the failing test or trigger the failing flow
28
+ - Confirm the error is reproducible before investigating
29
+ - Note: exact error message, file, line number, stack trace
30
+
31
+ ## Step 3: Trace to Root Cause
32
+ Work backwards from the symptom:
33
+ 1. What line threw the error?
34
+ 2. What called that line?
35
+ 3. What data was passed in?
36
+ 4. Where does that data come from?
37
+ 5. What assumption is violated?
38
+
39
+ Distinguish:
40
+ - **Symptom** — where the error surfaces
41
+ - **Root cause** — where the actual problem is
42
+
43
+ ## Step 4: Impact Check
44
+ Before fixing:
45
+ - How many places does this root cause affect?
46
+ - Is this a one-off bug or a systemic pattern?
47
+ - Will fixing this break anything else?
48
+
49
+ ## Step 5: Create Restore Point
50
+ ```bash
51
+ git stash # safe fallback before making changes
52
+ ```
53
+
54
+ ## Step 6: Apply Fix
55
+ - Fix only the root cause, not the symptom
56
+ - Minimum footprint — do not refactor surrounding code
57
+ - Match existing code style (PATTERNS.md)
58
+
59
+ ## Step 7: Verify Fix
60
+ - Re-run the failing test — confirm it passes
61
+ - Run full test suite — confirm no regressions
62
+ - If UI bug: verify the flow works end-to-end
63
+
64
+ ## Step 8: Prevent Recurrence
65
+ - Add a test that would have caught this bug
66
+ - Note the fix in `.buildflow/learnings/decisions.md` if it reveals a systemic issue
67
+
68
+ ## Token Budget: ~20K
@@ -0,0 +1,80 @@
1
+ ---
2
+ name: buildflow-deploy
3
+ description: Deploy to staging or production with pre-flight checks
4
+ allowed-tools: Read, Write, Bash, Grep, Glob
5
+ agent: strategist
6
+ ---
7
+
8
+ # /buildflow-deploy
9
+
10
+ Pre-flight checks and deployment orchestration. Ensures the build is safe to deploy before pushing to any environment.
11
+
12
+ ## Usage
13
+ - `/buildflow-deploy` — deploy to default environment
14
+ - `/buildflow-deploy staging` — deploy to staging
15
+ - `/buildflow-deploy production` — deploy to production (stricter gate)
16
+ - `/buildflow-deploy --dry-run` — show what would happen without deploying
17
+
18
+ ## Step 1: Load Context
19
+ Read `.buildflow/core/state.md` for current phase and status.
20
+ Read `.buildflow/memory/light.md` for project framework and deploy config.
21
+
22
+ ## Step 2: Pre-flight Gate
23
+
24
+ **Always required:**
25
+ - [ ] `/buildflow-test` passed (or confirm manually)
26
+ - [ ] `/buildflow-audit --pre-ship` passed (no critical secrets or vulnerabilities)
27
+ - [ ] No uncommitted changes (`git status` clean)
28
+ - [ ] On correct branch (not deploying directly from main unless intentional)
29
+
30
+ **Production only (additional):**
31
+ - [ ] `/buildflow-check` passed
32
+ - [ ] All tests passing including integration
33
+ - [ ] Environment variables verified for target environment
34
+ - [ ] Database migrations reviewed if schema changed
35
+
36
+ If any gate fails: stop and report what needs to be resolved.
37
+
38
+ ## Step 3: Detect Deploy Setup
39
+ Check for:
40
+ - `package.json` scripts: `deploy`, `deploy:staging`, `deploy:prod`
41
+ - Deployment config files: `vercel.json`, `netlify.toml`, `fly.toml`, `railway.json`, `Dockerfile`
42
+ - CI/CD config: `.github/workflows/`, `.gitlab-ci.yml`
43
+ - Cloud CLI tools: `vercel`, `netlify`, `flyctl`, `railway`, `heroku`
44
+
45
+ ## Step 4: Environment Confirmation
46
+ Show:
47
+ - Target environment (staging / production)
48
+ - Deploy method detected
49
+ - What will change (git diff summary)
50
+
51
+ Ask for explicit confirmation before proceeding, especially for production.
52
+
53
+ ## Step 5: Deploy
54
+ Run the detected deploy command or guide the user through manual steps if no automation is detected.
55
+
56
+ ```bash
57
+ # Examples depending on detected setup:
58
+ vercel --prod
59
+ netlify deploy --prod
60
+ flyctl deploy
61
+ railway up
62
+ ```
63
+
64
+ ## Step 6: Post-Deploy Verification
65
+ - Confirm deploy succeeded (exit code, deploy URL)
66
+ - Run a smoke test if possible (ping health endpoint, load the app URL)
67
+ - Check for errors in deploy logs
68
+
69
+ ## Step 7: Update State
70
+ ```yaml
71
+ last_deploy: [today]
72
+ environment: [staging/production]
73
+ deployed_phase: [N]
74
+ deploy_url: [url if available]
75
+ ```
76
+
77
+ ## --dry-run Flag
78
+ Shows the pre-flight checklist results and what deploy command would run — without deploying.
79
+
80
+ ## Token Budget: ~15K
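
The two-tier gate in Step 2 of this template (a base checklist for every deploy, plus extra checks for production) can be expressed as a small function. This is a sketch under stated assumptions: the gate keys paraphrase the checklist items, and the `results` object and `failedGates` function are hypothetical, not part of BuildFlow.

```javascript
// Gate keys paraphrase the Step 2 checklist; the `results` shape is assumed.
const ALWAYS_REQUIRED = ["testsPassed", "auditPassed", "gitClean", "correctBranch"];
// Note: the template only requires a migration review when the schema
// changed; this sketch treats it as always required, for brevity.
const PRODUCTION_ONLY = [
  "checkPassed",
  "integrationTestsPassed",
  "envVarsVerified",
  "migrationsReviewed",
];

function failedGates(environment, results) {
  const required =
    environment === "production"
      ? [...ALWAYS_REQUIRED, ...PRODUCTION_ONLY]
      : ALWAYS_REQUIRED;
  // A gate that is missing from `results` counts as failed.
  return required.filter((gate) => results[gate] !== true);
}

const results = { testsPassed: true, auditPassed: true, gitClean: true, correctBranch: true };
console.log(failedGates("staging", results)); // []
console.log(failedGates("production", results));
```

An empty array means the deploy may proceed to Step 3; anything else is the list to report back under the "If any gate fails" rule.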
@@ -0,0 +1,82 @@
1
+ ---
2
+ name: buildflow-test
3
+ description: Run tests, verify UI flow, and auto-fix failures until all pass
4
+ allowed-tools: Read, Write, Bash, Grep, Glob
5
+ agent: reviewer
6
+ ---
7
+
8
+ # /buildflow-test
9
+
10
+ Standalone test + fix loop. Runs the test suite, checks UI flow and functionality, and automatically fixes failures — repeats until everything passes or the fix limit is reached.
11
+
12
+ Use this when:
13
+ - You want to re-verify a wave that was already built
14
+ - You made a manual code change and want to test it
15
+ - `/buildflow-build` stopped and you want to resume testing from where it left off
16
+
17
+ For automated testing during builds, this loop is already built into `/buildflow-build` — you don't need to run `/buildflow-test` separately after each wave unless you want to re-check.
18
+
19
+ ## Usage
20
+ - `/buildflow-test` — test current wave/phase output
21
+ - `/buildflow-test wave-2` — test a specific wave
22
+ - `/buildflow-test ui` — focus on UI alignment and flow only
23
+ - `/buildflow-test --full` — run full suite including integration and e2e
24
+
25
+ ## Step 1: Load Context
26
+ Read `.buildflow/phases/[N]/PLAN.md` to know what this wave was supposed to deliver.
27
+ Read `.buildflow/memory/light.md` for framework and test setup.
28
+
29
+ ## Step 2: Detect Test Setup
30
+ Identify:
31
+ - Test framework (Jest, Vitest, Pytest, Go test, Cargo, etc.)
32
+ - Test command (`npm test`, `pytest`, `go test ./...`, etc.)
33
+ - E2E framework if present (Playwright, Cypress, etc.)
34
+ - Dev server command if UI is involved
35
+
36
+ ## Step 3: Run Tests
37
+ ```bash
38
+ npm test # or pytest / go test etc.
39
+ ```
40
+
41
+ Also check:
42
+ - If frontend code changed: start dev server, verify UI renders and flows work, no console errors
43
+ - No import errors or missing modules
44
+ - Previously passing tests still pass (no regressions)
45
+
46
+ ## Step 4: Fix Loop (runs automatically on failure)
47
+
48
+ If any test fails:
49
+ 1. Identify root cause (trace error → file → line → why)
50
+ 2. Apply minimal fix — only change what broke, do not refactor surrounding code
51
+ 3. Re-run the full test suite
52
+ 4. Repeat until all tests pass
53
+
54
+ Maximum fix attempts: 5.
55
+ If still failing after 5 attempts: stop, report what's unresolved, and ask the user how to proceed.
56
+
57
+ Fix attempt log format:
58
+ ```
59
+ Fix attempt [X]/5
60
+ Error: [error message]
61
+ Root cause: [explanation]
62
+ Fix applied: [what changed]
63
+ Result: [pass / still failing]
64
+ ```
65
+
66
+ ## Step 5: Report
67
+
68
+ ```
69
+ Test Results
70
+ ────────────
71
+ ✓ PASS Tests: 24/24 passing
72
+ ✓ PASS Functional: signup flow works end-to-end
73
+ ✓ PASS UI: form renders correctly, validation messages shown
74
+ ⚠ WARN No test for empty email edge case (non-blocking)
75
+ ```
76
+
77
+ ## Step 6: Decision
78
+ - All pass: "Ready to continue to next wave or /buildflow-ship."
79
+ - Warnings only: "Non-blocking. Proceed or address first — your call."
80
+ - Unresolved after 5 attempts: "Manual intervention needed. Use /buildflow-debug for deeper analysis."
81
+
82
+ ## Token Budget: ~25K (more if fix loop runs multiple iterations)
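
The three outcomes in Step 6 of this template, together with the 5-attempt limit from Step 4, amount to a small decision table. The sketch below spells it out. The field names (`failing`, `warnings`, `attemptsUsed`) and the returned labels are hypothetical; the template itself only defines the user-facing messages.

```javascript
// Hypothetical decision helper mirroring Steps 4 and 6 of the template.
const MAX_FIX_ATTEMPTS = 5;

function decide({ failing, warnings, attemptsUsed }) {
  if (failing > 0) {
    // Failures remain: keep fixing until the attempt limit is reached.
    return attemptsUsed >= MAX_FIX_ATTEMPTS ? "manual-intervention" : "keep-fixing";
  }
  // No failures: warnings are non-blocking, so the user chooses.
  return warnings > 0 ? "proceed-or-address" : "ready";
}

// The sample report in Step 5 (24/24 passing, one warning) maps to:
console.log(decide({ failing: 0, warnings: 1, attemptsUsed: 0 }));
```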