@shipfast-ai/shipfast 1.1.0 → 1.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (44)
  1. package/README.md +166 -201
  2. package/agents/architect.md +7 -7
  3. package/agents/builder.md +9 -10
  4. package/agents/critic.md +3 -3
  5. package/agents/scout.md +1 -1
  6. package/agents/scribe.md +9 -13
  7. package/bin/install.js +250 -9
  8. package/brain/index.cjs +38 -80
  9. package/brain/indexer.cjs +6 -9
  10. package/brain/schema.sql +4 -2
  11. package/commands/sf/brain.md +4 -0
  12. package/commands/sf/check-plan.md +3 -4
  13. package/commands/sf/config.md +1 -0
  14. package/commands/sf/cost.md +83 -0
  15. package/commands/sf/diff.md +53 -0
  16. package/commands/sf/discuss.md +115 -68
  17. package/commands/sf/do.md +140 -72
  18. package/commands/sf/help.md +10 -5
  19. package/commands/sf/map.md +16 -24
  20. package/commands/sf/plan.md +6 -9
  21. package/commands/sf/project.md +4 -4
  22. package/commands/sf/rollback.md +70 -0
  23. package/commands/sf/ship.md +13 -0
  24. package/commands/sf/status.md +1 -3
  25. package/commands/sf/verify.md +4 -9
  26. package/commands/sf/worktree.md +286 -0
  27. package/core/ambiguity.cjs +229 -125
  28. package/core/architecture.cjs +5 -8
  29. package/core/autopilot.cjs +1 -0
  30. package/core/budget.cjs +5 -11
  31. package/core/constants.cjs +63 -0
  32. package/core/context-builder.cjs +1 -58
  33. package/core/executor.cjs +18 -4
  34. package/core/guardrails.cjs +6 -5
  35. package/core/model-selector.cjs +5 -48
  36. package/core/retry.cjs +5 -1
  37. package/core/session.cjs +2 -2
  38. package/core/skip-logic.cjs +5 -1
  39. package/core/verify.cjs +11 -14
  40. package/hooks/sf-first-run.js +2 -2
  41. package/mcp/server.cjs +135 -4
  42. package/package.json +18 -4
  43. package/scripts/postinstall.js +1 -1
  44. package/commands/sf/workstream.md +0 -51
package/commands/sf/discuss.md CHANGED
@@ -1,121 +1,168 @@
  ---
  name: sf:discuss
- description: "Detect ambiguity and ask targeted questions before planning. Stores answers as locked decisions."
- argument-hint: "<task description>"
+ description: "Detect ambiguity and ask domain-specific questions before planning. Stores answers as locked decisions."
+ argument-hint: "<task description> [--batch] [--chain] [--assume]"
  allowed-tools:
  - Read
  - Bash
  - AskUserQuestion
+ - Skill
  ---
 
  <objective>
- Smart questioning system that detects ambiguity BEFORE planning.
+ Domain-aware questioning system that detects ambiguity BEFORE planning.
  Prevents wasting tokens on plans built from wrong assumptions.
 
- Only asks questions for detected ambiguity types:
- - WHERE: unclear which files/components to change
- - WHAT: unclear expected behavior
- - HOW: multiple valid approaches
- - RISK: touches sensitive areas (auth/payment/data)
- - SCOPE: request covers multiple features
+ Detects 6 domains automatically: UI, API, Database, Auth, Content, Infra.
+ Asks domain-specific questions (not generic ones).
+
+ Flags:
+ - `--batch` — Group all questions into 1-2 AskUserQuestion calls
+ - `--chain` — After discussion, auto-run /sf-plan → /sf-check-plan → ask to execute
+ - `--assume` — Auto-resolve using brain.db patterns (no questions)
  </objective>
 
  <process>
 
- ## Step 1: Detect Ambiguity (zero tokens — rule-based)
+ ## Step 1: Detect Domain + Ambiguity (zero tokens — rule-based)
+
+ **Auto-detect domain** from task keywords:
+ - **UI**: style, layout, component, page, form, button, modal, responsive, dark mode
+ - **API**: endpoint, route, handler, webhook, rest, graphql, middleware
+ - **Database**: migration, schema, model, table, orm, prisma, drizzle
+ - **Auth**: login, signup, password, permission, role, token, session, oauth, jwt
+ - **Content**: docs, blog, email, notification, i18n, template
+ - **Infra**: deploy, ci/cd, docker, k8s, monitoring, terraform
 
- Analyze the user's input for ambiguity patterns:
+ Then detect ambiguity types:
+ - **WHERE**: No file paths, component names, or locations mentioned
+ - **WHAT**: No specific behavior described, very short input
+ - **HOW**: Contains alternatives or describes a generic feature
+ - **RISK**: Mentions auth/payment/database/delete/production
+ - **SCOPE**: More than 30 words with 2+ conjunctions
 
- **WHERE** — No file paths, component names, or locations mentioned
- **WHAT** — No specific behavior/output described, request is very short
- **HOW** — Contains "or", "either", "maybe", or describes a generic feature (auth, cache, search)
- **RISK** — Mentions auth, payment, database, delete, production, deploy
- **SCOPE** — More than 30 words with 2+ conjunctions (and, also, plus)
+ Report domain detection: `Domain: [ui, auth] | Ambiguities: [HOW, WHERE, RISK]`
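The keyword rules above are cheap enough to run without any model call. A minimal sketch of the detector, with abbreviated keyword lists; function and constant names here are illustrative, not the package's actual API:

```javascript
// Rule-based domain + ambiguity detection (sketch; keyword lists abbreviated).
const DOMAIN_KEYWORDS = {
  ui: ["style", "layout", "component", "page", "form", "button", "modal"],
  api: ["endpoint", "route", "handler", "webhook", "graphql", "middleware"],
  auth: ["login", "signup", "password", "token", "session", "oauth", "jwt"],
};

function detect(task) {
  const text = task.toLowerCase();
  const domains = Object.keys(DOMAIN_KEYWORDS).filter((d) =>
    DOMAIN_KEYWORDS[d].some((kw) => text.includes(kw))
  );
  const words = text.split(/\s+/).filter(Boolean);
  const conjunctions = words.filter((w) => ["and", "also", "plus"].includes(w)).length;
  const ambiguities = [];
  if (!/[\/\.]/.test(task)) ambiguities.push("WHERE");   // no path-like tokens mentioned
  if (words.length < 5) ambiguities.push("WHAT");        // very short input
  if (/\b(or|either|maybe)\b/.test(text)) ambiguities.push("HOW");
  if (/\b(auth|payment|database|delete|production)\b/.test(text)) ambiguities.push("RISK");
  if (words.length > 30 && conjunctions >= 2) ambiguities.push("SCOPE");
  return { domains, ambiguities };
}
```

For "add login page or signup modal" this flags the ui and auth domains plus WHERE and HOW, matching the report format shown above.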
 
  ## Step 2: Check Locked Decisions
 
  Query brain.db for existing decisions tagged with detected ambiguity types.
  Skip any ambiguity that was already resolved in a previous session.
 
- ## Step 3: Generate Questions
+ ## Step 3: Ask Domain-Specific Questions
 
- For each remaining ambiguity, ask a targeted question:
+ **If `--batch` flag is set**: Group all questions into AskUserQuestion calls (max 4 per call).
 
- **Multiple choice** (when possible — saves user effort):
- ```
- How should authentication work?
- a) JWT tokens (stateless, good for APIs)
- b) Session cookies (stateful, good for web apps)
- c) OAuth (delegate to Google/GitHub)
- d) Other (describe)
- ```
+ **If `--assume` flag is set**: Auto-resolve and present assumptions (see Assumptions Mode below).
 
- **Confirmation** (for RISK):
- ```
- This will modify the payment processing flow. Confirm:
- - Are you working in a development environment?
- - Should existing billing data be preserved?
- ```
+ For each remaining ambiguity, ask a **domain-specific** question:
 
- **Free text** (only when choices aren't possible):
- ```
- Where should the new component be placed?
- (Hint: mention a directory or existing component to place it near)
- ```
+ ### UI Domain
+ - HOW: "Layout density? [Compact | Comfortable | Spacious]"
+ - HOW: "Interaction pattern? [Inline editing | Modal dialogs | Page navigation | Drawer panels]"
+ - HOW: "Empty state behavior? [Placeholder | Onboarding CTA | Hide section]"
+ - WHERE: "Which page/route should this appear on?"
+ - RISK: "Does this affect existing UI users rely on?"
 
- ## Step 4: Lock Decisions
+ ### API Domain
+ - HOW: "Response format? [JSON REST | GraphQL | tRPC | JSON-RPC]"
+ - HOW: "Error handling? [HTTP status codes | Always 200 | RFC 7807]"
+ - HOW: "Auth mechanism? [Bearer token | API key | Session cookie | Public]"
+ - WHERE: "Which endpoint prefix? (e.g., /api/v1/users)"
+ - RISK: "Public-facing or internal API?"
 
- After each answer, store in brain.db as a locked decision:
- ```
- Question: "Auth approach?"
- Decision: "JWT tokens — stateless"
- Tags: "HOW"
- Phase: current phase/task
- ```
+ ### Database Domain
+ - HOW: "ORM? [Prisma | Drizzle | TypeORM | Knex | Raw SQL | Match existing]"
+ - HOW: "Migration strategy? [Auto-generate | Manual | Schema push]"
+ - WHERE: "Which table/model?"
+ - RISK: "Data migration needed? Existing production data?"
+
+ ### Auth Domain
+ - HOW: "Auth approach? [JWT | Session cookies | OAuth2 | API keys]"
+ - HOW: "Token storage? [httpOnly cookie | localStorage | Memory | Secure cookie + CSRF]"
+ - HOW: "Role model? [Simple roles | RBAC | ABAC | No roles]"
+ - RISK: "Affects existing user sessions?"
+
+ ### Content Domain
+ - HOW: "Format? [Markdown | Rich text | Structured JSON | Plain text]"
+ - HOW: "Tone? [Technical | Casual | Formal | Match existing]"
+ - HOW: "i18n? [English only | Multi-language | i18n-ready]"
+
+ ### Infra Domain
+ - HOW: "Deploy target? [Vercel | AWS | Docker | Self-hosted | Match existing]"
+ - HOW: "CI/CD? [GitHub Actions | GitLab CI | CircleCI | None | Match existing]"
+
+ Use **multiple choice** for HOW questions (saves user effort).
+ Use **free text** for WHERE questions.
+ Use **confirmation** for RISK questions.
+
+ ## Step 4: Follow-Up Depth
+
+ After each answer, score it:
+ - Multiple choice selection → sufficient (1.0)
+ - Short free text (<3 words) → needs follow-up (0.5)
+ - "I don't know" / "not sure" → needs follow-up (0.0)
+
+ **If score <= 0.5**: Ask ONE follow-up:
+ - WHERE: "You mentioned [answer]. Can you be more specific — which file or directory?"
+ - WHAT: "You said [answer]. What should the user see when this is done?"
+ - HOW: "You picked [answer]. Any specific library or pattern to follow?"
+
+ **Max 2 follow-up rounds per ambiguity**. After that, lock whatever we have.
+
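The Step 4 scoring rule can be sketched as follows; names are illustrative and the shipped implementation may differ:

```javascript
// Score an answer per the rubric: choice = 1.0, "don't know" = 0.0, short text = 0.5.
function scoreAnswer(answer, { multipleChoice = false } = {}) {
  if (multipleChoice) return 1.0;                 // selection from offered options
  const text = answer.trim().toLowerCase();
  if (/(i don't know|not sure)/.test(text)) return 0.0;
  if (text.split(/\s+/).length < 3) return 0.5;   // short free text
  return 1.0;
}

// Follow up while the score stays low, capped at 2 rounds per ambiguity.
function needsFollowUp(score, round) {
  return score <= 0.5 && round < 2;
}
```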
+ ## Step 5: Lock Decisions
+
+ Store each answer in brain.db with domain tag:
+
+ Use the `brain_decisions` MCP tool with: `{ "action": "add", "question": "[question]", "decision": "[answer]", "reasoning": "User-provided via discussion", "phase": "discuss", "tags": "[TYPE],[domain]" }`
 
  These decisions are:
  - Injected into all downstream agent contexts
  - Never asked again (even across sessions)
  - Visible via `/sf-brain decisions`
 
- ## Step 5: Report
+ ## Step 6: Report
 
  ```
- Resolved [N] ambiguities:
- WHERE: [answer summary]
- HOW: [answer summary]
- RISK: [confirmed]
+ Resolved [N] ambiguities (domains: [ui, auth]):
+ HOW (auth): JWT stateless tokens
+ HOW (ui): Compact layout, modal dialogs
+ WHERE: /app/auth/login page
+ RISK: Development only — confirmed
 
  Ready for planning. Run /sf-do to continue.
  ```
 
- ## Assumptions Mode (when `--assume` flag is set)
+ ## Step 7: Chain Mode (when `--chain` flag is set)
 
- Instead of asking questions, auto-resolve ambiguities using codebase patterns:
+ After all decisions locked:
+ 1. Auto-run `/sf-plan` with the task description + locked decisions
+ 2. After planning completes, auto-run `/sf-check-plan`
+ 3. If check passes, ask: "Plan ready. Execute now? [y/n]"
+ 4. If yes, auto-run `/sf-do`
 
- 1. For each detected ambiguity, query brain.db for matching patterns:
- - **WHERE**: Search nodes table for files matching task keywords
- - **HOW**: Reuse past HOW decisions or domain learnings
- - **WHAT**: Infer from task description
- - **RISK**: Auto-confirm if `.env.local` or `.env.development` exists
- - **SCOPE**: Default to "tackle all at once" for medium complexity
+ ## Assumptions Mode (when `--assume` flag is set)
+
+ Auto-resolve ambiguities using codebase patterns:
+ 1. **WHERE**: Search brain.db nodes for files matching task keywords
+ 2. **HOW**: Reuse past HOW decisions from same domain, or domain learnings
+ 3. **WHAT**: Infer from task description
+ 4. **RISK**: Auto-confirm if `.env.local` or `.env.development` exists
+ 5. **SCOPE**: Default to "tackle all at once"
 
- 2. Each auto-resolution has a confidence score (0-1):
- Confidence >= 0.5: Accept and lock as decision
- Confidence < 0.5: Fall back to asking the user
+ Each resolution has a confidence score (0-1):
+ - >= 0.5: Accept and lock
+ - < 0.5: Fall back to asking
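The confidence gate described above amounts to a simple partition; the resolution object shape is assumed for illustration, not taken from the package:

```javascript
// Accept high-confidence auto-resolutions, fall back to asking for the rest.
function gateAssumptions(resolutions, threshold = 0.5) {
  const locked = [];
  const toAsk = [];
  for (const r of resolutions) {
    (r.confidence >= threshold ? locked : toAsk).push(r);
  }
  return { locked, toAsk };
}
```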
 
- 3. Present assumptions to user before proceeding:
+ Present assumptions:
  ```
  Assuming (based on codebase patterns):
- WHERE: src/auth/login.ts, src/auth/session.ts (confidence: 0.8)
- HOW: Follow existing pattern: jwt-auth (confidence: 0.7)
- RISK: Confirmed development environment detected (confidence: 0.7)
+ HOW (auth): JWT — reusing previous decision (confidence: 0.8)
+ WHERE: src/auth/login.ts matched keyword "login" (confidence: 0.7)
+ RISK: Development env detected (confidence: 0.7)
 
- Say 'no' to override any of these, or press Enter to continue.
+ Say 'no' to override, or Enter to continue.
  ```
 
- 4. Lock accepted assumptions as decisions in brain.db.
-
  </process>
 
  <context>
package/commands/sf/do.md CHANGED
@@ -33,6 +33,8 @@ Extract flags from `$ARGUMENTS` before processing. Flags start with `--` and are
  - `--no-plan` — Skip discuss (Step 3) and plan (Step 4), go straight to execute
  - `--cheap` — Force ALL agents to use haiku (fastest, cheapest, ~80% cost reduction)
  - `--quality` — Force builder to sonnet, architect to opus for complex tasks
+ - `--batch` — Batch all discussion questions into 1-2 AskUserQuestion calls
+ - `--chain` — After each step, auto-run the next (discuss → plan → check → execute)
 
  **Parse procedure:**
  1. Extract all `--flag` tokens from the input
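The parse procedure can be sketched as a small helper; this is illustrative, not the shipped code:

```javascript
// Split `$ARGUMENTS` into `--flag` tokens and the remaining task description.
function parseArgs(input) {
  const tokens = input.trim().split(/\s+/).filter(Boolean);
  const flags = tokens.filter((t) => t.startsWith("--"));
  const task = tokens.filter((t) => !t.startsWith("--")).join(" ");
  return { flags, task };
}
```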
@@ -116,7 +118,7 @@ Pipeline: scout → architect → builder → critic (acceleration: partial, 35%
 
  ## STEP 2: CONTEXT GATHERING (0 tokens)
 
- **FIX #5: Git diff awareness** — Run `git diff --name-only HEAD` to see what files changed since last commit. Pass this list to Scout so it focuses on recent changes instead of searching blindly.
+ **Git diff awareness** — Run `git diff --name-only HEAD` to see what files changed since last commit. Pass this list to Scout so it focuses on recent changes instead of searching blindly.
 
  If `.shipfast/brain.db` does not exist, tell user to run `shipfast init` first.
 
@@ -216,40 +218,53 @@ Launch ONE Builder agent with ALL tasks batched and `model: models.builder` from
  ### Complex workflow (per-task agents, fresh context each):
 
  **Check brain.db first** — if `/sf-plan` was run, tasks already exist:
- ```bash
- sqlite3 -json .shipfast/brain.db "SELECT id, description, plan_text FROM tasks WHERE status = 'pending' ORDER BY created_at;" 2>/dev/null
- ```
+
+ Use the `brain_tasks` MCP tool with: `{ "action": "list", "status": "pending" }`
 
  If tasks are found in brain.db, execute them. If not, run inline planning first.
 
  **Per-task execution (fresh context per task):**
+
+ **REQUIRED — output progress for EVERY task (do NOT batch or skip):**
+
+ Before each task:
+ ```
+ [N/M] Building: [task description]...
+ ```
+ After each task:
+ ```
+ [N/M] ✓ [task description] (commit: [sha])
+ ```
+ Or on failure:
+ ```
+ [N/M] ✗ [task description] (error: [first 80 chars])
+ ```
+ If you did not output these lines, this is a process failure.
+
  For each pending task in brain.db:
  1. Launch a SEPARATE sf-builder agent with ONLY that task + brain context + `model: models.builder` from Step 1.5. If `--tdd` flag is set, prepend `MODE: TDD (red→green→refactor). Write failing test FIRST.` to the task context.
  2. Builder gets fresh context — no accumulated garbage from previous tasks
  3. Builder executes: read → grep consumers → implement → build → verify → commit
  4. After Builder completes, update task status and record model outcome:
- ```bash
- sqlite3 .shipfast/brain.db "UPDATE tasks SET status='passed', commit_sha='[sha]' WHERE id='[id]';"
- sqlite3 .shipfast/brain.db "INSERT INTO model_performance (agent, model, domain, task_id, outcome) VALUES ('builder', '[model used]', '[domain]', '[id]', 'success');"
- ```
+
+ Use the `brain_tasks` MCP tool with: `{ "action": "update", "id": "[id]", "status": "passed", "commit_sha": "[sha]" }`
+
+ Use the `brain_model_outcome` MCP tool with: `{ "agent": "builder", "model": "[model used]", "domain": "[domain]", "task_id": "[id]", "outcome": "success" }`
+
  5. If Builder fails after 3 attempts:
- ```bash
- sqlite3 .shipfast/brain.db "UPDATE tasks SET status='failed', error='[error]' WHERE id='[id]';"
- sqlite3 .shipfast/brain.db "INSERT INTO model_performance (agent, model, domain, task_id, outcome) VALUES ('builder', '[model used]', '[domain]', '[id]', 'failure');"
- ```
+
+ Use the `brain_tasks` MCP tool with: `{ "action": "update", "id": "[id]", "status": "failed", "error": "[error]" }`
+
+ Use the `brain_model_outcome` MCP tool with: `{ "agent": "builder", "model": "[model used]", "domain": "[domain]", "task_id": "[id]", "outcome": "failure" }`
+
  6. Continue to next task regardless
 
- **Wave grouping:**
- - Independent tasks (no `depends`) → same wave → launch Builder agents in parallel
+ **Wave grouping + parallel execution:**
+ - Independent tasks (no `depends`) → same wave
  - Dependent tasks → later wave → wait for dependencies to complete
  - Tasks touching same files → sequential (never parallel)
 
- **After all tasks:**
- - Launch Critic agent (fresh context) with `model: models.critic` to review ALL changes: `git diff HEAD~N`
- - Launch Scribe agent (fresh context) with `model: models.scribe` to record decisions + learnings to brain.db
- - Save session state for `/sf-resume`
-
- **After execution, run `/sf-verify` for thorough verification.**
+ **Parallel execution within waves:**
+ If a wave has 2+ tasks, launch ALL Builder agents in that wave simultaneously using multiple Agent tool calls in a single response. Wait for all to complete before starting the next wave. This is safe because wave tasks are independent by definition.
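The wave rules above can be sketched as follows; the task shape `{ id, depends, files }` is assumed for illustration, and same-file tasks are deferred to a later wave rather than run in parallel:

```javascript
// Group tasks into waves: dependencies satisfied first, no shared files per wave.
function groupIntoWaves(tasks) {
  const done = new Set();
  const waves = [];
  let remaining = [...tasks];
  while (remaining.length) {
    const ready = remaining.filter((t) => (t.depends || []).every((d) => done.has(d)));
    const wave = [];
    const files = new Set();
    for (const t of ready) {
      if ((t.files || []).some((f) => files.has(f))) continue; // same-file → next wave
      wave.push(t);
      (t.files || []).forEach((f) => files.add(f));
    }
    if (!wave.length) throw new Error("dependency cycle");
    wave.forEach((t) => done.add(t.id));
    remaining = remaining.filter((t) => !done.has(t.id));
    waves.push(wave);
  }
  return waves;
}
```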
 
  ### Builder behavior:
  - Follows deviation tiers: auto-fix bugs (T1-3), STOP for architecture changes (T4)
@@ -258,71 +273,106 @@ For each pending task in brain.db:
  - Stub detection before commit: scan for TODO/FIXME/placeholder
  - Commit hygiene: stage specific files, never `git add .`
 
- ### If Critic finds CRITICAL issues:
- Send the issue back to Builder for fix (1 additional agent call, not a full re-run).
-
  ---
 
- ## STEP 7: VERIFY (0-3K tokens)
+ ## STEP 7: MANDATORY POST-EXECUTION VERIFICATION
 
- **Skip if**: trivial tasks with passing build, UNLESS `--verify` flag is set
- **Force if**: `--verify` flag is set, regardless of complexity
+ ⚠️ **STOP-GATE: Do NOT output the final report or say "Done" until ALL checks below are complete. If you skip verification, the task is FAILED regardless of whether the code works. This is not optional.**
 
- Run goal-backward verification:
- 1. Extract done-criteria from the original request + plan
- 2. Check each criterion:
- - File exists? → filesystem check
- - Symbol exists? → grep check
- - Build passes? → run build command
- - No stubs? → scan changed files for TODO/FIXME/placeholder
- - Behavior works? → mark as "manual verification needed"
- 3. Score: N/M criteria met
- - 100% → PASS
- - 80%+ → PASS_WITH_WARNINGS (list gaps)
- - Below 80% → FAIL (list what's missing)
+ You MUST complete **ALL** of the following in order. Check each off as you go.
 
- Store verification results in brain.db.
+ ### 7A. Launch Critic agent (REQUIRED for medium/complex)
 
- ### Auto-Fix on Failure
- If verification returns FAIL:
- 1. Generate targeted fix tasks from each failure (~200 tokens each, not a fresh agent)
- 2. Send fix tasks to Builder for one retry attempt
- 3. Re-verify after fixes
- 4. If still failing, report as DEFERRED — do not loop further
-
- ### TDD Verification (when --tdd flag is set)
- After all tasks complete, verify git log contains the correct commit sequence:
- 1. `test(...)` commit (RED phase) — must exist
- 2. `feat(...)` commit after it (GREEN phase) — must exist
- 3. Optional `refactor(...)` commit
- If sequence is violated, flag as TDD VIOLATION in the report.
-
- ### Requirement Verification (when project has REQ-IDs)
- If brain.db has requirements for this phase:
- 1. Check each v1 requirement mapped to this phase
- 2. Mark as done/pending based on verification results
- 3. Report coverage: "Requirements: N/M covered"
+ Launch sf-critic agent with `model: models.critic` and the full diff:
+ ```bash
+ git diff HEAD~[N commits]
+ ```
+ Wait for Critic to return its verdict. If Critic finds CRITICAL issues → send to Builder for fix (1 additional agent call, not a full re-run).
 
- ---
+ Report: `Critic: [PASS/PASS_WITH_WARNINGS/FAIL] — [N] findings`
 
- ## STEP 8: LEARN
+ ### 7B. Build verification (REQUIRED)
 
- **FIX #9/#10: Explicitly record decisions and learnings using these exact commands:**
+ Run the project's build/typecheck command:
+ ```bash
+ npm run build # or tsc --noEmit / cargo check
+ ```
+ Report: `Build: [PASS/FAIL]`
 
- If you made any architectural decisions during this task, record each one:
+ ### 7C. Consumer integrity check (REQUIRED)
+
+ For every function/type/export that was modified or removed across all tasks:
  ```bash
- sqlite3 .shipfast/brain.db "INSERT INTO decisions (question, decision, reasoning, phase) VALUES ('[what was decided]', '[the choice]', '[why]', '[current task]');"
+ grep -r "removed_symbol" --include="*.ts" --include="*.tsx" --include="*.js" .
  ```
+ Any remaining consumers = CRITICAL failure. Report: `Consumers: [CLEAN/N broken]`
 
- If you encountered and fixed any errors, record the pattern:
+ ### 7D. Stub scan (REQUIRED)
+
+ Scan all changed files for incomplete work:
  ```bash
- sqlite3 .shipfast/brain.db "INSERT INTO learnings (pattern, problem, solution, domain, source, confidence) VALUES ('[short pattern name]', '[what went wrong]', '[what fixed it]', '[domain]', 'auto', 0.5);"
+ git diff HEAD~[N] --name-only
  ```
+ Then grep each for: TODO, FIXME, HACK, placeholder, console.log, debugger
+
+ Report: `Stubs: [CLEAN/N found]`
+
+ ### 7E. Branch audit (REQUIRED when on non-default branch)
 
- If any improvement ideas, future features, or tech debt were surfaced during this task (including OUT_OF_SCOPE items), record them as seeds:
  ```bash
- sqlite3 .shipfast/brain.db "INSERT INTO seeds (idea, source_task, domain, priority) VALUES ('[idea]', '[current task]', '[domain]', 'someday');"
+ CURRENT=$(git branch --show-current)
  ```
+ Use the `brain_config` MCP tool with: `{ "action": "get", "key": "default_branch" }` — fall back to `"main"`.
+
+ If `$CURRENT` ≠ `$DEFAULT`:
+ - `git diff $DEFAULT...$CURRENT --diff-filter=D --name-only` → deleted files
+ - For removed exports, check consumers via brain.db
+ - Report: `Branch audit: [N] migrated | [N] missing | [N] safe`
+
+ ### 7F. TDD check (when --tdd flag is set)
+
+ Verify `test(...)` commits come before `feat(...)` commits. Report: `TDD: [VALID/VIOLATION]`
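The commit-sequence rule can be sketched over an oldest-first list of commit subjects. This is a simplification that checks first occurrences only, not per-feature pairing, and it is illustrative rather than the shipped check:

```javascript
// Return "VALID" when the first test(...) commit precedes the first feat(...) commit.
function checkTddOrder(subjects) {
  const firstTest = subjects.findIndex((s) => s.startsWith("test("));
  const firstFeat = subjects.findIndex((s) => s.startsWith("feat("));
  if (firstFeat === -1) return "VALID";                       // nothing shipped yet
  if (firstTest === -1 || firstTest > firstFeat) return "VIOLATION";
  return "VALID";
}
```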
+
+ ### 7G. Launch Scribe agent (REQUIRED for complex)
+
+ Launch sf-scribe agent with `model: models.scribe` to record decisions + learnings to brain.db.
+
+ ### 7H. Score results
+
+ Combine all checks:
+ - All pass → **PASS**
+ - Minor issues → **PASS_WITH_WARNINGS** (list them)
+ - Critical issues → **FAIL** (list them, attempt auto-fix)
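The 7H combination rule reduces to a two-level severity check; check objects with a `severity` field are assumed for illustration:

```javascript
// Any critical finding fails the run; any minor finding downgrades it to warnings.
function scoreChecks(checks) {
  if (checks.some((c) => c.severity === "critical")) return "FAIL";
  if (checks.some((c) => c.severity === "minor")) return "PASS_WITH_WARNINGS";
  return "PASS";
}
```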
+
+ ### Auto-Fix on Failure
+ If FAIL:
+ 1. Generate targeted fix tasks (~200 tokens each)
+ 2. Send to Builder for one retry
+ 3. Re-verify
+ 4. If still failing → DEFERRED
+
+ Store verification results:
+ Use the `brain_context` MCP tool with: `{ "action": "set", "scope": "session", "key": "verification", "value": "[JSON results]" }`
+
+ Only AFTER 7A-7H are complete, proceed to STEP 8.
+
+ ---
+
+ ## STEP 8: LEARN
+
+ **Explicitly record decisions and learnings using these exact commands:**
+
+ If you made any architectural decisions during this task, record each one:
+
+ Use the `brain_decisions` MCP tool with: `{ "action": "add", "question": "[what was decided]", "decision": "[the choice]", "reasoning": "[why]", "phase": "[current task]" }`
+
+ If you encountered and fixed any errors, record the pattern:
+
+ Use the `brain_learnings` MCP tool with: `{ "action": "add", "pattern": "[short pattern name]", "problem": "[what went wrong]", "solution": "[what fixed it]", "domain": "[domain]", "source": "auto", "confidence": 0.5 }`
+
+ If any improvement ideas, future features, or tech debt were surfaced during this task (including OUT_OF_SCOPE items), record them as seeds:
+
+ Use the `brain_seeds` MCP tool with: `{ "action": "add", "idea": "[idea]", "source_task": "[current task]", "domain": "[domain]", "priority": "someday" }`
 
  **These are not optional.** If decisions were made, errors were fixed, or ideas were surfaced, you MUST record them. This is how ShipFast gets smarter over time.
 
@@ -330,7 +380,18 @@ sqlite3 .shipfast/brain.db "INSERT INTO seeds (idea, source_task, domain, priori
 
  ## STEP 9: REPORT
 
- **Trivial tasks** (progressive disclosure minimal output):
+ **Before reporting, confirm all post-execution steps completed (complex tasks):**
+ - [ ] Progress lines shown [N/M] for every task
+ - [ ] Critic reviewed — verdict: ___
+ - [ ] Build: ___
+ - [ ] Consumer integrity: ___
+ - [ ] Stubs: ___
+ - [ ] Branch audit (if non-default): ___
+ - [ ] Scribe recorded decisions/learnings
+
+ **If any checkbox is unchecked, go back and complete it now. Do NOT report with incomplete verification.**
+
+ **Trivial tasks**:
  ```
  Done: [one sentence summary]
  ```
@@ -338,15 +399,22 @@ Done: [one sentence summary]
  **Medium tasks**:
  ```
  Done: [summary]
- Commits: [N] | Verification: [PASS/WARN/FAIL]
+ Commits: [N] | Build: [PASS/FAIL] | Critic: [verdict] | Consumers: [clean/N broken]
  ```
 
  **Complex tasks** (full dashboard):
  ```
  Done: [summary]
- Commits: [N] | Tasks: [completed]/[total] | Verification: [PASS/WARN/FAIL]
- Tokens: ~[estimate] | Time: [duration]
- Deferred: [list of issues that need manual attention, if any]
+ Commits: [N] | Tasks: [completed]/[total]
+
+ Verification:
+ Critic: [PASS/WARNINGS/FAIL] — [N findings]
+ Build: [PASS/FAIL]
+ Consumers: [CLEAN/N broken]
+ Stubs: [CLEAN/N found]
+ Branch: [N migrated, N missing, N safe] (or N/A if default branch)
+
+ Deferred: [issues needing manual attention, if any]
  ```
 
  **If session state was saved** (context getting low):
package/commands/sf/help.md CHANGED
@@ -37,18 +37,23 @@ SHIPPING
  SESSION
  /sf-status Brain stats, tasks, checkpoints, version.
  /sf-resume Resume from previous session.
- /sf-undo [task-id] Rollback a completed task.
+ /sf-undo [task-id] Rollback a specific task by ID.
+ /sf-rollback [last|all|N] Rollback last task, last N, or entire session.
 
  KNOWLEDGE
  /sf-brain <query> Query knowledge graph: files, decisions, learnings, hot files.
  /sf-learn <pattern> Teach a reusable pattern.
  /sf-map Generate codebase report from brain.db.
+ /sf-cost Token usage breakdown by agent, domain, model.
+ /sf-diff Smart diff viewer — changes grouped by task.
 
  PARALLEL WORK
- /sf-workstream list Show all workstreams.
- /sf-workstream create Create namespaced workstream with branch.
- /sf-workstream switch Switch active workstream.
- /sf-workstream complete Complete and merge workstream.
+ /sf-worktree list Show all worktrees.
+ /sf-worktree create Create a worktree; suggests branch name, supports multi-repo.
+ /sf-worktree switch Show path to worktree (cd into it).
+ /sf-worktree status Show uncommitted changes, commits, tasks for a worktree.
+ /sf-worktree check Migration audit: migrated, missing, safe, modified, added.
+ /sf-worktree complete Run audit, merge into default branch, remove worktree.
 
  CONFIG
  /sf-config View or set model tiers and preferences.
package/commands/sf/map.md CHANGED
@@ -15,44 +15,36 @@ Unlike GSD's 7 markdown mapper agents, this queries the existing SQLite brain di
  Run these queries and format the output. Do NOT modify the queries.
 
  ## File structure
- ```bash
- sqlite3 .shipfast/brain.db "SELECT file_path FROM nodes WHERE kind = 'file' ORDER BY file_path;" 2>/dev/null | head -50
- ```
+
+ Use the `brain_search` MCP tool with: `{ "query": "kind:file", "limit": 50 }` — lists all file nodes ordered by path.
 
  ## Symbol counts by kind
- ```bash
- sqlite3 .shipfast/brain.db "SELECT kind, COUNT(*) as count FROM nodes GROUP BY kind ORDER BY count DESC;" 2>/dev/null
- ```
+
+ Use the `brain_search` MCP tool with: `{ "query": "group_by:kind" }` — returns node counts grouped by kind.
 
  ## Top functions (most connected)
- ```bash
- sqlite3 .shipfast/brain.db "SELECT n.name, n.file_path, n.signature, COUNT(e.target) as connections FROM nodes n LEFT JOIN edges e ON n.id = e.source WHERE n.kind = 'function' GROUP BY n.id ORDER BY connections DESC LIMIT 15;" 2>/dev/null
- ```
+
+ Use the `brain_search` MCP tool with: `{ "query": "kind:function order_by:connections", "limit": 15 }` — returns functions with their connection counts.
 
  ## Hot files (most changed)
- ```bash
- sqlite3 .shipfast/brain.db "SELECT file_path, change_count FROM hot_files ORDER BY change_count DESC LIMIT 15;" 2>/dev/null
- ```
+
+ Use the `brain_hot_files` MCP tool with: `{ "limit": 15 }` — returns files ordered by change_count descending.
 
  ## Import graph (top connections)
- ```bash
- sqlite3 .shipfast/brain.db "SELECT REPLACE(source,'file:','') as from_file, REPLACE(target,'file:','') as to_file, kind FROM edges WHERE kind = 'imports' LIMIT 20;" 2>/dev/null
- ```
+
+ Use the `brain_graph_cochanges` MCP tool with: `{ "kind": "imports", "limit": 20 }` — returns top import edges between files.
 
  ## Co-change clusters
- ```bash
- sqlite3 .shipfast/brain.db "SELECT REPLACE(source,'file:','') as file_a, REPLACE(target,'file:','') as file_b, weight FROM edges WHERE kind = 'co_changes' AND weight > 0.3 ORDER BY weight DESC LIMIT 15;" 2>/dev/null
- ```
+
+ Use the `brain_graph_cochanges` MCP tool with: `{ "min_weight": 0.3, "limit": 15 }` — returns co-change pairs with weight > 0.3 ordered by weight descending.
 
  ## Decisions made
- ```bash
- sqlite3 .shipfast/brain.db "SELECT question, decision, phase FROM decisions ORDER BY created_at DESC LIMIT 10;" 2>/dev/null
- ```
+
+ Use the `brain_decisions` MCP tool with: `{ "action": "list", "limit": 10 }` — returns decisions ordered by created_at descending.
 
  ## Learnings
- ```bash
- sqlite3 .shipfast/brain.db "SELECT pattern, problem, solution, confidence FROM learnings WHERE confidence > 0.3 ORDER BY confidence DESC LIMIT 10;" 2>/dev/null
- ```
+
+ Use the `brain_learnings` MCP tool with: `{ "action": "list", "min_confidence": 0.3, "limit": 10 }` — returns learnings with confidence > 0.3 ordered by confidence descending.
 
  Format as: