@shipfast-ai/shipfast 1.0.2 → 1.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +15 -15
- package/agents/builder.md +26 -7
- package/brain/index.cjs +42 -0
- package/brain/schema.sql +37 -0
- package/commands/sf/brain.md +8 -0
- package/commands/sf/discuss.md +26 -1
- package/commands/sf/do.md +89 -15
- package/commands/sf/milestone.md +1 -1
- package/commands/sf/verify.md +23 -0
- package/core/ambiguity.cjs +106 -0
- package/core/guardrails.cjs +22 -4
- package/core/model-selector.cjs +59 -2
- package/core/skip-logic.cjs +49 -7
- package/core/verify.cjs +126 -3
- package/mcp/server.cjs +42 -1
- package/package.json +1 -1
package/README.md
CHANGED

@@ -20,20 +20,20 @@ Works on Mac, Windows, and Linux.

 ## Why ShipFast?

-
+Context rot kills AI coding quality. As the context window fills up, output degrades.

-ShipFast
+ShipFast fixes this with a **SQLite knowledge graph** that gives each agent fresh context and gets smarter every session.

-
-
-
-
-
-
-
-
-
-
+- **17 commands, 5 composable agents** — simple to learn, covers the full workflow
+- **SQLite brain** — queryable knowledge graph, no per-task state files
+- **3K-40K tokens per feature** — 70-90% less than typical AI dev workflows
+- **Fresh context per task** — no accumulated garbage between tasks
+- **Cross-session learning** — records decisions and patterns, gets cheaper over time
+- **Codebase indexing in <1 second** — 973 files indexed in 636ms
+- **Graph-derived architecture** — auto-detects layers from import graph
+- **Cross-repo linking** — search across multiple repos with `shipfast link`
+- **17 MCP tools** — structured brain access, no SQL improvisation
+- **Works with 14 AI coding tools** — auto-detects and installs for all

 ---

@@ -77,7 +77,7 @@ shipfast help # Show all commands

 ## How It Works

-Already have code? `shipfast init` indexes your codebase in under 1 second — functions, types, imports, git history.
+Already have code? `shipfast init` indexes your codebase in under 1 second — functions, types, imports, git history. All stored in a SQLite database.

 ### 1. Plan Phase

@@ -168,7 +168,7 @@ Creates branch, generates PR description from brain.db (decisions, tasks, change
 /sf-milestone new v2.0
 ```

-Or for simple tasks,
+Or for simple tasks, just run directly:

 ```
 /sf-do fix the login bug
@@ -244,7 +244,7 @@ All exposed as MCP tools: `brain_arch_layers`, `brain_arch_folders`, `brain_arch

 ## Agents

-5 composable agents
+5 composable agents with compressed behavioral rules.

 | Agent | Role | Model | Key Rules |
 |---|---|---|---|
package/agents/builder.md
CHANGED

@@ -59,6 +59,11 @@ Track every deviation: `[Tier N] Fixed: [what] in [file]`
 Only fix issues DIRECTLY caused by your current task.
 Pre-existing problems in other files → do NOT fix. Output:
 `OUT_OF_SCOPE: [file:line] [issue]`
+
+For each out-of-scope issue, also record it as a seed for future work:
+```bash
+sqlite3 .shipfast/brain.db "INSERT INTO seeds (idea, source_task, domain, priority) VALUES ('[improvement idea]', '[current task id]', '[domain]', 'someday');"
+```
 </deviation_tiers>

 <patterns>
@@ -149,17 +154,31 @@ Check if your changes introduced:
 - New external service calls
 - Schema changes at trust boundaries

+- Schema/model changes without corresponding migrations
+
 If found: `THREAT_FLAG: [type] in [file] — [description]`
+If schema drift: `DRIFT_WARNING: [model file] changed without migration. Run: [migrate command]`
 </threat_scan>

 <tdd_mode>
-## TDD (when --tdd flag
-
-
-
-
-
-
+## TDD (when --tdd flag or MODE: TDD is in context)
+
+**THIS OVERRIDES THE NORMAL EXECUTION ORDER.** When TDD mode is active, follow this sequence strictly:
+
+**Step 1: READ** — Understand what to test. Read relevant files and existing test patterns.
+**Step 2: WRITE TEST** — Write a failing test. Test ONLY, no implementation code.
+**Step 3: RUN TEST** — Run the test. It MUST fail. If it passes, STOP — the test is wrong. Investigate.
+**Step 4: COMMIT RED** — `git add <test files only>` → `test(scope): red - [description]`
+**Step 5: IMPLEMENT** — Write the minimal code to make the test pass. Implementation files only.
+**Step 6: RUN TEST** — Run the test. It MUST pass.
+**Step 7: COMMIT GREEN** — `git add <implementation files only>` → `feat(scope): green - [description]`
+**Step 8: REFACTOR** (optional) — Clean up. Commit as `refactor(scope): [description]`
+
+**NON-NEGOTIABLE RULES:**
+- You MUST NOT write implementation code before committing a failing test
+- Test commits MUST contain only test/spec files
+- Feat commits MUST contain only implementation files (no test files)
+- If you cannot write a meaningful failing test, report: `TDD_BLOCKED: [reason]`
 </tdd_mode>

 <context>
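The red-before-green commit discipline described in the builder's TDD mode can be machine-checked from commit messages alone. A minimal sketch of such a check (a hypothetical helper, not part of the ShipFast package, assuming the `test(scope): red` / `feat(scope): green` message convention above):

```javascript
// Sketch: verify a commit list follows the TDD order the builder prompt
// requires: a red test commit must exist and precede the green feat commit.
// Hypothetical helper; message patterns assume the convention shown above.
function checkTddSequence(commitMessages) {
  const redIdx = commitMessages.findIndex(m => /^test\(.+\): red/.test(m));
  const greenIdx = commitMessages.findIndex(m => /^feat\(.+\): green/.test(m));
  if (redIdx === -1) return 'TDD_BLOCKED: no red commit found';
  if (greenIdx === -1) return 'TDD_BLOCKED: no green commit found';
  return redIdx < greenIdx ? 'ok' : 'TDD_BLOCKED: green commit precedes red';
}

console.log(checkTddSequence([
  'test(auth): red - login rejects bad token',
  'feat(auth): green - validate token signature',
])); // → ok
```

A verifier could feed this the output of `git log --format=%s` for the branch under review.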
package/brain/index.cjs
CHANGED

@@ -322,6 +322,41 @@ function buildAgentContext(cwd, { agent, taskDescription, affectedFiles, phase,
   return parts.join('\n\n');
 }

+// ============================================================
+// Model Performance (feedback loop)
+// ============================================================
+
+function recordModelOutcome(cwd, { agent, model, domain, taskId, outcome }) {
+  run(cwd, `INSERT INTO model_performance (agent, model, domain, task_id, outcome)
+    VALUES ('${esc(agent)}', '${esc(model)}', '${esc(domain || '')}', '${esc(taskId || '')}', '${esc(outcome)}')`);
+}
+
+// ============================================================
+// Seeds (forward ideas captured during work)
+// ============================================================
+
+function addSeed(cwd, { idea, sourceTask, domain, priority }) {
+  run(cwd, `INSERT INTO seeds (idea, source_task, domain, priority)
+    VALUES ('${esc(idea)}', '${esc(sourceTask || '')}', '${esc(domain || '')}', '${esc(priority || 'someday')}')`);
+}
+
+function getSeeds(cwd, opts = {}) {
+  const conditions = [];
+  if (opts.status) conditions.push(`status = '${esc(opts.status)}'`);
+  if (opts.domain) conditions.push(`domain = '${esc(opts.domain)}'`);
+  if (opts.priority) conditions.push(`priority = '${esc(opts.priority)}'`);
+  const where = conditions.length ? 'WHERE ' + conditions.join(' AND ') : '';
+  return query(cwd, `SELECT * FROM seeds ${where} ORDER BY created_at DESC LIMIT 30`);
+}
+
+function promoteSeed(cwd, seedId, taskId) {
+  run(cwd, `UPDATE seeds SET status = 'promoted', promoted_to = '${esc(taskId)}' WHERE id = ${parseInt(seedId)}`);
+}
+
+function dismissSeed(cwd, seedId) {
+  run(cwd, `UPDATE seeds SET status = 'dismissed' WHERE id = ${parseInt(seedId)}`);
+}
+
 // ============================================================
 // Utils
 // ============================================================
@@ -366,6 +401,13 @@ module.exports = {
   setConfig,
   buildAgentContext,
   esc,
+  // Model Performance
+  recordModelOutcome,
+  // Seeds
+  addSeed,
+  getSeeds,
+  promoteSeed,
+  dismissSeed,
   // Requirements
   addRequirement,
   getRequirements,
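The filter assembly inside the new `getSeeds()` can be seen in isolation: optional `status`/`domain`/`priority` filters are ANDed into a WHERE clause. A standalone sketch of the same pattern, with a simplified stand-in for the package's `esc()` (assumed here to double single quotes, the usual SQLite escaping):

```javascript
// Standalone sketch of the getSeeds() query-building pattern shown in the
// diff above. esc() is a simplified stand-in, not the package's real helper.
const esc = (s) => String(s).replace(/'/g, "''");

function buildSeedsQuery(opts = {}) {
  const conditions = [];
  if (opts.status) conditions.push(`status = '${esc(opts.status)}'`);
  if (opts.domain) conditions.push(`domain = '${esc(opts.domain)}'`);
  if (opts.priority) conditions.push(`priority = '${esc(opts.priority)}'`);
  // No filters → no WHERE clause at all.
  const where = conditions.length ? 'WHERE ' + conditions.join(' AND ') : '';
  return `SELECT * FROM seeds ${where} ORDER BY created_at DESC LIMIT 30`;
}

console.log(buildSeedsQuery({ status: 'open', domain: 'auth' }));
// → SELECT * FROM seeds WHERE status = 'open' AND domain = 'auth' ORDER BY created_at DESC LIMIT 30
```

Note the design tradeoff: string interpolation with an escaper works for the CLI-driven `sqlite3` usage this package favors, where bound parameters are not available the way they would be with a driver API.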
package/brain/schema.sql
CHANGED

@@ -222,6 +222,41 @@ CREATE INDEX IF NOT EXISTS idx_req_category ON requirements(category);
 CREATE INDEX IF NOT EXISTS idx_req_phase ON requirements(phase);
 CREATE INDEX IF NOT EXISTS idx_req_status ON requirements(status);

+-- ============================================================
+-- MODEL PERFORMANCE (feedback loop for smart model selection)
+-- ============================================================
+
+CREATE TABLE IF NOT EXISTS model_performance (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  agent TEXT NOT NULL,      -- scout | architect | builder | critic | scribe
+  model TEXT NOT NULL,      -- haiku | sonnet | opus
+  domain TEXT,              -- auth, database, ui, etc.
+  task_id TEXT,             -- which task this was for
+  outcome TEXT NOT NULL,    -- success | failure | retry
+  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now'))
+);
+
+CREATE INDEX IF NOT EXISTS idx_model_perf_agent ON model_performance(agent);
+CREATE INDEX IF NOT EXISTS idx_model_perf_domain ON model_performance(domain);
+
+-- ============================================================
+-- SEEDS (forward ideas captured during work)
+-- ============================================================
+
+CREATE TABLE IF NOT EXISTS seeds (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  idea TEXT NOT NULL,               -- the improvement, feature, or tech debt idea
+  source_task TEXT,                 -- which task surfaced this idea
+  domain TEXT,                      -- relevant domain (auth, ui, database, etc.)
+  priority TEXT DEFAULT 'someday',  -- someday | next | urgent
+  status TEXT DEFAULT 'open',       -- open | promoted | dismissed
+  promoted_to TEXT,                 -- task_id if promoted to a real task
+  created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now'))
+);
+
+CREATE INDEX IF NOT EXISTS idx_seeds_status ON seeds(status);
+CREATE INDEX IF NOT EXISTS idx_seeds_domain ON seeds(domain);
+
 -- ============================================================
 -- MIGRATIONS TRACKING
 -- ============================================================
@@ -233,3 +268,5 @@ CREATE TABLE IF NOT EXISTS _migrations (
 );

 INSERT OR IGNORE INTO _migrations (version, name) VALUES (1, 'initial_schema');
+INSERT OR IGNORE INTO _migrations (version, name) VALUES (2, 'add_seeds_table');
+INSERT OR IGNORE INTO _migrations (version, name) VALUES (3, 'add_model_performance_table');
package/commands/sf/brain.md
CHANGED

@@ -49,6 +49,14 @@ WHERE confidence > 0.3 ORDER BY confidence DESC LIMIT 10
 SELECT file_path, change_count FROM hot_files ORDER BY change_count DESC LIMIT 15
 ```

+### "seeds" or "ideas" or "future work"
+```sql
+SELECT id, idea, source_task, domain, priority, status FROM seeds
+WHERE status = 'open'
+ORDER BY CASE priority WHEN 'urgent' THEN 0 WHEN 'next' THEN 1 ELSE 2 END, created_at DESC
+LIMIT 20
+```
+
 ### "stats"
 Show counts: nodes, edges, decisions, learnings, tasks, checkpoints

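The `CASE`-based ordering in the seeds query ranks `urgent` before `next` before everything else (`someday`). The same ranking expressed in JavaScript, purely for illustration (the sample seed ideas are made up, not from the package):

```javascript
// Mirror of the SQL CASE ranking used to sort seeds in the query above:
// urgent → 0, next → 1, anything else (someday) → 2. Illustrative only.
const priorityRank = (p) => (p === 'urgent' ? 0 : p === 'next' ? 1 : 2);

const seeds = [
  { idea: 'cache invalidation audit', priority: 'someday' },
  { idea: 'rotate JWT secrets', priority: 'urgent' },
  { idea: 'dedupe learnings table', priority: 'next' },
];
seeds.sort((a, b) => priorityRank(a.priority) - priorityRank(b.priority));
console.log(seeds.map(s => s.priority));
// → [ 'urgent', 'next', 'someday' ]
```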
package/commands/sf/discuss.md
CHANGED

@@ -89,7 +89,32 @@ Resolved [N] ambiguities:
 Ready for planning. Run /sf-do to continue.
 ```

-
+## Assumptions Mode (when `--assume` flag is set)
+
+Instead of asking questions, auto-resolve ambiguities using codebase patterns:
+
+1. For each detected ambiguity, query brain.db for matching patterns:
+   - **WHERE**: Search nodes table for files matching task keywords
+   - **HOW**: Reuse past HOW decisions or domain learnings
+   - **WHAT**: Infer from task description
+   - **RISK**: Auto-confirm if `.env.local` or `.env.development` exists
+   - **SCOPE**: Default to "tackle all at once" for medium complexity
+
+2. Each auto-resolution has a confidence score (0-1):
+   - Confidence >= 0.5: Accept and lock as decision
+   - Confidence < 0.5: Fall back to asking the user
+
+3. Present assumptions to user before proceeding:
+```
+Assuming (based on codebase patterns):
+WHERE: src/auth/login.ts, src/auth/session.ts (confidence: 0.8)
+HOW: Follow existing pattern: jwt-auth (confidence: 0.7)
+RISK: Confirmed — development environment detected (confidence: 0.7)
+
+Say 'no' to override any of these, or press Enter to continue.
+```
+
+4. Lock accepted assumptions as decisions in brain.db.

 </process>

CHANGED
|
@@ -21,6 +21,31 @@ Every step is skippable — trivial tasks burn 3K tokens, complex tasks burn 30K
|
|
|
21
21
|
|
|
22
22
|
<pipeline>
|
|
23
23
|
|
|
24
|
+
## STEP 0: PARSE FLAGS (0 LLM tokens — string matching)
|
|
25
|
+
|
|
26
|
+
Extract flags from `$ARGUMENTS` before processing. Flags start with `--` and are composable.
|
|
27
|
+
|
|
28
|
+
**Supported flags:**
|
|
29
|
+
- `--discuss` — Force discuss step (Step 3) even for trivial tasks
|
|
30
|
+
- `--research` — Force Scout agent to run (override skip-scout heuristics)
|
|
31
|
+
- `--verify` — Force full verification (Step 7) even for trivial tasks
|
|
32
|
+
- `--tdd` — Enable TDD mode: Builder writes failing test first, verification checks commit sequence
|
|
33
|
+
- `--no-plan` — Skip discuss (Step 3) and plan (Step 4), go straight to execute
|
|
34
|
+
- `--cheap` — Force ALL agents to use haiku (fastest, cheapest, ~80% cost reduction)
|
|
35
|
+
- `--quality` — Force builder/architect to sonnet, architect to opus for complex tasks
|
|
36
|
+
|
|
37
|
+
**Parse procedure:**
|
|
38
|
+
1. Extract all `--flag` tokens from the input
|
|
39
|
+
2. Remove them from the task description (remaining text = task)
|
|
40
|
+
3. Store flags as a set for downstream steps to check
|
|
41
|
+
|
|
42
|
+
Example: `/sf-do --tdd --research add user avatars`
|
|
43
|
+
→ flags: `{tdd, research}`, task: `add user avatars`
|
|
44
|
+
|
|
45
|
+
If no flags provided, all steps use their default heuristic-based behavior.
|
|
46
|
+
|
|
47
|
+
---
|
|
48
|
+
|
|
24
49
|
## STEP 1: ANALYZE (0 LLM tokens — rule-based)
|
|
25
50
|
|
|
26
51
|
Classify the user's input using these heuristics:
|
|
@@ -51,6 +76,44 @@ Classify the user's input using these heuristics:
|
|
|
51
76
|
|
|
52
77
|
---
|
|
53
78
|
|
|
79
|
+
## STEP 1.5: OPTIMIZE PIPELINE (0 tokens — brain.db queries only)
|
|
80
|
+
|
|
81
|
+
Call `applyGuardrails()` from `core/guardrails.cjs` to optimize the entire pipeline in one shot.
|
|
82
|
+
|
|
83
|
+
**Input**: Build a task object from Step 1's analysis:
|
|
84
|
+
```javascript
|
|
85
|
+
task = { intent, complexity, domain, affectedFiles, areas, input: taskDescription }
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
**What applyGuardrails() does** (already implemented):
|
|
89
|
+
1. **Skip logic** — Decides which agents to skip based on brain.db knowledge
|
|
90
|
+
2. **Learning acceleration** — If 3+ high-confidence learnings exist, skip scout+architect+critic
|
|
91
|
+
3. **Budget adjustment** — If budget low (<60%), downgrade models; if critical (<20%), builder-only + haiku
|
|
92
|
+
4. **Model selection** — Dynamic per-agent model based on task characteristics:
|
|
93
|
+
- Builder → haiku when domain has 2+ high-confidence learnings, or trivial single-file fix
|
|
94
|
+
- Architect → opus for complex multi-area tasks with no prior patterns
|
|
95
|
+
- Critic → sonnet for security/auth tasks
|
|
96
|
+
5. **Predictive context** — Pre-loads co-change file signatures into context
|
|
97
|
+
|
|
98
|
+
**Output**: `{ pipeline, models, outputLevel, predictedContext, acceleration, budgetNotes }`
|
|
99
|
+
|
|
100
|
+
**Report model plan to user** (for medium/complex tasks):
|
|
101
|
+
```
|
|
102
|
+
Models: Scout=haiku, Architect=sonnet, Builder=sonnet, Critic=haiku
|
|
103
|
+
Pipeline: scout → architect → builder → critic (acceleration: partial, 35% cheaper)
|
|
104
|
+
```
|
|
105
|
+
|
|
106
|
+
**Flag overrides for model selection:**
|
|
107
|
+
- If `--cheap` flag: Override ALL models to `haiku` regardless of guardrails output
|
|
108
|
+
- If `--quality` flag: Override `builder` to `sonnet`, `architect` to `opus` (for complex) or `sonnet` (for medium)
|
|
109
|
+
|
|
110
|
+
**Use the output for ALL downstream steps:**
|
|
111
|
+
- Steps 3-4: Use `pipeline` to decide which agents run (replaces scattered skip-if checks)
|
|
112
|
+
- Step 6: Use `models[agent]` when spawning each agent
|
|
113
|
+
- Step 9: Use `outputLevel` for report format
|
|
114
|
+
|
|
115
|
+
---
|
|
116
|
+
|
|
54
117
|
## STEP 2: CONTEXT GATHERING (0 tokens)
|
|
55
118
|
|
|
56
119
|
**FIX #5: Git diff awareness** — Run `git diff --name-only HEAD` to see what files changed since last commit. Pass this list to Scout so it focuses on recent changes instead of searching blindly.
|
|
@@ -63,7 +126,8 @@ If `.shipfast/brain.db` does not exist, tell user to run `shipfast init` first.
|
|
|
63
126
|
|
|
64
127
|
## STEP 3: DISCUSS (0-3K tokens) — Complex or ambiguous tasks only
|
|
65
128
|
|
|
66
|
-
**Skip if**: trivial tasks,
|
|
129
|
+
**Skip if**: `--no-plan` flag is set, OR (trivial tasks AND `--discuss` flag is NOT set), OR all ambiguity types already have locked decisions in brain.db.
|
|
130
|
+
**Force if**: `--discuss` flag is set, regardless of complexity.
|
|
67
131
|
|
|
68
132
|
**Detect ambiguity** (zero tokens — rule-based):
|
|
69
133
|
- **WHERE**: No file paths or component names mentioned
|
|
@@ -74,14 +138,15 @@ If `.shipfast/brain.db` does not exist, tell user to run `shipfast init` first.
|
|
|
74
138
|
|
|
75
139
|
**For each detected ambiguity**:
|
|
76
140
|
1. Check brain.db for existing locked decisions
|
|
77
|
-
2. If
|
|
78
|
-
3.
|
|
141
|
+
2. If `--discuss` flag is set explicitly, ask the user interactively
|
|
142
|
+
3. For medium tasks (auto-triggered discuss), use assumptions mode: auto-resolve using brain.db patterns, present assumptions, fall back to asking only if confidence < 0.5
|
|
143
|
+
4. Store answer as locked decision in brain.db (never asked again)
|
|
79
144
|
|
|
80
145
|
---
|
|
81
146
|
|
|
82
147
|
## STEP 4: PLAN (0-5K tokens) — Medium/complex only
|
|
83
148
|
|
|
84
|
-
**Skip if**: trivial tasks (go directly to Step 6)
|
|
149
|
+
**Skip if**: `--no-plan` flag is set (go directly to Step 6), OR trivial tasks (go directly to Step 6)
|
|
85
150
|
|
|
86
151
|
**Get plan template** based on intent:
|
|
87
152
|
- `fix` → locate, diagnose, fix, verify
|
|
@@ -89,7 +154,7 @@ If `.shipfast/brain.db` does not exist, tell user to run `shipfast init` first.
|
|
|
89
154
|
- `refactor` → identify, extract, update callers, verify
|
|
90
155
|
- etc. (14 templates pre-computed in core/templates.cjs)
|
|
91
156
|
|
|
92
|
-
**Skip Scout if
|
|
157
|
+
**Skip Scout if** (`--research` flag overrides — if set, Scout always runs):
|
|
93
158
|
- All affected files already indexed in brain.db AND
|
|
94
159
|
- We have high-confidence learnings for this domain AND
|
|
95
160
|
- Intent is `fix` with explicit file paths
|
|
@@ -99,9 +164,9 @@ If `.shipfast/brain.db` does not exist, tell user to run `shipfast init` first.
|
|
|
99
164
|
- Intent is fix/remove/docs/style
|
|
100
165
|
- Task description is under 15 words
|
|
101
166
|
|
|
102
|
-
**If Scout runs**: Launch Scout agent with brain context. Get compact findings (~3K tokens max).
|
|
167
|
+
**If Scout runs**: Launch Scout agent with brain context and `model: models.scout` from Step 1.5. Get compact findings (~3K tokens max).
|
|
103
168
|
|
|
104
|
-
**If Architect runs**: Launch Architect agent with Scout findings + template. Get task list (~3K tokens max).
|
|
169
|
+
**If Architect runs**: Launch Architect agent with Scout findings + template and `model: models.architect` from Step 1.5. Get task list (~3K tokens max).
|
|
105
170
|
- Architect uses goal-backward methodology: define "done" first, derive tasks from that
|
|
106
171
|
- Maximum 6 tasks. Each with specific file paths and verify steps.
|
|
107
172
|
- Flag scope creep and irreversible operations.
|
|
@@ -141,11 +206,12 @@ Execute inline. No planning, no Scout, no Architect, no Critic.
|
|
|
141
206
|
**Redirect**: if work exceeds 3 file edits or needs research → upgrade to medium workflow.
|
|
142
207
|
|
|
143
208
|
### Medium workflow (1 Builder agent):
|
|
144
|
-
Launch ONE Builder agent with ALL tasks batched:
|
|
209
|
+
Launch ONE Builder agent with ALL tasks batched and `model: models.builder` from Step 1.5:
|
|
145
210
|
- Agent gets: base prompt + brain context + all task descriptions
|
|
211
|
+
- If `--tdd` flag is set, prepend to Builder context: `MODE: TDD (red→green→refactor). Write failing test FIRST. See <tdd_mode> in builder prompt.`
|
|
146
212
|
- Agent executes tasks sequentially within its context
|
|
147
213
|
- One agent call instead of one per task = token savings
|
|
148
|
-
- If Critic is not skipped, launch Critic after Builder completes
|
|
214
|
+
- If Critic is not skipped, launch Critic with `model: models.critic` after Builder completes
|
|
149
215
|
|
|
150
216
|
### Complex workflow (per-task agents, fresh context each):
|
|
151
217
|
|
|
@@ -158,16 +224,18 @@ If tasks found in brain.db, execute them. If not, run inline planning first.
|
|
|
158
224
|
|
|
159
225
|
**Per-task execution (fresh context per task):**
|
|
160
226
|
For each pending task in brain.db:
|
|
161
|
-
1. Launch a SEPARATE sf-builder agent with ONLY that task + brain context
|
|
227
|
+
1. Launch a SEPARATE sf-builder agent with ONLY that task + brain context + `model: models.builder` from Step 1.5. If `--tdd` flag is set, prepend `MODE: TDD (red→green→refactor). Write failing test FIRST.` to the task context.
|
|
162
228
|
2. Builder gets fresh context — no accumulated garbage from previous tasks
|
|
163
229
|
3. Builder executes: read → grep consumers → implement → build → verify → commit
|
|
164
|
-
4. After Builder completes, update task status
|
|
230
|
+
4. After Builder completes, update task status and record model outcome:
|
|
165
231
|
```bash
|
|
166
232
|
sqlite3 .shipfast/brain.db "UPDATE tasks SET status='passed', commit_sha='[sha]' WHERE id='[id]';"
|
|
233
|
+
sqlite3 .shipfast/brain.db "INSERT INTO model_performance (agent, model, domain, task_id, outcome) VALUES ('builder', '[model used]', '[domain]', '[id]', 'success');"
|
|
167
234
|
```
|
|
168
235
|
5. If Builder fails after 3 attempts:
|
|
169
236
|
```bash
|
|
170
237
|
sqlite3 .shipfast/brain.db "UPDATE tasks SET status='failed', error='[error]' WHERE id='[id]';"
|
|
238
|
+
sqlite3 .shipfast/brain.db "INSERT INTO model_performance (agent, model, domain, task_id, outcome) VALUES ('builder', '[model used]', '[domain]', '[id]', 'failure');"
|
|
171
239
|
```
|
|
172
240
|
6. Continue to next task regardless
|
|
173
241
|
|
|
@@ -177,8 +245,8 @@ For each pending task in brain.db:
|
|
|
177
245
|
- Tasks touching same files → sequential (never parallel)
|
|
178
246
|
|
|
179
247
|
**After all tasks:**
|
|
180
|
-
- Launch Critic agent (fresh context) to review ALL changes: `git diff HEAD~N`
|
|
181
|
-
- Launch Scribe agent (fresh context) to record decisions + learnings to brain.db
|
|
248
|
+
- Launch Critic agent (fresh context) with `model: models.critic` to review ALL changes: `git diff HEAD~N`
|
|
249
|
+
- Launch Scribe agent (fresh context) with `model: models.scribe` to record decisions + learnings to brain.db
|
|
182
250
|
- Save session state for `/sf-resume`
|
|
183
251
|
|
|
184
252
|
**After execution, run `/sf-verify` for thorough verification.**
|
|
@@ -197,7 +265,8 @@ Send the issue back to Builder for fix (1 additional agent call, not a full re-r
|
|
|
197
265
|
|
|
198
266
|
## STEP 7: VERIFY (0-3K tokens)
|
|
199
267
|
|
|
200
|
-
**Skip if**: trivial tasks with passing build
|
|
268
|
+
**Skip if**: trivial tasks with passing build, UNLESS `--verify` flag is set
|
|
269
|
+
**Force if**: `--verify` flag is set, regardless of complexity
|
|
201
270
|
|
|
202
271
|
Run goal-backward verification:
|
|
203
272
|
1. Extract done-criteria from the original request + plan
|
|
@@ -250,7 +319,12 @@ If you encountered and fixed any errors, record the pattern:
|
|
|
250
319
|
sqlite3 .shipfast/brain.db "INSERT INTO learnings (pattern, problem, solution, domain, source, confidence) VALUES ('[short pattern name]', '[what went wrong]', '[what fixed it]', '[domain]', 'auto', 0.5);"
|
|
251
320
|
```
|
|
252
321
|
|
|
253
|
-
|
|
322
|
+
If any improvement ideas, future features, or tech debt were surfaced during this task (including OUT_OF_SCOPE items), record them as seeds:
|
|
323
|
+
```bash
|
|
324
|
+
sqlite3 .shipfast/brain.db "INSERT INTO seeds (idea, source_task, domain, priority) VALUES ('[idea]', '[current task]', '[domain]', 'someday');"
|
|
325
|
+
```
|
|
326
|
+
|
|
327
|
+
**These are not optional.** If decisions were made, errors were fixed, or ideas were surfaced, you MUST record them. This is how ShipFast gets smarter over time.
|
|
254
328
|
|
|
255
329
|
---
|
|
256
330
|
|
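The STEP 0 parse procedure in the command doc above describes behavior, not an implementation. It could be sketched as follows (a hypothetical helper in the package's CJS style; the real command prompt leaves parsing to the agent):

```javascript
// Sketch of STEP 0: pull --flag tokens out of the raw arguments; whatever
// remains is the task description. Hypothetical helper, not shipped code.
function parseFlags(args) {
  const flags = new Set();
  const taskWords = [];
  for (const token of args.trim().split(/\s+/)) {
    if (token.startsWith('--')) flags.add(token.slice(2));
    else taskWords.push(token);
  }
  return { flags, task: taskWords.join(' ') };
}

const { flags, task } = parseFlags('--tdd --research add user avatars');
console.log([...flags], task); // → [ 'tdd', 'research' ] add user avatars
```

This reproduces the doc's own example: flags `{tdd, research}` with task `add user avatars`, and a flag-free input passes through unchanged.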
package/commands/sf/milestone.md
CHANGED

@@ -11,7 +11,7 @@ allowed-tools:
 <objective>
 Manage project milestones. Complete the current milestone (archive phases, tag release)
 or start a new one (reset phases, increment version).
-All state tracked in brain.db
+All state tracked in brain.db.
 </objective>

 <process>
package/commands/sf/verify.md
CHANGED

@@ -71,6 +71,29 @@ Check each for:
 - debugger statements
 - Commented-out code blocks

+## Step 5.5: Schema drift detection
+
+Check if ORM model/schema files were changed without a corresponding migration:
+
+1. Get changed files: `git diff --name-only HEAD~5`
+2. Detect ORM type by file pattern:
+   - Prisma: `*.prisma` files
+   - Drizzle: files containing `pgTable`/`sqliteTable`/`mysqlTable`
+   - TypeORM: files containing `@Entity`/`@Column` decorators
+   - Django: `models.py` files
+   - Rails: `app/models/` files
+   - Knex: `models/*.ts` or `models/*.js`
+3. Check if migration files also changed in the same diff
+4. If model changed without migration → **DRIFT WARNING** (not FAIL)
+
+```
+Schema: [ORM type] model changed: [files]
+Migration: MISSING
+Suggest: Run [migration command] to generate migration
+```
+
+This check can be suppressed by setting `schema_drift_check = false` in brain.db config.
+
 ## Step 6: Build verification

 ```bash
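The drift heuristic in Step 5.5 reduces to: model-ish files changed, migration-ish files did not. A minimal sketch covering a few of the listed ORM patterns (hypothetical helper; file patterns here are a simplified subset of the doc's list):

```javascript
// Sketch of Step 5.5: flag schema/model changes that ship without any
// migration change. Only a subset of the ORM patterns above is covered.
function detectSchemaDrift(changedFiles) {
  const modelFiles = changedFiles.filter(f =>
    f.endsWith('.prisma') ||          // Prisma schema
    f.endsWith('models.py') ||        // Django
    f.startsWith('app/models/')       // Rails
  );
  const migrationFiles = changedFiles.filter(f => /migrations?\//.test(f));
  if (modelFiles.length > 0 && migrationFiles.length === 0) {
    return { drift: true, modelFiles };
  }
  return { drift: false, modelFiles };
}

console.log(detectSchemaDrift(['schema.prisma', 'src/api/users.ts']));
// → { drift: true, modelFiles: [ 'schema.prisma' ] }
```

As in the doc, this should surface as a WARNING rather than a FAIL, since content-based detection (Drizzle's `pgTable`, TypeORM decorators) and monorepo layouts make false negatives and positives likely.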
package/core/ambiguity.cjs
CHANGED
|
@@ -195,6 +195,111 @@ function buildDiscussionPrompt(input, ambiguities, brainContext) {
|
|
|
195
195
|
return parts.join('\n');
|
|
196
196
|
}
|
|
197
197
|
|
|
198
|
+
/**
|
|
199
|
+
* Auto-resolve ambiguities using codebase patterns from brain.db.
|
|
200
|
+
* Used by --assume flag to skip interactive questioning.
|
|
201
|
+
* Returns array of { type, decision, confidence, reasoning }.
|
|
202
|
+
* Falls back to asking the user if confidence < 0.5.
|
|
203
|
+
*/
|
|
204
|
+
function autoResolveAmbiguity(cwd, ambiguities, taskInput) {
|
|
205
|
+
const resolved = [];
|
|
206
|
+
|
|
207
|
+
for (const a of ambiguities) {
|
|
208
|
+
let decision = null;
|
|
209
|
+
let confidence = 0;
|
|
210
|
+
let reasoning = '';
|
|
211
|
+
|
|
212
|
+
switch (a.type) {
|
|
213
|
+
case 'WHERE': {
|
|
214
|
+
// Search brain.db nodes for files matching task keywords
|
|
215
|
+
const keywords = taskInput.split(/\s+/).filter(w => w.length > 3);
|
|
216
|
+
for (const kw of keywords) {
|
|
217
|
+
const matches = brain.query(cwd,
|
|
218
|
+
`SELECT file_path, name FROM nodes WHERE kind = 'file' AND (name LIKE '%${brain.esc(kw)}%' OR file_path LIKE '%${brain.esc(kw)}%') LIMIT 5`
|
|
219
|
+
);
|
|
220
|
+
if (matches.length > 0) {
|
|
221
|
+
decision = matches.map(m => m.file_path).join(', ');
|
|
222
|
+
confidence = matches.length === 1 ? 0.8 : 0.6;
|
|
223
|
+
reasoning = 'Matched ' + matches.length + ' file(s) by keyword "' + kw + '"';
|
|
224
|
+
break;
|
|
225
|
+
}
|
|
226
|
+
}
|
|
227
|
+
if (!decision) {
|
|
228
|
+
confidence = 0.2;
|
|
229
|
+
reasoning = 'No matching files found in brain.db';
|
+          }
+          break;
+        }
+
+        case 'HOW': {
+          // Reuse past HOW decisions in the same domain
+          const pastDecisions = brain.getDecisions(cwd);
+          const howDecision = pastDecisions.find(d => d.tags && d.tags.includes('HOW'));
+          if (howDecision) {
+            decision = howDecision.decision;
+            confidence = 0.7;
+            reasoning = 'Reusing previous HOW decision: ' + howDecision.question;
+          } else {
+            // Check learnings for the domain
+            const words = taskInput.toLowerCase().split(/\s+/);
+            const domains = ['auth', 'database', 'ui', 'api', 'frontend', 'backend', 'cache', 'search', 'payment'];
+            const domain = domains.find(d => words.includes(d));
+            if (domain) {
+              const learnings = brain.findLearnings(cwd, domain, 1);
+              if (learnings.length > 0) {
+                decision = 'Follow existing pattern: ' + learnings[0].pattern;
+                confidence = learnings[0].confidence;
+                reasoning = 'Based on learning with confidence ' + learnings[0].confidence;
+              }
+            }
+            if (!decision) { confidence = 0.3; reasoning = 'No prior decisions or learnings found'; }
+          }
+          break;
+        }
+
+        case 'WHAT': {
+          // Use task description as-is for short inputs
+          decision = 'Inferred from task description';
+          confidence = 0.6;
+          reasoning = 'Task description used as behavior spec';
+          break;
+        }
+
+        case 'RISK': {
+          // Auto-confirm in dev environment
+          const isDevEnv = require('fs').existsSync(require('path').join(cwd, '.env.local'))
+            || require('fs').existsSync(require('path').join(cwd, '.env.development'));
+          if (isDevEnv) {
+            decision = 'Confirmed — development environment detected';
+            confidence = 0.7;
+            reasoning = '.env.local or .env.development found';
+          } else {
+            confidence = 0.3;
+            reasoning = 'No dev environment indicators — needs user confirmation';
+          }
+          break;
+        }
+
+        case 'SCOPE': {
+          decision = 'Tackle all at once';
+          confidence = 0.5;
+          reasoning = 'Default: single pass unless complexity warrants phasing';
+          break;
+        }
+      }
+
+      resolved.push({
+        type: a.type,
+        question: a.question,
+        decision: decision || 'Could not auto-resolve',
+        confidence,
+        reasoning
+      });
+    }
+
+    return resolved;
+  }
+
 module.exports = {
   detectAmbiguity,
   ambiguityScore,
@@ -202,5 +307,6 @@ module.exports = {
   lockDecision,
   shouldDiscuss,
   buildDiscussionPrompt,
+  autoResolveAmbiguity,
   AMBIGUITY_RULES
 };
package/core/guardrails.cjs
CHANGED
@@ -255,9 +255,12 @@ function formatReport(results, outputLevel) {
  * Apply all guardrails to a pipeline.
  * Returns the optimized pipeline with all adjustments.
  */
-function applyGuardrails(cwd, sessionId, task, basePipeline) {
+/**
+ * @param {object} [flags] - Composable flags from parseFlags() (--cheap, --quality, etc.)
+ */
+function applyGuardrails(cwd, sessionId, task, basePipeline, flags = {}) {
   // 1. Skip logic (brain.db knowledge)
-  let pipeline = skipLogic.getAgentPipeline(cwd, task);
+  let pipeline = skipLogic.getAgentPipeline(cwd, task, flags);
 
   // 2. Learning acceleration
   const accel = accelerateFromLearnings(cwd, task, pipeline);
@@ -273,10 +276,25 @@ function applyGuardrails(cwd, sessionId, task, basePipeline) {
     models[agent] = budgetAdj.models[agent] || modelSelector.selectModel(cwd, agent, task);
   }
 
-  // 5.
+  // 5. Flag overrides (--cheap / --quality take precedence)
+  if (flags.cheap) {
+    for (const agent of pipeline) {
+      models[agent] = 'haiku';
+    }
+  } else if (flags.quality) {
+    for (const agent of pipeline) {
+      if (agent === 'architect') {
+        models[agent] = task.complexity === 'complex' ? 'opus' : 'sonnet';
+      } else if (agent === 'builder') {
+        models[agent] = 'sonnet';
+      }
+    }
+  }
+
+  // 6. Output level
   const outputLevel = getOutputLevel(task.complexity);
 
-  //
+  // 7. Predictive context
   const predictedContext = buildPredictiveContext(cwd, task);
 
   return {
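The new step 5 gives flags precedence over budget and learning adjustments: `--cheap` downgrades every agent to haiku, while `--quality` upgrades only architect and builder. A standalone sketch of that precedence (function name and inputs are illustrative, not from the package):

```javascript
// Apply --cheap / --quality overrides to an already-selected model map.
// Mirrors step 5 of applyGuardrails; agents not named by --quality keep
// their previously selected model.
function applyFlagOverrides(pipeline, models, flags, complexity) {
  const out = { ...models };
  if (flags.cheap) {
    for (const agent of pipeline) out[agent] = 'haiku';
  } else if (flags.quality) {
    for (const agent of pipeline) {
      if (agent === 'architect') out[agent] = complexity === 'complex' ? 'opus' : 'sonnet';
      else if (agent === 'builder') out[agent] = 'sonnet';
    }
  }
  return out;
}
```

Note that `--cheap` wins when both flags are set, since it is checked first.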
package/core/model-selector.cjs
CHANGED
@@ -41,8 +41,23 @@ function selectScoutModel(cwd, task) {
 }
 
 function selectArchitectModel(cwd, task) {
-  // Complex multi-area tasks
-
+  // Complex multi-area tasks with no prior patterns → Opus
+  // Opus costs 25x but is used rarely; pays for itself in fewer revision cycles
+  if (task.complexity === 'complex' && task.areas && task.areas.length > 3) {
+    if (task.domain) {
+      const learnings = brain.findLearnings(cwd, task.domain, 3);
+      const highConfidence = learnings.filter(l => l.confidence > 0.8 && l.solution);
+      if (highConfidence.length === 0) {
+        return 'opus'; // uncharted territory + complex = worth the cost
+      }
+    } else {
+      return 'opus'; // no domain = no learnings = needs best reasoning
+    }
+    return 'sonnet';
+  }
+
+  // Complex but fewer areas → Sonnet
+  if (task.complexity === 'complex') {
     return 'sonnet';
   }
 
@@ -55,6 +70,18 @@ function selectArchitectModel(cwd, task) {
 }
 
 function selectBuilderModel(cwd, task) {
+  // Check feedback loop: if haiku failed recently for this domain, upgrade
+  if (task.domain) {
+    const stats = getModelSuccessRate(cwd, 'builder', task.domain);
+    if (stats.haikuRate !== null && stats.haikuRate < 0.6) {
+      return 'sonnet'; // haiku struggling in this domain → upgrade
+    }
+    if (stats.sonnetRate !== null && stats.sonnetRate > 0.9 && stats.sonnetTotal >= 3) {
+      // Sonnet consistently succeeds here → try haiku next time to save cost
+      return 'haiku';
+    }
+  }
+
   // Key insight: if we've solved similar problems before, Haiku can replicate
   if (task.domain) {
     const learnings = brain.findLearnings(cwd, task.domain, 3);
@@ -78,6 +105,36 @@ function selectBuilderModel(cwd, task) {
   return 'sonnet';
 }
 
+/**
+ * Get model success rate for an agent+domain combo from the feedback table.
+ * Returns { haikuRate, sonnetRate, haikuTotal, sonnetTotal } (null if no data).
+ */
+function getModelSuccessRate(cwd, agent, domain) {
+  const rows = brain.query(cwd,
+    `SELECT model, outcome, COUNT(*) as c FROM model_performance
+     WHERE agent = '${brain.esc(agent)}' AND domain = '${brain.esc(domain)}'
+     GROUP BY model, outcome`
+  );
+
+  const stats = { haikuRate: null, sonnetRate: null, haikuTotal: 0, sonnetTotal: 0 };
+  const haikuSuccess = rows.find(r => r.model === 'haiku' && r.outcome === 'success');
+  const haikuFailure = rows.find(r => r.model === 'haiku' && r.outcome === 'failure');
+  const sonnetSuccess = rows.find(r => r.model === 'sonnet' && r.outcome === 'success');
+  const sonnetFailure = rows.find(r => r.model === 'sonnet' && r.outcome === 'failure');
+
+  const hS = haikuSuccess ? haikuSuccess.c : 0;
+  const hF = haikuFailure ? haikuFailure.c : 0;
+  const sS = sonnetSuccess ? sonnetSuccess.c : 0;
+  const sF = sonnetFailure ? sonnetFailure.c : 0;
+
+  stats.haikuTotal = hS + hF;
+  stats.sonnetTotal = sS + sF;
+  if (stats.haikuTotal > 0) stats.haikuRate = hS / stats.haikuTotal;
+  if (stats.sonnetTotal > 0) stats.sonnetRate = sS / stats.sonnetTotal;
+
+  return stats;
+}
+
 function selectCriticModel(cwd, task) {
   // Security-related reviews need better reasoning
   if (task.intent === 'security' || (task.areas && task.areas.includes('auth'))) {
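The aggregation inside `getModelSuccessRate` works over rows shaped like the `GROUP BY model, outcome` result (`{ model, outcome, c }`). A per-model version of that fold, standalone for illustration (the helper name is hypothetical):

```javascript
// Compute success rate for one model from grouped (model, outcome, count) rows,
// matching how getModelSuccessRate folds the SQL result. rate is null when
// there is no data, so callers can distinguish "no history" from "0% success".
function rateFromRows(rows, model) {
  const s = rows.find(r => r.model === model && r.outcome === 'success');
  const f = rows.find(r => r.model === model && r.outcome === 'failure');
  const total = (s ? s.c : 0) + (f ? f.c : 0);
  return { total, rate: total > 0 ? (s ? s.c : 0) / total : null };
}
```

With thresholds of 0.6 (downgrade trigger) and 0.9 with at least 3 samples (upgrade trigger), a 2-success / 3-failure haiku history in a domain is enough to force sonnet there.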
package/core/skip-logic.cjs
CHANGED
@@ -10,8 +10,12 @@ const brain = require('../brain/index.cjs');
 /**
  * Should we skip Scout (research agent)?
  * Skip if: all files are indexed AND we have relevant learnings
+ * @param {object} [flags] - Composable flags from /sf-do (--research, --discuss, etc.)
  */
-function shouldSkipScout(cwd, task) {
+function shouldSkipScout(cwd, task, flags = {}) {
+  // --research flag forces Scout to run
+  if (flags.research) return false;
+
   // Always need Scout for complex tasks
   if (task.complexity === 'complex') return false;
 
@@ -41,8 +45,12 @@ function shouldSkipScout(cwd, task) {
 /**
  * Should we skip Architect (planning agent)?
  * Skip if: single-file change OR known template with high confidence
+ * @param {object} [flags] - Composable flags from /sf-do
  */
-function shouldSkipArchitect(cwd, task) {
+function shouldSkipArchitect(cwd, task, flags = {}) {
+  // --no-plan flag skips Architect
+  if (flags.noPlan) return true;
+
   // Never skip for complex tasks
   if (task.complexity === 'complex') return false;
 
@@ -61,8 +69,11 @@ function shouldSkipArchitect(cwd, task) {
 /**
  * Should we skip Critic (review agent)?
  * Skip if: trivial change OR docs-only OR test-only
+ * @param {object} [flags] - Composable flags from /sf-do
  */
-function shouldSkipCritic(cwd, task) {
+function shouldSkipCritic(cwd, task, flags = {}) {
+  // --verify flag forces Critic to run
+  if (flags.verify) return false;
   // Always review complex tasks
   if (task.complexity === 'complex') return false;
 
@@ -89,25 +100,55 @@ function shouldSkipScribe(cwd, task) {
   return false;
 }
 
+/**
+ * Parse composable flags from user input.
+ * Returns { flags, task } where task is the input with flags stripped.
+ */
+function parseFlags(input) {
+  const flags = {};
+  const flagMap = {
+    '--discuss': 'discuss',
+    '--research': 'research',
+    '--verify': 'verify',
+    '--tdd': 'tdd',
+    '--no-plan': 'noPlan',
+    '--cheap': 'cheap',
+    '--quality': 'quality'
+  };
+
+  let task = input;
+  for (const [flag, key] of Object.entries(flagMap)) {
+    if (task.includes(flag)) {
+      flags[key] = true;
+      task = task.replace(flag, '').trim();
+    }
+  }
+
+  // Clean up extra whitespace
+  task = task.replace(/\s+/g, ' ').trim();
+  return { flags, task };
+}
+
 /**
  * Get the optimized agent pipeline for a task.
  * Returns only the agents that should run.
+ * @param {object} [flags] - Composable flags from parseFlags()
  */
-function getAgentPipeline(cwd, task) {
+function getAgentPipeline(cwd, task, flags = {}) {
   const pipeline = [];
 
-  if (!shouldSkipScout(cwd, task)) {
+  if (!shouldSkipScout(cwd, task, flags)) {
     pipeline.push('scout');
   }
 
-  if (!shouldSkipArchitect(cwd, task)) {
+  if (!shouldSkipArchitect(cwd, task, flags)) {
     pipeline.push('architect');
   }
 
   // Builder always runs
   pipeline.push('builder');
 
-  if (!shouldSkipCritic(cwd, task)) {
+  if (!shouldSkipCritic(cwd, task, flags)) {
     pipeline.push('critic');
   }
 
@@ -142,6 +183,7 @@ function estimateSavings(fullPipeline, optimizedPipeline) {
 }
 
 module.exports = {
+  parseFlags,
   shouldSkipScout,
   shouldSkipArchitect,
   shouldSkipCritic,
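The new `parseFlags` can be exercised standalone; this copy mirrors the function added in the diff above:

```javascript
// Parse composable flags from user input; returns { flags, task } with
// recognized flags stripped from the task text.
function parseFlags(input) {
  const flags = {};
  const flagMap = {
    '--discuss': 'discuss', '--research': 'research', '--verify': 'verify',
    '--tdd': 'tdd', '--no-plan': 'noPlan', '--cheap': 'cheap', '--quality': 'quality'
  };
  let task = input;
  for (const [flag, key] of Object.entries(flagMap)) {
    if (task.includes(flag)) {
      flags[key] = true;
      task = task.replace(flag, '').trim();
    }
  }
  // Collapse whitespace left behind by stripped flags
  task = task.replace(/\s+/g, ' ').trim();
  return { flags, task };
}
```

So an invocation like `/sf-do fix login bug --cheap --verify` yields a clean task string plus `{ cheap: true, verify: true }`, which skip logic and guardrails then consume.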
package/core/verify.cjs
CHANGED
@@ -400,9 +400,14 @@ function verifyWithAutoFix(cwd, criteria, executeFixFn) {
 /**
  * Verify TDD commit sequence: test(...) → feat(...) → optional refactor(...)
  */
+/**
+ * Verify TDD commit sequence: test(...) → feat(...) → optional refactor(...)
+ * Enhanced: also checks that test commits contain only test files and feat commits
+ * contain only implementation files.
+ */
 function verifyTddSequence(cwd, numCommits) {
   try {
-    const log =
+    const log = safeExec('git', ['log', '--oneline', '-' + (numCommits || 10)], {
       cwd, encoding: 'utf8'
     }).trim().split('\n');
 
@@ -418,7 +423,40 @@ function verifyTddSequence(cwd, numCommits) {
 
     // test commit should come BEFORE feat commit (higher index = older in git log)
     if (featCommit && testIdx < featIdx) {
-
+      // Verify test commit contains only test files
+      const violations = [];
+      const testSha = testCommit.split(' ')[0];
+      const featSha = featCommit.split(' ')[0];
+
+      try {
+        const testFiles = safeExec('git', ['diff-tree', '--no-commit-id', '--name-only', '-r', testSha], {
+          cwd, encoding: 'utf8'
+        }).trim().split('\n').filter(Boolean);
+
+        const nonTestFiles = testFiles.filter(f =>
+          !f.includes('test') && !f.includes('spec') && !f.includes('__tests__')
+        );
+        if (nonTestFiles.length > 0) {
+          violations.push('RED commit contains non-test files: ' + nonTestFiles.join(', '));
+        }
+
+        const featFiles = safeExec('git', ['diff-tree', '--no-commit-id', '--name-only', '-r', featSha], {
+          cwd, encoding: 'utf8'
+        }).trim().split('\n').filter(Boolean);
+
+        const testInFeat = featFiles.filter(f =>
+          f.includes('test') || f.includes('spec') || f.includes('__tests__')
+        );
+        if (testInFeat.length > 0) {
+          violations.push('GREEN commit contains test files: ' + testInFeat.join(', '));
+        }
+      } catch { /* git diff-tree may fail for initial commits */ }
+
+      if (violations.length > 0) {
+        return { passed: false, detail: 'TDD sequence valid but file separation violated:\n' + violations.join('\n') };
+      }
+
+      return { passed: true, detail: 'TDD sequence valid: test → feat (file separation OK)' };
     }
 
     if (!featCommit) {
@@ -431,9 +469,94 @@ function verifyTddSequence(cwd, numCommits) {
   }
 }
 
+// ============================================================
+// Schema Drift Detection
+// ============================================================
+
+/**
+ * ORM/schema file patterns and their corresponding migration directories.
+ * Detects when model files change without a corresponding migration.
+ */
+const SCHEMA_PATTERNS = [
+  // Prisma
+  { model: /\.prisma$/, migration: /prisma\/migrations\//, name: 'Prisma', migrateCmd: 'npx prisma migrate dev' },
+  // Drizzle
+  { model: /pgTable|sqliteTable|mysqlTable/, migration: /drizzle\/|migrations\/\d/, name: 'Drizzle', migrateCmd: 'npx drizzle-kit generate' },
+  // TypeORM
+  { model: /@Entity|@Column|@ManyToOne|@OneToMany/, migration: /migrations\/\d/, name: 'TypeORM', migrateCmd: 'npx typeorm migration:generate' },
+  // Django
+  { model: /models\.py$/, migration: /\/migrations\/\d/, name: 'Django', migrateCmd: 'python manage.py makemigrations' },
+  // Rails
+  { model: /app\/models\//, migration: /db\/migrate\//, name: 'Rails', migrateCmd: 'rails generate migration' },
+  // Knex
+  { model: /models\/.*\.(js|ts)$/, migration: /migrations\/\d/, name: 'Knex', migrateCmd: 'npx knex migrate:make' },
+];
+
+/**
+ * Detect schema drift: model/schema files changed without corresponding migrations.
+ * Returns { hasDrift, modelChanges, migrationChanges, ormType, migrateCmd }
+ */
+function detectSchemaDrift(cwd, numCommits) {
+  let changedFiles;
+  try {
+    changedFiles = safeExec('git', ['diff', '--name-only', 'HEAD~' + (numCommits || 5)], {
+      cwd, encoding: 'utf8'
+    }).trim().split('\n').filter(Boolean);
+  } catch {
+    return { hasDrift: false, detail: 'Could not read git diff' };
+  }
+
+  if (changedFiles.length === 0) {
+    return { hasDrift: false, detail: 'No changed files' };
+  }
+
+  // Check file contents for ORM patterns (for content-based detection like Drizzle/TypeORM)
+  function fileMatchesContentPattern(filePath, pattern) {
+    if (pattern.source.includes('/') || pattern.source.endsWith('$')) {
+      // Path-based pattern
+      return pattern.test(filePath);
+    }
+    // Content-based pattern — read the file
+    const fullPath = path.join(cwd, filePath);
+    if (!fs.existsSync(fullPath)) return false;
+    try {
+      const content = fs.readFileSync(fullPath, 'utf8').slice(0, 5000);
+      return pattern.test(content);
+    } catch { return false; }
+  }
+
+  for (const schema of SCHEMA_PATTERNS) {
+    const modelChanges = changedFiles.filter(f => {
+      if (schema.model.source.includes('/') || schema.model.source.endsWith('$')) {
+        return schema.model.test(f);
+      }
+      return fileMatchesContentPattern(f, schema.model);
+    });
+
+    if (modelChanges.length === 0) continue;
+
+    const migrationChanges = changedFiles.filter(f => schema.migration.test(f));
+
+    if (migrationChanges.length === 0) {
+      return {
+        hasDrift: true,
+        ormType: schema.name,
+        modelChanges,
+        migrationChanges: [],
+        migrateCmd: schema.migrateCmd,
+        detail: schema.name + ' model files changed without migration: ' + modelChanges.join(', ')
+          + '. Run: ' + schema.migrateCmd
+      };
+    }
+  }
+
+  return { hasDrift: false, detail: 'No schema drift detected' };
+}
+
 module.exports = {
   extractDoneCriteria, runVerification, scoreResults, recordVerification, formatResults,
   verifyBuild, verifyNoStubs, verifyNoStubsDeep, detectBuildCommand,
   verifyArtifact3Level, verifyDataFlow,
-  generateFixTasks, verifyWithAutoFix, verifyTddSequence
+  generateFixTasks, verifyWithAutoFix, verifyTddSequence,
+  detectSchemaDrift
};
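The drift check is a two-regex rule per ORM: any changed file matching `model` with no changed file matching `migration` means drift. A standalone sketch for the Prisma entry (path-based, so no file contents need reading):

```javascript
// Path-based drift check for one ORM, mirroring the Prisma entry in
// SCHEMA_PATTERNS: schema files changed, but no migration file changed.
function hasPrismaDrift(changedFiles) {
  const model = /\.prisma$/;
  const migration = /prisma\/migrations\//;
  const modelChanges = changedFiles.filter(f => model.test(f));
  if (modelChanges.length === 0) return false;      // schema untouched, no drift
  return !changedFiles.some(f => migration.test(f)); // schema changed, migration missing?
}
```

Content-based entries like Drizzle and TypeORM instead read the first 5000 bytes of each changed file and test the pattern against the contents.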
package/mcp/server.cjs
CHANGED
@@ -103,7 +103,8 @@ const TOOLS = {
         "UNION ALL SELECT 'learnings', COUNT(*) FROM learnings " +
         "UNION ALL SELECT 'tasks', COUNT(*) FROM tasks " +
         "UNION ALL SELECT 'checkpoints', COUNT(*) FROM checkpoints " +
-        "UNION ALL SELECT 'hot_files', COUNT(*) FROM hot_files"
+        "UNION ALL SELECT 'hot_files', COUNT(*) FROM hot_files " +
+        "UNION ALL SELECT 'seeds', COUNT(*) FROM seeds WHERE status = 'open'"
       );
       const stats = {};
       rows.forEach(r => stats[r.metric] = r.count);
@@ -245,6 +246,46 @@ const TOOLS = {
     }
   },
 
+  brain_seeds: {
+    description: 'List, add, promote, or dismiss forward ideas (seeds). Seeds capture improvement ideas surfaced during work for future milestones.',
+    inputSchema: {
+      type: 'object',
+      properties: {
+        action: { type: 'string', description: 'list, add, promote, or dismiss', enum: ['list', 'add', 'promote', 'dismiss'] },
+        idea: { type: 'string', description: 'The idea text (required for add)' },
+        source_task: { type: 'string', description: 'Which task surfaced this idea (optional)' },
+        domain: { type: 'string', description: 'Domain: frontend, backend, database, auth, etc. (optional)' },
+        priority: { type: 'string', description: 'someday, next, or urgent (optional, default: someday)', enum: ['someday', 'next', 'urgent'] },
+        seed_id: { type: 'number', description: 'Seed ID (required for promote/dismiss)' },
+        task_id: { type: 'string', description: 'Task ID to promote seed to (required for promote)' }
+      },
+      required: ['action']
+    },
+    handler({ action, idea, source_task, domain, priority, seed_id, task_id }) {
+      if (action === 'add') {
+        if (!idea) return { error: 'idea is required' };
+        const ok = run(
+          `INSERT INTO seeds (idea, source_task, domain, priority) ` +
+          `VALUES ('${esc(idea)}', '${esc(source_task || '')}', '${esc(domain || '')}', '${esc(priority || 'someday')}')`
+        );
+        return ok ? { status: 'recorded', idea, domain, priority: priority || 'someday' } : { error: 'failed to insert' };
+      }
+      if (action === 'promote') {
+        if (!seed_id || !task_id) return { error: 'seed_id and task_id are required' };
+        const ok = run(`UPDATE seeds SET status = 'promoted', promoted_to = '${esc(task_id)}' WHERE id = ${parseInt(seed_id)}`);
+        return ok ? { status: 'promoted', seed_id, task_id } : { error: 'failed to update' };
+      }
+      if (action === 'dismiss') {
+        if (!seed_id) return { error: 'seed_id is required' };
+        const ok = run(`UPDATE seeds SET status = 'dismissed' WHERE id = ${parseInt(seed_id)}`);
+        return ok ? { status: 'dismissed', seed_id } : { error: 'failed to update' };
+      }
+      // list
+      const filter = domain ? `AND domain = '${esc(domain)}'` : '';
+      return query(`SELECT id, idea, source_task, domain, priority, status, created_at FROM seeds WHERE status = 'open' ${filter} ORDER BY CASE priority WHEN 'urgent' THEN 0 WHEN 'next' THEN 1 ELSE 2 END, created_at DESC LIMIT 30`);
+    }
+  },
+
   // Feature #6: Graph traversal tools
 
   brain_graph_traverse: {
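The `list` action ranks open seeds urgent → next → someday via an `ORDER BY CASE` expression. The same ranking in plain JS, for illustration (the secondary `created_at DESC` ordering within each priority is omitted here):

```javascript
// Rank seeds the way the list query does: urgent first, then next,
// then everything else (someday or missing priority).
const PRIORITY_RANK = { urgent: 0, next: 1 };

function sortSeeds(seeds) {
  return [...seeds].sort((a, b) =>
    (PRIORITY_RANK[a.priority] ?? 2) - (PRIORITY_RANK[b.priority] ?? 2));
}
```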
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@shipfast-ai/shipfast",
-  "version": "1.0.2",
+  "version": "1.1.0",
   "description": "Autonomous context-engineered development system with SQLite brain. 5 agents, 14 commands, per-task fresh context, 70-90% fewer tokens.",
   "bin": {
     "shipfast": "bin/install.js"