specpipe 1.0.0

Files changed (60)
  1. package/README.md +1319 -0
  2. package/bin/devkit.js +3 -0
  3. package/package.json +61 -0
  4. package/src/cli.js +76 -0
  5. package/src/commands/check.js +33 -0
  6. package/src/commands/diff.js +84 -0
  7. package/src/commands/init-adopt.js +54 -0
  8. package/src/commands/init-agents.js +118 -0
  9. package/src/commands/init-global.js +102 -0
  10. package/src/commands/init.js +311 -0
  11. package/src/commands/list.js +54 -0
  12. package/src/commands/remove.js +133 -0
  13. package/src/commands/upgrade.js +215 -0
  14. package/src/lib/agent-guards.js +100 -0
  15. package/src/lib/agent-install.js +161 -0
  16. package/src/lib/agents.js +280 -0
  17. package/src/lib/claude-global.js +183 -0
  18. package/src/lib/detector.js +93 -0
  19. package/src/lib/hasher.js +21 -0
  20. package/src/lib/installer.js +213 -0
  21. package/src/lib/logger.js +16 -0
  22. package/src/lib/manifest.js +102 -0
  23. package/src/lib/reconcile.js +56 -0
  24. package/templates/.claude/CLAUDE.md +79 -0
  25. package/templates/.claude/hooks/comment-guard.js +126 -0
  26. package/templates/.claude/hooks/file-guard.js +216 -0
  27. package/templates/.claude/hooks/glob-guard.js +104 -0
  28. package/templates/.claude/hooks/path-guard.sh +118 -0
  29. package/templates/.claude/hooks/self-review.sh +27 -0
  30. package/templates/.claude/hooks/sensitive-guard.sh +227 -0
  31. package/templates/.claude/settings.json +68 -0
  32. package/templates/docs/WORKFLOW.md +325 -0
  33. package/templates/docs/specs/.gitkeep +0 -0
  34. package/templates/hooks/specpipe-read-guard.sh +42 -0
  35. package/templates/hooks/specpipe-shell-guard.sh +65 -0
  36. package/templates/rules/specpipe-guards.md +40 -0
  37. package/templates/scripts/test-hooks.sh +66 -0
  38. package/templates/skills/sp-build/SKILL.md +776 -0
  39. package/templates/skills/sp-challenge/SKILL.md +255 -0
  40. package/templates/skills/sp-commit/SKILL.md +174 -0
  41. package/templates/skills/sp-explore/SKILL.md +730 -0
  42. package/templates/skills/sp-fix/SKILL.md +266 -0
  43. package/templates/skills/sp-humanize/SKILL.md +212 -0
  44. package/templates/skills/sp-investigate/SKILL.md +648 -0
  45. package/templates/skills/sp-md-render/SKILL.md +200 -0
  46. package/templates/skills/sp-md-render/components.md +415 -0
  47. package/templates/skills/sp-md-render/template.html +283 -0
  48. package/templates/skills/sp-plan/SKILL.md +947 -0
  49. package/templates/skills/sp-review/SKILL.md +268 -0
  50. package/templates/skills/sp-scaffold/SKILL.md +237 -0
  51. package/templates/skills/sp-scaffold/references/ARCHITECTURE.md.tmpl +228 -0
  52. package/templates/skills/sp-scaffold/references/DESIGN.md.tmpl +113 -0
  53. package/templates/skills/sp-scaffold/references/adr/NNNN-template.md +92 -0
  54. package/templates/skills/sp-scaffold/references/stack-profiles/react.md +36 -0
  55. package/templates/skills/sp-spec-render/SKILL.md +254 -0
  56. package/templates/skills/sp-spec-render/components.md +418 -0
  57. package/templates/skills/sp-spec-render/examples/user-auth.html +749 -0
  58. package/templates/skills/sp-spec-render/examples/user-auth.md +114 -0
  59. package/templates/skills/sp-spec-render/template.html +222 -0
  60. package/templates/skills/sp-voices/SKILL.md +1184 -0
@@ -0,0 +1,268 @@
1
+ ---
2
+ description: |
3
+ Pre-merge code review — security, correctness, spec alignment. Reviews diff
4
+ against the base branch with smart focus by blast radius.
5
+ Use when asked to "review this PR", "review code", "review trước khi merge",
6
+ "kiểm tra code", "check my diff", "pre-merge review", or "review my changes".
7
+ Proactively suggest before /sp-commit or when the user is about to merge,
8
+ especially after /sp-build produces a non-trivial diff.
9
+ Catches: SQL safety issues, security gaps, spec drift, regressions in
10
+ modified-not-added lines, and changes to sensitive layers (auth, payment, core).
11
+ allowed-tools: Read, Bash, Glob, Grep, AskUserQuestion, mcp__graphatlas__*
12
+ ---
13
+ Pre-merge code review — security, correctness, spec alignment.
14
+
15
+ ## Phase 0a — Graphatlas probe (run once, silently)
16
+
17
+ Before Phase 0:
18
+
19
+ 1. Call `mcp__graphatlas__ga_architecture` with `max_modules: 1`.
20
+ 2. Interpret:
21
+ - Returns `modules` → **GA available.** Use `ga_impact`, `ga_risk`, `ga_architecture` for blast-radius and layer checks below. Manual grep is fallback.
22
+ - Error `STALE_INDEX` → call `mcp__graphatlas__ga_reindex` (mode `"full"`), retry once, then treat as available. (Reviewing the diff against a stale index gives wrong impact — reindex matters here.)
23
+ - Tool not found / connection error / any other failure → **GA unavailable.** Skip `ga_*` steps and review the diff manually. Do not re-probe.
24
+ 3. Carry the outcome through Phases 0–4.
25
+
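The probe outcomes above can be read as a small decision function. This is a minimal sketch, not the kit's implementation: `callTool` is a hypothetical stand-in for however the agent invokes an MCP tool, and the `err.code === "STALE_INDEX"` error shape is an assumption rather than a documented Graphatlas response format. Treating a failed retry as unavailable is also an assumption, since the skill text only describes the success case.

```javascript
// Hypothetical sketch of the Phase 0a probe. `callTool(name, args)` is an
// assumed helper that returns the tool result or throws on failure.
function probeGraphatlas(callTool) {
  const probe = () =>
    callTool("mcp__graphatlas__ga_architecture", { max_modules: 1 });
  const classify = (result) =>
    result && result.modules ? "available" : "unavailable";

  try {
    return classify(probe());
  } catch (err) {
    // Assumed error shape: a `code` property carrying "STALE_INDEX".
    if (err && err.code === "STALE_INDEX") {
      try {
        callTool("mcp__graphatlas__ga_reindex", { mode: "full" });
        return classify(probe()); // retry exactly once
      } catch (retryErr) {
        return "unavailable"; // assumption: a failed retry ends the probe
      }
    }
    // Tool not found, connection error, or anything else: do not re-probe.
    return "unavailable";
  }
}
```

The function returns a plain string so the outcome can be carried through the later phases without probing again.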
26
+ ---
27
+
28
+ ## Phase 0: Understand Intent
29
+
30
+ 1. Read commit messages:
31
+ ```
32
+ BASE=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's|refs/remotes/origin/||'); BASE="${BASE:-main}"
33
+ git log --oneline "$BASE"..HEAD
34
+ ```
35
+ 2. Check for spec in `docs/specs/<feature>/<feature>.md` — review against INTENT.
36
+ 3. Read the diff: `git diff "$BASE"...HEAD`
37
+ 4. **Expand blast radius.** **If GA available (per Phase 0a):** run `ga_impact(diff=<full diff>)` (or `changed_files=[...]`) to get impacted files, affected tests, affected routes/configs, and a 4-dim risk score in one call — this is the flagship review tool, prefer it over any chain of grep + manual reading. Cross-check with `ga_architecture` for module/layer membership (auth, payment, core) and `ga_risk(changed_files=[...])` for a refactor-safety gate. **If GA unavailable:** grep for each changed function/type name across the rest of the tree to find affected files; identify sensitive paths (`auth/`, `payment/`, `core/`) by directory.
38
+ 5. **What already exists:** List any code/flows that already partially solve the problem in this diff. Flag if the diff rebuilds something that already exists.
39
+
40
+ If `$ARGUMENTS` provided → scope to those files only.
41
+ If diff > 500 lines → review file-by-file, prioritize by smart focus below.
42
+
43
+ ---
44
+
45
+ ## Phase 1: Smart Focus
46
+
47
+ Auto-detect primary focus from diff content:
48
+
49
+ | Diff contains | Focus heavily on |
50
+ |--------------|-----------------|
51
+ | auth, login, token, session, password, JWT | Security — full depth |
52
+ | SQL, query, database, migration | Injection + data integrity |
53
+ | API, endpoint, route, controller, handler | Input validation + error handling |
54
+ | .env, config, secret, key, credential | Secret exposure |
55
+ | Test files only | Test quality (skip security deep-dive) |
56
+ | Docs/comments only | Accuracy only (minimal review) |
57
+ | Payment, billing, transaction | Correctness + idempotency |
58
+
59
+ Spend 60% of analysis on the primary focus. Cover all categories, but proportionally.
60
+
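The keyword rows of the focus table can be sketched as a simple matcher. This is an illustration under stated assumptions: `FOCUS_RULES` and `detectFocus` are hypothetical names, picking the rule with the most keyword hits is an assumed tie-break (the table does not say how to choose when several rows match), and the "test files only" and "docs/comments only" rows are left out because they depend on file paths rather than diff keywords.

```javascript
// Hypothetical keyword rules mirroring the table's keyword rows.
const FOCUS_RULES = [
  { focus: "Security", pattern: /auth|login|token|session|password|jwt/gi },
  { focus: "Injection + data integrity", pattern: /sql|query|database|migration/gi },
  { focus: "Input validation + error handling", pattern: /api|endpoint|route|controller|handler/gi },
  { focus: "Secret exposure", pattern: /\.env|config|secret|key|credential/gi },
  { focus: "Correctness + idempotency", pattern: /payment|billing|transaction/gi },
];

// Returns the focus whose keywords occur most often in the diff text.
// On a tie the earlier rule wins; with no hits the result is "General".
function detectFocus(diffText) {
  let best = null;
  for (const rule of FOCUS_RULES) {
    const hits = (diffText.match(rule.pattern) || []).length;
    if (hits > 0 && (best === null || hits > best.hits)) {
      best = { focus: rule.focus, hits };
    }
  }
  return best ? best.focus : "General";
}
```

Substring matching is deliberately crude (for example, `key` also matches `keyword`), so the result is a starting hint for the reviewer, not a verdict.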
61
+ ---
62
+
63
+ ## Phase 2: Checklist
64
+
65
+ ### Security (Critical)
66
+ - **Injection:** Search diff for string concatenation in SQL/shell/HTML. Look for `${var}` in queries, `.innerHTML`, template literals in SQL. Flag any user input reaching a query without parameterization.
67
+ - **Auth/Authz:** New endpoint → has auth middleware? Can user A access user B's data? ID in URL without ownership check?
68
+ - **Secrets:** Hardcoded strings matching `sk-`, `ghp_`, `Bearer `, long base64. New env vars committed?
69
+ - **Error exposure:** Catch blocks sending raw errors to users? Stack traces, file paths, DB schemas in responses?
70
+ - **Dependencies:** New packages — maintained? >1000 weekly downloads? Known CVEs?
71
+
72
+ ### Correctness (High)
73
+ - **Logic vs intent:** Does the code do what commits/spec claim? "Add validation" but code just logs?
74
+ - **Edge cases:** null, empty, 0, negative, MAX_INT, unicode, very long strings — handled?
75
+ - **Error handling:** For each try/catch — error logged with context? User shown safe message? Resources cleaned in finally?
76
+ - **Concurrency:** Shared state without locks? Read-then-write without atomicity? Non-atomic DB updates?
77
+ - **Null safety:** Optionals used without guards? `object!.property` without nil check?
78
+
79
+ ### API/Backend (High)
80
+
81
+ - **Unvalidated input** — request body/params used without schema validation
82
+ - **Missing rate limiting** — public endpoints without throttling
83
+ - **Missing timeouts** — external HTTP calls without timeout configuration
84
+ - **Missing CORS configuration** — APIs accessible from unintended origins
85
+ - **Error message leakage** — stack traces, file paths, DB schemas in responses
86
+
87
+ ### Spec-Test Alignment (Medium)
88
+ - Source changed but no spec update in `docs/specs/<feature>/`? → flag
89
+ - Source changed but no test update? → flag
90
+ - Spec changed but acceptance scenarios or tests not updated? → flag
91
+ - Code removed but dead tests remain? → flag
92
+ - Spec contains vague requirements without metrics ("fast", "secure", "easy", "scalable")? → flag with suggestion to add SC-NNN with concrete numbers
93
+ - **AS-to-test name check:** Read the spec's `## Stories` section. For each AS-NNN, check if a test file contains a test named or described with that AS ID or its short description. Flag:
94
+ - AS in spec with no matching test → "AS-NNN: \<description\> has no corresponding test"
95
+ - Test referencing an AS-NNN that no longer exists in the spec → "Test references removed AS-NNN"
96
+ Keep this lightweight — match on AS-NNN identifiers and story name substrings, not semantic analysis.
97
+
98
+ ### Code Quality (Medium)
99
+ - Dead code: removed functions still imported elsewhere?
100
+ - Obvious duplication: copy-pasted blocks that should be shared?
101
+ - Naming: consistent with codebase? Descriptive?
102
+ - Complexity: functions > 40 lines or > 3 nesting levels?
103
+ - **Diagram maintenance:** Diff touches code with ASCII diagrams in nearby comments? Check if those diagrams are still accurate. Stale diagrams are worse than no diagrams — they actively mislead. Flag even if outside immediate scope.
104
+
105
+ ### Performance (Low)
106
+ - Flag N+1 queries, unbounded collections, redundant computation in loops.
107
+
108
+ ### When Reviewing AI-Generated Code
109
+
110
+ Prioritize these concerns above standard checklist:
111
+ - **Behavioral regressions** — does changed code break edge cases the AI didn't consider?
112
+ - **Trust boundaries** — does the AI code implicitly trust external input it shouldn't?
113
+ - **Architecture drift** — does it introduce hidden coupling or deviate from existing patterns?
114
+ - **Model cost escalation** — flag workflows that escalate to higher-cost models without clear reasoning; recommend lower-cost tiers for deterministic operations.
115
+
116
+ ### Failure Mode Grid
117
+ For each new codepath in the diff, evaluate 3 dimensions:
118
+
119
+ | Codepath | Test covers it? | Error handling? | User sees clear error? |
120
+ |----------|----------------|-----------------|----------------------|
121
+ | (path) | ✓/✗ | ✓/✗ | clear / silent |
122
+
123
+ **Critical gap** = no test, no error handling, and a silent failure (✗, ✗, silent) → flag as High severity, non-optional.
124
+
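The grid check can be expressed as a filter over one row per new codepath. This is a minimal sketch: the row field names (`tested`, `handlesErrors`, `clearUserError`) and the function name are hypothetical, and a "silent" user-facing error is modelled as `clearUserError: false`.

```javascript
// Each row describes one new codepath from the diff (hypothetical shape).
// A critical gap is a row that fails all three dimensions at once.
function findCriticalGaps(rows) {
  return rows
    .filter((r) => !r.tested && !r.handlesErrors && !r.clearUserError)
    .map((r) => ({ codepath: r.codepath, severity: "High", optional: false }));
}
```

A row that passes even one dimension is not a critical gap, although it may still produce an ordinary finding from the checklist.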
125
+ ---
126
+
127
+ ## Confidence Calibration
128
+
129
+ Every finding MUST include a confidence score:
130
+
131
+ | Score | Meaning | Display rule |
132
+ |-------|---------|-------------|
133
+ | 9–10 | Verified by reading code directly. Concrete bug demonstrated. | Show normally |
134
+ | 7–8 | High-confidence pattern match. Very likely correct. | Show normally |
135
+ | 5–6 | Possible false positive. | Show with caveat: "verify this" |
136
+ | 3–4 | Low confidence. | Appendix only |
137
+ | 1–2 | Speculation. | Only report if severity Critical |
138
+
139
+ **Finding format:** `**[C-1] (confidence: 9/10) file.ts:42 — description**`
140
+
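The display rules in the calibration table map onto a short lookup. This sketch assumes integer scores from 1 to 10; the return labels (`"normal"`, `"caveat"`, `"appendix"`, `"speculative"`, `"omit"`) are hypothetical names chosen for illustration, and how a reported 1–2 finding is actually displayed is not specified in the table.

```javascript
// Maps a confidence score and severity to a display decision.
function displayRule(confidence, severity) {
  if (confidence >= 7) return "normal";   // 7-10: show normally
  if (confidence >= 5) return "caveat";   // 5-6: show with "verify this"
  if (confidence >= 3) return "appendix"; // 3-4: appendix only
  // 1-2: speculation, reported only when the severity is Critical.
  return severity === "Critical" ? "speculative" : "omit";
}
```

Keeping the thresholds in one function makes it harder for the TL;DR and the per-finding loop to disagree about which findings are visible.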
141
+ ---
142
+
143
+ ## Phase 3: TL;DR Output
144
+
145
+ Print ONLY this block to terminal — concise, no full finding bodies yet. Keep all finding detail internal for Phase 5.
146
+
147
+ ```
148
+ ## Code Review: <branch>
149
+ Scope: X files, +Y/-Z lines | Focus: <detected> | Verdict: APPROVE | REQUEST CHANGES | NEEDS DISCUSSION
150
+ Counts: N Critical · N High · N Medium · N Low (total: N)
151
+
152
+ Top blockers (Critical + High only, one-liner each — cap 5):
153
+ - [C-1] file.ts:42 — SQL injection (conf 9/10)
154
+ - [H-1] api.ts:15 — empty catch swallows DB errors (conf 8/10)
155
+
156
+ Positive: <1 line — reinforce one good pattern from the diff>
157
+ Not in scope: <1 line, or "None identified.">
158
+ ```
159
+
160
+ If total findings = 0 → print TL;DR with "No findings." and STOP. Skip Phases 4–6.
161
+
162
+ After printing TL;DR, append one line:
163
+ > 💡 Want a second opinion? Run `/sp-voices` on this diff for a multi-LLM cross-check before triaging — especially useful for security/payment changes or when most findings sit at confidence 5–7.
164
+
165
+ ---
166
+
167
+ ## Phase 4: Bulk triage
168
+
169
+ Use `AskUserQuestion`. Recommendation logic for the question text:
170
+ - Any Critical or High present → recommend **A (Review each)**
171
+ - Only Medium/Low, majority confidence ≥7 → recommend **B (Accept all)**
172
+ - Majority confidence ≤6 → recommend **C (Reject all)**
173
+
174
+ Append `(Recommended)` to the matching option.
175
+
176
+ ```json
177
+ {
178
+ "questions": [
179
+ {
180
+ "question": "<N> findings (<C>C / <H>H / <M>M / <L>L). How to triage? RECOMMENDATION: Choose <X> — <one-line reason based on severity/confidence mix>.",
181
+ "header": "Triage Mode",
182
+ "multiSelect": false,
183
+ "options": [
184
+ {"label": "A) Review each — walk through finding by finding with full details"},
185
+ {"label": "B) Accept all — add every finding to action list, skip per-item review"},
186
+ {"label": "C) Reject all — dismiss all findings, verdict stands, no action list"},
187
+ {"label": "D) Exit — keep the TL;DR above, stop here"}
188
+ ]
189
+ }
190
+ ]
191
+ }
192
+ ```
193
+
194
+ Routing: A → Phase 5. B → mark all Accepted, jump to Phase 6. C → mark all Rejected, jump to Phase 6. D → stop.
195
+
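The recommendation logic for the triage question can be sketched as follows. The function names and the finding shape (`severity`, `confidence`) are hypothetical. "Majority" is read here as a strict majority of findings at confidence 7 or higher; a tie falls through to C, which is an assumption the skill text does not settle.

```javascript
// Picks the triage option to recommend for a non-empty list of findings.
function recommendTriage(findings) {
  const hasBlocker = findings.some(
    (f) => f.severity === "Critical" || f.severity === "High"
  );
  if (hasBlocker) return "A"; // review each
  const confident = findings.filter((f) => f.confidence >= 7).length;
  return confident * 2 > findings.length ? "B" : "C"; // accept all / reject all
}

// Appends "(Recommended)" to the option whose label starts with the letter.
function withRecommended(options, letter) {
  return options.map((o) =>
    o.label.startsWith(letter + ")")
      ? { ...o, label: o.label + " (Recommended)" }
      : o
  );
}
```

For example, `withRecommended(options, recommendTriage(findings))` yields the `options` array to place in the `AskUserQuestion` payload.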
196
+ ---
197
+
198
+ ## Phase 5: Per-finding loop (only if A chosen)
199
+
200
+ Iterate findings in order: Critical → High → Medium → Low. For EACH, print the full detail block:
201
+
202
+ ```
203
+ [<ID>] <severity> | confidence: <N>/10 | <file:line>
204
+ Title: <title>
205
+ Description: <what's wrong — concrete>
206
+ Evidence: <code snippet or direct quote from diff>
207
+ Failure scenario: <step-by-step how this hits production>
208
+ Suggested fix: <specific, actionable>
209
+ ```
210
+
211
+ Then ask. Append `(Recommended)` to the matching option:
212
+ - **Accept** if: severity ≥ High AND confidence ≥ 7
213
+ - **Reject** if: confidence ≤ 6
214
+ - **Defer** if: severity Medium/Low AND confidence ≥ 7
215
+
216
+ ```json
217
+ {
218
+ "questions": [
219
+ {
220
+ "question": "Finding [<ID>]: <title>\n<1-line flaw summary>\nRECOMMENDATION: Choose <X> — <rationale: severity × confidence>.",
221
+ "header": "Finding <ID>",
222
+ "multiSelect": false,
223
+ "options": [
224
+ {"label": "A) Accept — add to action list"},
225
+ {"label": "B) Reject — false positive, dismiss"},
226
+ {"label": "C) Defer — note in PR description, don't fix now"}
227
+ ]
228
+ }
229
+ ]
230
+ }
231
+ ```
232
+
233
+ *(Append `(Recommended)` to whichever option label matches the rule above.)*
234
+
235
+ Escape hatch: if user hits Reject 3 times in a row on High/Critical items, ask once: "Skip remaining per-finding prompts? A) Continue B) Reject all remaining C) Accept all remaining" — avoids fatigue on noisy reviews.
236
+
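The per-finding recommendation and the escape hatch can be sketched together. Names and data shapes are hypothetical. The sketch assumes integer confidence scores, so the three rules cover every case, and it leaves the "ask once" bookkeeping to the caller.

```javascript
// Recommends an action for one finding from severity and confidence.
function recommendAction(finding) {
  if (finding.confidence <= 6) return "Reject";
  const blocking =
    finding.severity === "Critical" || finding.severity === "High";
  return blocking ? "Accept" : "Defer";
}

// True when the last three decisions were all Reject on High/Critical items.
// `history` is an ordered list of { severity, decision } (assumed shape).
function shouldOfferEscape(history) {
  const lastThree = history.slice(-3);
  return (
    lastThree.length === 3 &&
    lastThree.every(
      (h) =>
        h.decision === "Reject" &&
        (h.severity === "High" || h.severity === "Critical")
    )
  );
}
```

The caller would check `shouldOfferEscape` after each answer and remember that the prompt was already shown, so the user is asked at most once.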
237
+ ---
238
+
239
+ ## Phase 6: Summary
240
+
241
+ Print final tally:
242
+
243
+ ```
244
+ Triage complete.
245
+ Accepted: N | Rejected: N | Deferred: N
246
+
247
+ Action list (accepted):
248
+ - [<ID>] file:line — <title> → /sp-fix "<title>"
249
+ - ...
250
+
251
+ Deferred (note in PR description):
252
+ - [<ID>] file:line — <title>
253
+ ```
254
+
255
+ If accepted = 0 → print "No action items. Verdict stands: <verdict>." and stop.
256
+ Do **NOT** spawn `/sp-fix` automatically — user runs it per item.
257
+
258
+ ---
259
+
260
+ ## Rules
261
+ 1. **Never auto-fix.** Report only — triage classifies, doesn't edit code.
262
+ 2. **Specific.** Every finding has `file:line` and concrete description.
263
+ 3. **Severity matches impact.** Style nits = Low. Injection = Critical.
264
+ 4. **Positive notes mandatory.** Reviews aren't just about problems.
265
+ 5. **Review against intent.** Not just "clean code?" but "does this match spec/commits?"
266
+ 6. **Proportional.** 5-line doc change ≠ 500-line auth rewrite.
267
+ 7. **TL;DR first, details on demand.** Never dump all finding bodies to terminal upfront — reveal detail only inside Phase 5.
268
+ 8. **Recommendation mandatory.** Every `AskUserQuestion` includes `RECOMMENDATION:` in question text AND `(Recommended)` suffix on the matching option.
@@ -0,0 +1,237 @@
1
+ ---
2
+ description: |
3
+ Greenfield bootstrap — turn a decided app-type + tech stack into a RUNNABLE walking
4
+ skeleton plus canonical docs (ARCHITECTURE.md, ADRs, optional DESIGN.md), before any
5
+ feature spec or TDD. Generator-first (real pinned deps, no hallucinated packages),
6
+ gated on a green smoke test (`install → build → start`), structured core/modules/tests.
7
+ Use when asked to "scaffold the project", "khởi tạo dự án", "set up the codebase",
8
+ "bootstrap a new app", "dựng nền dự án", "init the repo", "start a new project from scratch",
9
+ "create the project skeleton", or after /sp-explore confirms a greenfield build.
10
+ Proactively invoke this (do NOT hand-write project files) when the target directory has
11
+ no runnable project yet (no package.json / pyproject.toml / Cargo.toml / go.mod, empty src)
12
+ and the user wants to start building — sp-build assumes a runnable harness exists; this is
13
+ what creates it.
14
+ Hands off to /sp-plan (first feature spec) → /sp-build. Skip if a runnable project already
15
+ exists — go straight to /sp-plan.
16
+ allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion, Agent, WebSearch
17
+ ---
18
+ Greenfield bootstrap — decided stack → runnable walking skeleton + canonical docs, before any spec.
19
+
20
+ This skill exists because the rest of the kit assumes a runnable codebase: `/sp-build`'s TDD loop needs a resolvable `TEST_CMD` and an app to import, which an empty repo does not have. Folding "set up the project" into the first story (as a hand-written foundation) is the failure this fixes — scaffolding is infrastructure, not behaviour, so it lives here, not in a spec.
21
+
22
+ **Pipeline:** `/sp-explore` (greenfield branch — decides app-type + stack) → **`/sp-scaffold`** (this) → `/sp-plan` (first feature spec) → `/sp-build`.
23
+
24
+ **The one success metric:** not "files generated" — **"it builds and a smoke test passes."** A scaffold that doesn't run is worse than none; the gate in Phase 3 is non-negotiable.
25
+
26
+ ---
27
+
28
+ ## Phase 0 — Precondition & input
29
+
30
+ 1. **Greenfield check (idempotent).** Look for an existing runnable project in the target dir: any of `package.json`, `pyproject.toml` / `requirements.txt`, `Cargo.toml`, `go.mod`, `build.gradle`, `*.sln`, `Package.swift`, `Gemfile`, or a non-empty `src/`.
31
+ - Found AND it builds → **wrong tool.** Stop: "A project already exists here — run `/sp-plan` to spec the next feature, then `/sp-build`." Do not re-scaffold.
32
+ - Found but partial/broken (e.g. a `package.json` but no installable tree) → treat as a **resume**: finish the skeleton and drive Phase 3 to green; do not blow away existing files without asking.
33
+ - Empty / docs-only → normal greenfield, continue.
34
+
35
+ 2. **Resolve the bootstrap brief.** This skill needs: **app-type**, **stack** (with one-line rationale per major choice), preferred **scaffold command**, and the **smoke-test command**.
36
+ - From `/sp-explore` greenfield branch → read its bootstrap brief (in `docs/explore/<feature>.md` or `$ARGUMENTS`).
37
+ - Missing or invoked standalone → gather it in Phase 1. Never silently default the stack.
38
+
39
+ 3. **Resume protocol (partial / half-scaffolded repo).** If 0.1 found a partial tree, do NOT blindly run the generator over it — Phase 2 generator-first assumes an empty dir; `create-*` onto existing files either refuses or clobbers. Instead:
40
+ - **Detect** which of {manifest, lockfile, src skeleton, test runner, canonical docs} already exist and whether they install/build.
41
+ - **Usable generated base exists** → skip generation; go straight to imposing structure (Phase 2.3), filling gaps, then the Phase 3 gate.
42
+ - **Unusable AND the user confirms it's throwaway** → generate into a clean temp dir, then move in — **asking before overwriting any non-empty file you did not create this run**.
43
+ - **Never delete or overwrite a user file silently.** On any doubt, list what you'd change and ask. A wrong clobber destroys work — treat it as a destructive action.
44
+
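The file-presence half of the greenfield check in 0.1 can be sketched with Node's `fs` module. This is an illustration, not the kit's detector: `MANIFESTS` and `findProjectMarkers` are hypothetical names, and the sketch only lists markers. Deciding whether a found project actually builds, and therefore whether this is a stop or a resume, still needs the install/build attempt described above.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Manifest names from the Phase 0 list; `*.sln` is matched by suffix below.
const MANIFESTS = [
  "package.json", "pyproject.toml", "requirements.txt", "Cargo.toml",
  "go.mod", "build.gradle", "Package.swift", "Gemfile",
];

// Returns the project markers found directly in `dir`.
// An empty array suggests a greenfield (or docs-only) directory.
function findProjectMarkers(dir) {
  const markers = fs
    .readdirSync(dir)
    .filter((name) => MANIFESTS.includes(name) || name.endsWith(".sln"));
  const src = path.join(dir, "src");
  const hasCode =
    fs.existsSync(src) &&
    fs.statSync(src).isDirectory() &&
    fs.readdirSync(src).length > 0;
  if (hasCode) markers.push("src/"); // only a non-empty src/ counts
  return markers;
}
```

The function never writes or deletes anything, which keeps the check idempotent and safe to run on a user's existing tree.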
45
+ ---
46
+
47
+ ## Phase 1 — App-type & stack (DECLARED, never silently defaulted)
48
+
49
+ App-type detection has no reliable signal classifier; the realistic ceiling is "LLM reads + asks", so confirm, don't guess.
50
+
51
+ 1. **App-type** — classify into one of: `web-frontend` · `backend-API` · `full-stack` · `mobile` (iOS/Android/RN/Flutter) · `desktop` (native: Swift/macOS, C++/Qt, C#/WinUI — or web-wrapped: Electron/Tauri) · `CLI` · `library` — with `monorepo` as an orthogonal modifier (can co-occur). If the brief already pins it, confirm in one line. If ambiguous, ask the disambiguating questions (each collapses branches):
52
+ - "Install from an app store, or open in a browser?" (mobile vs web)
53
+ - "Something people *use*, or something other devs *import*?" (app vs library)
54
+ - "Run by typing commands, or opening a window?" (CLI vs GUI)
55
+ - "Server/login/DB behind it, or all on-device?" (full-stack vs frontend-only)
56
+
57
+ 2. **Stack — research current versions, propose, then confirm (never silently default).**
58
+ **Research first (WebSearch):** training memory of versions and "current best practice" goes stale — before proposing, search the *current* stable/LTS releases + current best practice for the candidate area, using the current year from `date +%Y` (never a hardcoded year). The default + rationale must reflect what you find, not cutoff memory. **If a `/sp-explore` Bootstrap Brief already pinned the stack, research happened upstream — trust it and skip re-searching.**
59
+ Then decide along the axes below; for each pick a default WITH a one-line rationale, then confirm before scaffolding. "Use whatever" is not an answer — pin it; every downstream file depends on it. Each major choice's rationale becomes one ADR (Phase 4). If the brief already pinned the stack with rationale, skip the matrix + question — just confirm in one line.
60
+
61
+ **Stack-decision axes** (resolve each that applies to the app-type):
62
+
63
+ | Axis | Default heuristic |
64
+ |---|---|
65
+ | Language / runtime | The app-type's mainstream: TS for web/node, Python for data/ML, Go/Rust for CLI/perf, Swift/Kotlin or RN/Flutter for mobile. |
66
+ | Framework | Prefer one with an official scaffolder (Reference table) over a hand-rolled setup. |
67
+ | Datastore (if any) | Postgres for relational by default; justify anything exotic. |
68
+ | Repo shape | Single package unless the app genuinely has ≥2 deployable units → then monorepo + a workspace tool. |
69
+ | Test runner | The framework's blessed runner (this resolves `TEST_CMD` for sp-build). |
70
+ | Architecture conventions | State mgmt · validation · data layer · forms · UI · API/response shape — the patterns the example module will demonstrate. Source them from a stack profile (see **Stack profiles** note below) or the project's house conventions; else research current best-practice. |
71
+
72
+ **Confirmation question** — sp-scaffold is self-contained; do not depend on reading another skill for the format:
73
+ ```json
74
+ {"questions":[{"question":"Proposed stack: <one-line summary>. RECOMMENDATION: <X> because <reason>. Confirm or change?","header":"Stack","multiSelect":false,"options":[
75
+ {"label":"Confirm — <stack> | Completeness: N/10 | Trade-off: <gain vs lose>"},
76
+ {"label":"Change — I'll specify"}]}]}
77
+ ```
78
+
79
+ 3. **Resolve the scaffold command + smoke command** for the confirmed stack (see Reference table). Prefer an official generator over freeform.
80
+
81
+ **Stack profiles (optional, LAYERED — they must survive kit upgrades).** A profile is a reusable opinion-as-data file naming a stack's library/pattern defaults. The kit installs globally, so the profile *store* lives OUTSIDE the skill bundle; look up in precedence order, first found = the starting suggestion:
82
+ 1. **Project** — `./.claude/stack-profiles/<stack>.md` (or the project's CLAUDE.md house-conventions). Wins for this repo.
83
+ 2. **User / global** — `~/.claude/stack-profiles/<stack>.md`. Your personal cross-project defaults; survives `devkit upgrade`.
84
+ 3. **Kit seed** — the bundled `references/stack-profiles/<stack>.md`. Examples/fallback only.
85
+
86
+ A profile is only a suggestion: verify its currency (it carries a date), and the **Bootstrap Brief always overrides it**. The kit-bundled seeds are OVERWRITTEN on `devkit upgrade` — to customize, **copy** a seed to `~/.claude/stack-profiles/` (global) or `./.claude/stack-profiles/` (this project); never hand-edit the bundled copy.
87
+
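The layered lookup above amounts to "first existing file wins". A minimal sketch, assuming `kitDir` points at the installed sp-scaffold skill directory (an assumption about the install layout) and ignoring the CLAUDE.md house-conventions alternative at the project layer; `resolveStackProfile` is a hypothetical name.

```javascript
const fs = require("node:fs");
const path = require("node:path");
const os = require("node:os");

// Returns { layer, file } for the first profile found, or null if none exists.
function resolveStackProfile(stack, { projectDir, kitDir, homeDir = os.homedir() }) {
  const name = `${stack}.md`;
  const candidates = [
    { layer: "project", file: path.join(projectDir, ".claude", "stack-profiles", name) },
    { layer: "user", file: path.join(homeDir, ".claude", "stack-profiles", name) },
    { layer: "kit", file: path.join(kitDir, "references", "stack-profiles", name) },
  ];
  return candidates.find((c) => fs.existsSync(c.file)) || null;
}
```

Whatever this returns is only the starting suggestion: the Bootstrap Brief still overrides it, and the profile's date should be checked before trusting it.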
88
+ ---
89
+
90
+ ## Phase 2 — Skeleton (generator-first)
91
+
92
+ **Generator-first, always when one exists.** An official `create-*` / framework CLI / `degit` template gives a guaranteed-buildable base with **real, pinned** dependencies — eliminating the highest-risk LLM failure (a non-runnable base wired to hallucinated packages; ~1 in 5 LLM-suggested packages don't exist, and the fake names repeat, so attackers pre-register them).
93
+
94
+ **Monorepo (≥2 packages)? Orchestrate root-first — generators are NOT workspace-aware.** Running a per-package generator blind will fight the workspace. Sequence:
95
+ - Write the **root** manifest + workspace file (`package.json` + `pnpm-workspace.yaml` / equivalent) FIRST.
96
+ - Run each package generator with **install skipped** (`--skip-install` or equivalent) into its package dir.
97
+ - **De-conflict generator output:** generators often drop their OWN nested workspace file, lockfile, or `.git` inside the package (e.g. `create-next-app` writes a nested `pnpm-workspace.yaml`; generators `git init`). Remove/hoist them — a nested workspace file silently breaks resolution; a nested `.git` makes a repo-in-repo.
98
+ - Then ONE install at the root (single lockfile).
99
+
100
+ A single-package project skips all this — generate, install, done.
101
+
102
+ 1. **Run the generator** (Reference table) — *only into an empty/clean dir*. Use its `@latest` / current invocation; if unsure of the generator's current name or flags, WebSearch it — `create-*` tools get renamed and deprecated (e.g. CRA). If Phase 0.3 flagged a partial repo, follow the resume protocol there first; never run `create-*` over existing files. Let the generator own dependency selection + lockfile. **Version drift:** if the Brief pinned a major (e.g. "Next 15") but `@latest` has moved past it, do NOT silently diverge — pin the generator to the brief's major, OR surface the drift in one line and record it as an ADR. Don't let `@latest` quietly override a declared stack.
103
+ 2. **Freeform only as fallback** (no blessed template for the stack). Then, before any install, **sanity-check that each proposed dependency actually exists** in the registry (Reference §dep-verify). This is a minimum guard against hallucinated / typosquatted *names* — NOT a supply-chain audit: an existing name can still be a typosquat, unmaintained, or malicious, and the check does nothing for transitive deps. When supply-chain safety matters, defer to the lockfile + a real audit (`npm audit` / `pip-audit` / `cargo audit`). Pin versions; commit the lockfile.
104
+ 3. **Impose the two-layer structure** (core + features under ONE root, siblings) on the generated base — read the ARCHITECTURE codemap (`references/ARCHITECTURE.md.tmpl` §4); it defines the principle, the per-language mechanism, and the anti-pattern to avoid:
105
+ - **Core layer** — reusable foundation (entrypoint/bootstrap, config/env, IO plumbing, DI, errors, logging, shared utils/types). Feature-independent.
106
+ - **Feature layer** — one self-contained unit per capability; the scale axis. **Seed exactly ONE example unit that DEMONSTRATES the architecture conventions** (from the Brief): the thinnest end-to-end slice through the real pattern — e.g. React → data call via the chosen data layer (React Query) → input validated by Zod → render; backend → one endpoint through the chosen validation + response envelope; native → one use-case through the chosen architecture (a guarded op behind a repository protocol). NOT an empty stub: this is the *pattern template* every sp-build story copies. Still ONE slice — thin, not a feature. **Native GUI app whose GUI target is deferred (Phase 3):** also seed a **headless composition root** (a small CLI/executable that wires core + feature) so `build`/`test` prove the wiring without the IDE; the GUI `@main` belongs to the deferred GUI target.
107
+ - **Map the two layers onto the stack's unit of modularity** — directories (JS), packages/targets (Swift SPM, Go, C#/Java), crates/modules (Rust). Keep core + features **siblings under one discoverable root**; never bury the real code deep inside a wrapper dir, and never split into disconnected top-level trees (a `core/` package next to a separate `app/` tree is the §4 anti-pattern — put the app/GUI target in the same package/solution at root).
108
+ - **tests — follow the §4 test rule** in the language-idiomatic location (JS co-located sibling, ONE suffix, never mix `.spec`/`.test`; Swift `Tests/` mirror; Go/Rust inline; integration separate; e2e its own package).
109
+ - A **test-only package** (e.g. an e2e package) has no core/feature shape — its trivial seed is one hermetic passing test (no running servers), scaffolded freeform if no non-interactive generator exists (verify deps per 2.2).
110
+ 4. **`.env.example`** with every config/secret key (no real values). No secret in client-shipped code.
111
+
112
+ Keep the skeleton minimal — it's a walking skeleton, not the app. One thin end-to-end path, not features.
113
+
114
+ ---
115
+
116
+ ## Phase 3 — Smoke-test gate *(non-negotiable)*
117
+
118
+ The skeleton is not done until it **runs**. Drive, in order:
119
+
120
+ 1. **install** — dependencies resolve and install clean.
121
+ 2. **build / typecheck** — compiles (`tsc --noEmit`, `cargo check`, `go build`, `swift build`, etc.).
122
+ 3. **prove it runs** — demonstrate liveness the way THIS app-type requires (a server "boots" is not a library "imports" — don't conflate them). There MUST be at least one real, passing test so `TEST_CMD` resolves — this is exactly what unblocks `/sp-build`'s Phase 0b foundation gate.
123
+
124
+ **Smoke contract — what "green" means per app-type** (resolve to the one that fits):
125
+
126
+ | App-type | "Runs" = | TEST_CMD anchor |
127
+ |---|---|---|
128
+ | backend-API / full-stack BE | server boots AND a health/route request returns 2xx, then **shut it down** | ≥1 passing route/unit test |
129
+ | web-frontend | production build succeeds AND dev server reaches "ready" within a timeout, then stop it | ≥1 passing component/unit test |
+ | CLI | binary builds AND runs `--help` (or a no-op) exiting 0 | ≥1 passing test |
+ | library | builds/packages AND a sample consumer imports the public entry | ≥1 passing public-API test |
+ | mobile | JS/Flutter layer builds AND bundler / `flutter test` passes (native shell build best-effort — note if skipped) | ≥1 passing test |
+ | desktop | build succeeds AND the app launches headless OR the test runner passes | ≥1 passing test |
+
+ For anything long-running (servers, dev servers, bundlers): use a **readiness signal + hard timeout + guaranteed teardown**. A smoke check that hangs or leaks a process is a FAILED gate, not a pass.
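A minimal sketch of the readiness + timeout + teardown pattern in Node, assuming the process prints a recognizable line once it is ready. `smokeBoot`, `readyPattern`, and `timeoutMs` are illustrative names, not part of specpipe. It kills only the direct child, so a dev server that forks workers may need process-group teardown on top of this.

```javascript
const { spawn } = require("node:child_process");

// Hypothetical helper: start a long-running command, wait for a readiness
// line on stdout, and kill the child on every exit path.
function smokeBoot(cmd, args, { readyPattern, timeoutMs }) {
  return new Promise((resolve) => {
    const child = spawn(cmd, args, { stdio: ["ignore", "pipe", "inherit"] });
    let output = "";
    let settled = false;
    let timer = null;

    const finish = (ok, reason) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      child.kill("SIGKILL"); // teardown runs whether the check passed or failed
      resolve({ ok, reason });
    };

    // Hard timeout: a check that hangs counts as a failed gate.
    timer = setTimeout(() => finish(false, "timeout"), timeoutMs);

    child.stdout.on("data", (chunk) => {
      output += chunk;
      if (readyPattern.test(output)) finish(true, "ready");
    });
    child.on("error", (err) => finish(false, `spawn error: ${err.message}`));
    // "close" fires after stdout has drained, so a late readiness line is not missed.
    child.on("close", (code) => finish(false, `exited before ready (code ${code})`));
  });
}
```

A caller would pass the stack's own start command and readiness text, for example a pattern matching whatever "listening" line the server logs. The result object carries `ok` plus a short `reason` that can go into the gate report.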
+
+ **Monorepo:** run the smoke for EVERY package (each per its own app-type row above) AND the aggregate run-all (`pnpm -r test` / workspace equivalent). "Green" = every package green AND the aggregate green; record both the per-package and the root `TEST_CMD`.
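The AND in that rule can be made explicit. A sketch, assuming the per-package results have already been collected into a plain object keyed by package name (`monorepoGreen` and the package names are made up for illustration):

```javascript
// perPackage: { [packageName]: boolean }; aggregateOk: result of the root run-all.
function monorepoGreen(perPackage, aggregateOk) {
  const failing = Object.keys(perPackage).filter((name) => !perPackage[name]);
  return { green: failing.length === 0 && aggregateOk === true, failing };
}

// Every package passed on its own, but the aggregate run did not: not green.
console.log(monorepoGreen({ api: true, web: true }, false));
```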
+
+ **Native desktop (Swift/Qt/WinUI):** these usually have TWO build systems — a **headless testable core** and an **IDE/toolchain-bound GUI target**. Gate the smoke on the core (build + `swift test`/`ctest`/`dotnet test` GREEN); treat the GUI target as **best-effort** — defer + note it if there's no non-interactive build path (e.g. no committed `.xcodeproj` and no `xcodegen`). Never block the gate on the GUI target, and never fake it green.
+
+ Record the resolved **`TEST_CMD`** (run-all + filtered) and the run command in the handoff (Phase 5) and in ARCHITECTURE §13.
+
+ - **Green** → proceed to Phase 4.
+ - **Not green after 3 attempts** → STOP, report **BLOCKED** with the raw failure output. Do NOT hand off a skeleton that doesn't run, and do NOT paper over it by deleting the failing check. A non-running scaffold is the failure this skill exists to prevent.
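The two outcomes above amount to a bounded retry that reports the raw failure when it gives up. A sketch under the assumption that one smoke attempt is wrapped in a function returning `{ ok, output }`; `runSmokeGate` and that return shape are hypothetical:

```javascript
// Hypothetical gate runner: `check` performs one full smoke attempt and
// returns { ok: boolean, output: string }.
function runSmokeGate(check, maxAttempts = 3) {
  let last = { ok: false, output: "" };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    last = check(attempt);
    if (last.ok) return { status: "GREEN", attempts: attempt };
  }
  // Pass the raw failure output through untouched; never rewrite it.
  return { status: "BLOCKED", attempts: maxAttempts, output: last.output };
}
```

Note there is no branch that removes or skips a failing check: the only exits are GREEN or BLOCKED.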
+
+ > Success here = "builds + smoke passes", never "files generated". A green light is the only handoff condition.
+
+ ---
+
+ ## Phase 4 — Canonical docs (thin + true)
+
+ Fill the templates in `references/`. At bootstrap the docs describe the **skeleton honestly** — not a hallucinated full system. They thicken as `/sp-build` adds stories.
+
+ | Doc | Template | Seed at bootstrap |
+ |---|---|---|
+ | `ARCHITECTURE.md` | `ARCHITECTURE.md.tmpl` | §1 quality goal + §2 scope from the brief; §4 Codemap = the core/modules layout just created + the test-split rule; §5 data model if any; **§7 Invariants** = the system invariants you already know (these become `INV-NNN` in specs, with `applies-to:` surfaces — the cross-surface test discipline); §12 ADRs (below); §13 run/deploy = the verified commands from Phase 3. |
+ | `docs/adr/NNNN-*.md` | `adr/NNNN-template.md` | One ADR per major stack choice (language, framework, datastore, auth transport, sync-vs-async). ADR-0001 = "Record architecture decisions". While there are ≤~6 ADRs, they may live inline in ARCHITECTURE §12 instead. The stack *rationale* lives here, not in §1. |
+ | `DESIGN.md` | `DESIGN.md.tmpl` | OPTIONAL — only if the initial system design is non-trivial/contested. Per-feature forward-looking design is normally written later, alongside its spec. |
+
+ Do not over-document an empty repo. A greenfield ARCHITECTURE.md is short and honest; that's correct, not lazy.
+
+ ---
+
+ ## Phase 5 — Hygiene & handoff
+
+ 1. **Secret/safety scan** (reuse `/sp-commit` discipline): no hardcoded secret; `.gitignore` excludes `.env`, `*.pem`, `*.key`, build dirs, `node_modules`/vendor; `.env.example` present and committed.
+ 2. **Initial commit** (if the user wants one): conventional `chore: scaffold <stack> walking skeleton`.
+ 3. **Handoff summary:**
+ - App-type + confirmed stack.
+ - Skeleton layout (core / modules / tests) — one line.
+ - **Smoke-test gate: PASS** + the resolved `TEST_CMD` and run command.
+ - Docs written (ARCHITECTURE / ADRs / DESIGN?).
+ - Next: "Run `/sp-plan <feature>` for the first feature spec, then `/sp-build`. `/sp-build`'s Phase 0b will re-verify the harness."
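The `.gitignore` part of step 1 can be checked mechanically. The sketch below does a literal line match only; it does not implement real gitignore semantics (negation, nested files, glob equivalence), and `missingIgnores` plus the `REQUIRED_IGNORES` list are assumptions for illustration. Build and vendor directory names vary by stack, so `node_modules` stands in for them here.

```javascript
// Entries the Phase 5 scan expects to find; extend per stack (dist/, target/, ...).
const REQUIRED_IGNORES = [".env", "*.pem", "*.key", "node_modules"];

// Returns the required patterns that do not appear as a line in .gitignore.
function missingIgnores(gitignoreText, required = REQUIRED_IGNORES) {
  const entries = new Set(
    gitignoreText
      .split(/\r?\n/)
      .map((line) => line.trim())
      .filter((line) => line !== "" && !line.startsWith("#"))
      // Treat "/node_modules", "node_modules/" and "node_modules" alike.
      .map((line) => line.replace(/^\//, "").replace(/\/$/, ""))
  );
  return required.filter((pattern) => !entries.has(pattern));
}

console.log(missingIgnores(".env\n*.pem\n# deps\nnode_modules/\n"));
```

An empty result means all expected entries are present; anything returned is a hygiene gap to fix before the initial commit.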
+
+ ---
+
+ ## Reference — generators & dependency verification
+
+ **Scaffold command by stack** — covers the common ~80%. Prefer the official generator over freeform; **resolve `@latest` and confirm the command name at run time** (several moved recently — see the caveat below). Stacks marked *(freeform)* have NO official generator → write minimal files yourself and verify deps (below).
+
+ | App-type | Stack | Generator (+ non-interactive hint) |
+ |---|---|---|
+ | Web FE | React | `npm create vite@latest <app> -- --template react-ts` |
+ | Web FE | Next.js | `npx create-next-app@latest <app> --ts --app --yes` |
+ | Web FE | Vue | `npm create vue@latest <app>` (flags go after `--`, e.g. `-- --router --typescript`) |
+ | Web FE | SvelteKit | `npx sv create <app>` *(was `create-svelte`)* |
+ | Web FE | Angular | `ng new <app> --defaults --skip-git` |
+ | Web FE | Astro | `npm create astro@latest <app> -- --template minimal --yes` |
+ | Backend | NestJS | `npx @nestjs/cli new <app> --skip-git --package-manager npm` |
+ | Backend | Django | `django-admin startproject <name>` (no prompts) |
+ | Backend | Spring Boot | `curl https://start.spring.io/starter.zip -d dependencies=web -d type=maven-project -o app.zip` |
+ | Backend | Rails | `rails new <app> --api -d postgresql --skip-git` |
+ | Backend | Laravel | `laravel new <app> --no-interaction` |
+ | Backend | .NET | `dotnet new webapi -n <Name>` |
+ | Backend | FastAPI / Go | *(freeform)* — `uv init` + `fastapi` / `go mod init <mod>` + layout |
+ | Full-stack | Nuxt | `npx nuxi@latest init <app>` |
+ | Full-stack | React Router 7 | `npx create-react-router@latest <app>` *(was `create-remix`)* |
+ | Full-stack | T3 | `npm create t3-app@latest <app> -- --CI` |
+ | Mobile | Expo (RN) | `npx create-expo-app@latest <app> --template blank-typescript` |
+ | Mobile | bare RN | `npx @react-native-community/cli@latest init <App>` *(not `react-native init`)* |
+ | Mobile | Flutter | `flutter create <app> --org com.example` |
+ | Desktop | Electron | `npx create-electron-app@latest <app> --template=vite-typescript` |
+ | Desktop | Tauri | `npm create tauri-app@latest -- --template react-ts --yes` |
+ | Desktop | native (Swift/Qt/WinUI) | *(freeform core + IDE GUI)* — `swift package init` / CMake / `dotnet new wpf`; GUI via Xcode/XcodeGen — see the Native-desktop smoke note (Phase 3) |
+ | CLI | Node / Go / Rust / Python | oclif `npx oclif generate <cli>` · Go `cobra-cli init` · Rust `cargo new <cli>` · Python `uv init` + `typer` |
+ | Library | npm / Python / Rust / Go / .NET | `npm init -y` · `uv init --lib` (or `poetry new`) · `cargo new --lib` · `go mod init` · `dotnet new classlib` |
+ | any (no blessed gen) | template | `degit <template-repo>`, then verify deps |
+
+ **Recently changed — VERIFY before trusting:** CRA is dead (use Vite/Next); SvelteKit `create-svelte` → `sv create`; Remix → React Router 7 (`create-react-router`); bare RN → `@react-native-community/cli init`; Laravel installer preferred over `composer create-project`; RedwoodJS split (RedwoodSDK vs winding-down Redwood GraphQL). If a stack isn't listed, WebSearch its current official generator.
+
+ **Dependency existence check (freeform mode only — generators already pin real deps):** an existence *sanity-check*, NOT a supply-chain audit (Phase 2.2). Confirm each proposed package resolves before install:
+
+ | Manager | Check | Gotcha |
+ |---|---|---|
+ | npm / pnpm | `npm view <pkg> version` / `pnpm view <pkg> version` | missing → `E404`, exit 1 |
+ | yarn (Berry) | `yarn npm info <pkg>` | plain `yarn info` reports the project tree, not the registry |
+ | PyPI | `curl -fsSL https://pypi.org/pypi/<pkg>/json` | missing → 404 (cheaper than `pip index`) |
+ | cargo | `curl -fsSL https://crates.io/api/v1/crates/<name>` | **NOT `cargo search`** (fuzzy/substring); API 404 = absent |
+ | Go | `go list -m <module>@<ver>` | proxy lookup; new tags lag minutes; private → `GOPRIVATE` |
+ | SPM / any Git-URL | `git ls-remote <url> <tag>` | **exit 0 even if the ref is absent → check stdout is non-empty** (or add `--exit-code`, which exits 2 when no ref matches); no registry |
+ | Maven / Gradle | `curl -fsSL https://repo1.maven.org/maven2/<grp/path>/<artifact>/<ver>/` | group dots → slashes; 404 = absent |
+ | NuGet | `dotnet package search <id> --exact-match` | the raw flat-container API (`https://api.nuget.org/v3-flatcontainer/<id>/index.json`) needs the id lowercased |
+ | RubyGems | `gem list -r -e <name>` | `-r` remote, `-e` exact |
+ | Composer | `composer show <pkg> --all` | Packagist p2 holds tagged releases only (dev-only → 404) |
+
+ A package that doesn't resolve is dropped, not guessed at. Pin versions; commit the lockfile. (Caveat: existence ≠ safety. A registered typosquat still resolves; for supply-chain safety use the lockfile plus `npm audit` / `pip-audit` / `cargo audit`.)
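Three of the gotchas in the table are easy to get wrong when scripting the check, so here is a hedged sketch of how the results might be interpreted. It covers only the behaviours the table states (npm's non-zero exit, `curl -f` failing on 404, `git ls-remote` exiting 0 on a miss); the `kind` labels and all function names are hypothetical, and `result` is assumed to have the `{ status, stdout }` shape that `child_process.spawnSync` returns with `encoding: "utf8"`.

```javascript
// Decide whether one existence check found the package.
function resolves(kind, result) {
  switch (kind) {
    case "npm-view": // a missing package fails with E404 and a non-zero exit
    case "curl-f":   // curl -f turns an HTTP 404 into a non-zero exit
      return result.status === 0;
    case "git-ls-remote": // exits 0 even when the ref is absent, so require output
      return result.status === 0 && result.stdout.trim() !== "";
    default:
      throw new Error(`no rule sketched for ${kind}`);
  }
}

// Maven Central path: dots in the group id become slashes.
function mavenCentralUrl(group, artifact, version) {
  const groupPath = group.split(".").join("/");
  return `https://repo1.maven.org/maven2/${groupPath}/${artifact}/${version}/`;
}

// Drop anything that did not resolve instead of guessing a replacement.
function keepResolved(proposed, checks) {
  return proposed.filter((name) => checks[name] === true);
}
```

Managers whose "not found" behaviour the table does not spell out are deliberately left to the `default` branch rather than guessed at.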
+
+ ---
+
+ ## Rules
+
+ 1. **Runnable or BLOCKED.** Never hand off a skeleton that fails the Phase 3 smoke gate. "Files generated" is not done.
+ 2. **Generator-first.** Use the blessed scaffolder when it exists; freeform only as fallback, and then verify every dependency exists.
+ 3. **Stack is declared.** Confirm app-type + stack explicitly; never silently default and never bury the rationale.
+ 4. **Structure by principle.** core (foundation) vs modules (scale axis) vs co-located tests — adapt names to the stack, don't copy one blindly.
+ 5. **Docs thin + true.** Describe the skeleton honestly; let it thicken with stories. Stack rationale → ADRs; shape + invariants → ARCHITECTURE.
+ 6. **Scaffolding is not behaviour.** This skill writes no acceptance scenarios and runs no TDD loop — that's `/sp-plan` + `/sp-build`. It only makes the harness those need exist.