specpipe 1.0.0

Files changed (60)
  1. package/README.md +1319 -0
  2. package/bin/devkit.js +3 -0
  3. package/package.json +61 -0
  4. package/src/cli.js +76 -0
  5. package/src/commands/check.js +33 -0
  6. package/src/commands/diff.js +84 -0
  7. package/src/commands/init-adopt.js +54 -0
  8. package/src/commands/init-agents.js +118 -0
  9. package/src/commands/init-global.js +102 -0
  10. package/src/commands/init.js +311 -0
  11. package/src/commands/list.js +54 -0
  12. package/src/commands/remove.js +133 -0
  13. package/src/commands/upgrade.js +215 -0
  14. package/src/lib/agent-guards.js +100 -0
  15. package/src/lib/agent-install.js +161 -0
  16. package/src/lib/agents.js +280 -0
  17. package/src/lib/claude-global.js +183 -0
  18. package/src/lib/detector.js +93 -0
  19. package/src/lib/hasher.js +21 -0
  20. package/src/lib/installer.js +213 -0
  21. package/src/lib/logger.js +16 -0
  22. package/src/lib/manifest.js +102 -0
  23. package/src/lib/reconcile.js +56 -0
  24. package/templates/.claude/CLAUDE.md +79 -0
  25. package/templates/.claude/hooks/comment-guard.js +126 -0
  26. package/templates/.claude/hooks/file-guard.js +216 -0
  27. package/templates/.claude/hooks/glob-guard.js +104 -0
  28. package/templates/.claude/hooks/path-guard.sh +118 -0
  29. package/templates/.claude/hooks/self-review.sh +27 -0
  30. package/templates/.claude/hooks/sensitive-guard.sh +227 -0
  31. package/templates/.claude/settings.json +68 -0
  32. package/templates/docs/WORKFLOW.md +325 -0
  33. package/templates/docs/specs/.gitkeep +0 -0
  34. package/templates/hooks/specpipe-read-guard.sh +42 -0
  35. package/templates/hooks/specpipe-shell-guard.sh +65 -0
  36. package/templates/rules/specpipe-guards.md +40 -0
  37. package/templates/scripts/test-hooks.sh +66 -0
  38. package/templates/skills/sp-build/SKILL.md +776 -0
  39. package/templates/skills/sp-challenge/SKILL.md +255 -0
  40. package/templates/skills/sp-commit/SKILL.md +174 -0
  41. package/templates/skills/sp-explore/SKILL.md +730 -0
  42. package/templates/skills/sp-fix/SKILL.md +266 -0
  43. package/templates/skills/sp-humanize/SKILL.md +212 -0
  44. package/templates/skills/sp-investigate/SKILL.md +648 -0
  45. package/templates/skills/sp-md-render/SKILL.md +200 -0
  46. package/templates/skills/sp-md-render/components.md +415 -0
  47. package/templates/skills/sp-md-render/template.html +283 -0
  48. package/templates/skills/sp-plan/SKILL.md +947 -0
  49. package/templates/skills/sp-review/SKILL.md +268 -0
  50. package/templates/skills/sp-scaffold/SKILL.md +237 -0
  51. package/templates/skills/sp-scaffold/references/ARCHITECTURE.md.tmpl +228 -0
  52. package/templates/skills/sp-scaffold/references/DESIGN.md.tmpl +113 -0
  53. package/templates/skills/sp-scaffold/references/adr/NNNN-template.md +92 -0
  54. package/templates/skills/sp-scaffold/references/stack-profiles/react.md +36 -0
  55. package/templates/skills/sp-spec-render/SKILL.md +254 -0
  56. package/templates/skills/sp-spec-render/components.md +418 -0
  57. package/templates/skills/sp-spec-render/examples/user-auth.html +749 -0
  58. package/templates/skills/sp-spec-render/examples/user-auth.md +114 -0
  59. package/templates/skills/sp-spec-render/template.html +222 -0
  60. package/templates/skills/sp-voices/SKILL.md +1184 -0
@@ -0,0 +1,255 @@
1
+ ---
2
+ description: |
3
+ Adversarial review — spawn hostile reviewers to break the plan before coding.
4
+ Stress-tests assumptions, attacks decisions, finds blind spots in a spec.
5
+ Use when asked to "challenge this plan", "phản biện", "stress test the spec",
6
+ "tìm lỗ hổng", "break this", "red team this", or "attack this design".
7
+ Proactively suggest after /sp-plan produces a spec but before /sp-build —
8
+ catches design issues while they are still cheap to fix.
9
+ Skip for trivial spec changes or pure bug fixes.
10
+ allowed-tools: Read, Bash, Glob, Grep, AskUserQuestion, Agent
11
+ ---
12
+ Adversarial review — spawn hostile reviewers to break the plan before coding.
13
+
14
+ ## Input
15
+
16
+ Target: $ARGUMENTS
17
+
18
+ If argument is a file path → use that.
19
+ If argument is a feature name → search `docs/specs/` for matches.
20
+ If no argument → list recent files in `docs/specs/`, ask user which to challenge.
21
+
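The three input rules above can be sketched as a small resolver. This is an illustration only, not part of the skill: `resolve_target` is a hypothetical helper, and both the substring match on feature names and the restriction to `*.md` files are assumptions (the skill only says to search `docs/specs/` for matches).

```python
from pathlib import Path
from typing import List, Optional


def resolve_target(argument: Optional[str], specs_dir: Path) -> List[Path]:
    """Return candidate spec files for the given argument.

    One result means the target is unambiguous; several results mean the
    user should be asked which file to challenge.
    """
    if argument:
        candidate = Path(argument)
        if candidate.is_file():
            # Rule 1: the argument is a file path, so use it directly.
            return [candidate]
        # Rule 2: treat the argument as a feature name. Matching by
        # case-insensitive substring is an assumption of this sketch.
        needle = argument.lower()
        return sorted(
            p for p in specs_dir.rglob("*.md")
            if needle in p.relative_to(specs_dir).as_posix().lower()
        )
    # Rule 3: no argument, so list spec files, most recently modified first.
    return sorted(
        specs_dir.rglob("*.md"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
```

An empty list for a feature name means nothing matched, in which case falling back to the no-argument listing would be a reasonable choice.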
22
+ ## Phase 1: Read and Map
23
+
24
+ Read the ENTIRE target file. The spec contains both the feature definition and the acceptance scenarios (in the `## Stories` section).
25
+
26
+ Map the plan's attack surface:
27
+ - Decisions made (and what was rejected)
28
+ - Assumptions (stated AND implied)
29
+ - Dependencies (external services, APIs, libraries, infra)
30
+ - Scope boundaries (in/out/suspiciously unmentioned)
31
+ - Risk acknowledgments (mentioned vs. conspicuously absent)
32
+ - Story↔acceptance-scenario (AS) consistency (stories without acceptance scenarios? contradictions?)
33
+
34
+ Collect all file paths the reviewers will need to read.
35
+
36
+ ## Phase 2: Scale Reviewers
37
+
38
+ Assess plan complexity and select which lenses to deploy:
39
+
40
+ | Complexity Signal | Reviewers | Lenses |
41
+ |-------------------|-----------|--------|
42
+ | Simple (1 spec section, <20 acceptance scenarios, no auth/data) | 2 | Assumptions + Scope |
43
+ | Standard (multiple sections, auth or data involved) | 3 | + Security |
44
+ | Complex (multiple integrations, concurrency, migrations, 6+ phases) | 4 | + Failure Modes |
45
+
46
+ When in doubt, use 3 reviewers. 4 is for genuinely complex plans.
47
+
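As a rough illustration of the scaling table above, lens selection could look like the sketch below. The `PlanSignals` fields and the `select_lenses` function are hypothetical names, and treating any single signal as enough to move up a tier is an assumption; the skill leaves that judgment to the coordinator.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlanSignals:
    """Hypothetical complexity signals collected while mapping the plan."""
    sections: int = 1
    acceptance_scenarios: int = 0
    touches_auth_or_data: bool = False
    integrations: int = 0
    has_concurrency: bool = False
    has_migrations: bool = False
    phases: int = 0


def select_lenses(s: PlanSignals) -> List[str]:
    """Map complexity signals to the lenses named in the table."""
    # Every plan gets the two baseline lenses.
    lenses = ["Assumptions", "Scope"]
    complex_plan = (
        s.integrations > 1
        or s.has_concurrency
        or s.has_migrations
        or s.phases >= 6
    )
    standard_plan = (
        s.sections > 1
        or s.acceptance_scenarios >= 20
        or s.touches_auth_or_data
    )
    if standard_plan or complex_plan:
        lenses.append("Security")
    if complex_plan:
        lenses.append("Failure Modes")
    return lenses
```

The "when in doubt, use 3 reviewers" rule is not encoded here; a caller unsure about the signals could append `"Security"` to a two-lens result.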
48
+ ## Phase 3: Spawn Parallel Reviewers
49
+
50
+ Launch reviewers simultaneously using the Agent tool. Each reviewer is an independent subagent that reads the plan files directly and returns findings.
51
+
52
+ **CRITICAL:** Each reviewer prompt MUST include:
53
+ 1. The file paths to read (so they can access the plan directly)
54
+ 2. Their specific adversarial persona and lens
55
+ 3. The exact output format (so you can parse findings consistently)
56
+ 4. The rules of engagement
57
+
58
+ ### Reviewer Prompts
59
+
60
+ For each selected lens, spawn an agent with this structure:
61
+
62
+ ```
63
+ You are a hostile reviewer. Your job is to DESTROY this plan by finding every flaw through the {LENS_NAME} lens.
64
+
65
+ Read these files first:
66
+ {LIST OF FILE PATHS}
67
+
68
+ --- YOUR LENS ---
69
+
70
+ {LENS-SPECIFIC INSTRUCTIONS — see below}
71
+
72
+ --- OUTPUT FORMAT ---
73
+
74
+ For EACH flaw found, output exactly:
75
+
76
+ ### Finding: <title>
77
+ - **Severity:** Critical | High | Medium
78
+ - **Confidence:** N/10 — (9-10: verified in code; 7-8: strong pattern match; 5-6: possible false positive, note caveat; ≤4: omit unless Critical)
79
+ - **Location:** <exact section or heading in the plan>
80
+ - **Flaw:** <what's wrong — be specific>
81
+ - **Evidence:** "<direct quote from the plan>"
82
+ - **Failure scenario:** <step-by-step: how this causes a real problem in production>
83
+ - **Root cause:** <why does this flaw exist? Missing requirement? Wrong assumption?>
84
+ - **Suggested fix:** <specific, actionable — not just "fix it">
85
+
86
+ --- RULES ---
87
+
88
+ - 3-7 findings per lens. Quality over quantity.
89
+ - Be HOSTILE. No praise. No "overall looks good."
90
+ - Be SPECIFIC. Cite exact sections. Quote the plan.
91
+ - Be CONCRETE. Failure scenarios must be step-by-step, not "could be a problem."
92
+ - Skip trivial issues (naming, formatting, style).
93
+ - If the plan is solid for your lens, 1-2 findings is honest. Don't manufacture problems.
94
+ ```
95
+
96
+ ### Lens-Specific Instructions
97
+
98
+ **Security Adversary:**
99
+ ```
100
+ You are an attacker with knowledge of the tech stack and access to the public API.
101
+
102
+ Examine the plan for:
103
+ - Authentication/authorization bypass: Can auth be skipped? Can user A access user B's data? Are role checks at every layer?
104
+ - Injection vectors: Where does user input enter? SQL, shell, HTML, template, log injection? Parameterized queries?
105
+ - Data exposure: What leaks in error messages, logs, API responses? Stack traces? Internal paths? DB schemas?
106
+ - Cryptography: Password hashing (bcrypt/argon2, not MD5/SHA)? Secrets in env vars not code? TLS?
107
+ - Supply chain: New dependencies? Maintained? Known CVEs?
108
+ - OWASP Top 10 (2021): Broken Access Control, Crypto Failures, Injection, Insecure Design, Security Misconfiguration, Vulnerable Components, Identity Failures, Integrity Failures, Logging Failures, SSRF
109
+ ```
110
+
111
+ **Failure Mode Analyst:**
112
+ ```
113
+ You believe Murphy's Law: everything that can go wrong, will — simultaneously, at 3 AM, during peak traffic.
114
+
115
+ Examine the plan for:
116
+ - Partial failures: What if step 3 of 5 fails? Rollback? Atomic writes? Inconsistent state?
117
+ - Concurrency: Race conditions? Two users editing same resource? Shared mutable state? Deadlocks?
118
+ - Cascading failures: Service A down → B also fails? Circuit-breaking? Graceful degradation?
119
+ - Data integrity: Data loss? Corruption? Duplication? DB-level constraints or app-only validation?
120
+ - Recovery: How to recover from each failure? Reversible migrations? Backup restoration time?
121
+ - Deployment: What breaks during deploy? Rollback plan? Migration failures?
122
+ - Idempotency: Retried requests duplicate data? Double-charge? Double-email?
123
+ - Observability: How do you KNOW something failed? Logging? Monitoring? Alerts? Or angry users?
124
+ ```
125
+
126
+ **Assumption Destroyer:**
127
+ ```
128
+ You are a radical skeptic. "It should work" is not evidence. "We assume X" means X is unverified.
129
+
130
+ Examine the plan for:
131
+ - Unverified claims: "The API returns X" — tested? "The library supports Y" — checked docs?
132
+ - Scale assumptions: Expected load? Works at 10x? 100x? O(n²) hiding in "iterate all items"?
133
+ - Environment gaps: Same behavior in dev/staging/prod? Different OS? Docker vs bare metal?
134
+ - Integration risk: Third-party SLA? Rate limits? Their service down → your plan?
135
+ - Data assumptions: Always clean? Unicode? Emoji? Null bytes? 10MB payloads? Empty strings?
136
+ - User behavior: Will users actually do this? What if they click 50 times? Upload 2GB? Use mobile?
137
+ - Timing: "A before B" — always? What if B first? Implicit ordering dependencies?
138
+ - Hidden dependencies: Services, configs, env vars, or manual steps that must exist but aren't documented?
139
+ ```
140
+
141
+ **Scope & Complexity Critic (YAGNI Enforcer):**
142
+ ```
143
+ You believe the best code is no code. The best feature is the one you didn't build.
144
+
145
+ Examine the plan for:
146
+ - Over-engineering: Solving problems that don't exist yet? "In case we need it later" = YAGNI.
147
+ - Premature abstraction: Generic framework for 1 use case? Plugin system nobody asked for?
148
+ - Missing MVP: What's the absolute minimum viable delivery? Can 40% be deferred?
149
+ - Complexity vs value: Distributed system for 5 users? Proportional?
150
+ - Gold plating: Nice-to-have mixed with must-have? Can you ship without the nice-to-haves?
151
+ - Simpler alternative: Boring 10-line solution vs clever 500-line solution?
152
+ - Test burden: Test cases harder to maintain than the feature itself?
153
+ ```
154
+
155
+ ## Phase 4: Collect and Consolidate
156
+
157
+ After all reviewers complete:
158
+
159
+ 1. **Collect** all findings from all reviewers
160
+ 2. **Deduplicate** — if two lenses found the same root issue, merge into one finding noting both lenses
161
+ 3. **Rate severity** using Likelihood × Impact:
162
+
163
+ | | Low Impact | Medium Impact | High Impact |
164
+ |---|-----------|---------------|-------------|
165
+ | **Likely** | Medium | High | Critical |
166
+ | **Possible** | Low | Medium | High |
167
+ | **Unlikely** | Low | Low | Medium |
168
+
169
+ 4. **Sort** by severity: Critical → High → Medium → Low
170
+ 5. **Cap** at 15 findings: keep all Critical, top High by specificity, note how many Medium were dropped
171
+ 6. **Cross-reference check** (you, not reviewers): Flag any stories without acceptance scenarios, and any AS that contradicts the story description
172
+
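The consolidation steps above can be sketched as follows. The finding fields (`root_cause`, `lens`, `likelihood`, `impact`) and the `consolidate` function are hypothetical; deduplicating on an exact `root_cause` string is an assumption, and the "top High by specificity" tie-break is not modeled.

```python
from typing import Dict, List

# Likelihood x Impact matrix, copied from the table in step 3.
SEVERITY_MATRIX = {
    ("Likely", "Low"): "Medium",
    ("Likely", "Medium"): "High",
    ("Likely", "High"): "Critical",
    ("Possible", "Low"): "Low",
    ("Possible", "Medium"): "Medium",
    ("Possible", "High"): "High",
    ("Unlikely", "Low"): "Low",
    ("Unlikely", "Medium"): "Low",
    ("Unlikely", "High"): "Medium",
}
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


def rate_severity(likelihood: str, impact: str) -> str:
    return SEVERITY_MATRIX[(likelihood, impact)]


def consolidate(findings: List[Dict], cap: int = 15) -> Dict:
    """Deduplicate, rate, sort, and cap findings from all reviewers."""
    merged: Dict[str, Dict] = {}
    for f in findings:
        severity = rate_severity(f["likelihood"], f["impact"])
        key = f["root_cause"]  # assumption: same string means same issue
        if key not in merged:
            merged[key] = {
                "title": f["title"],
                "severity": severity,
                "lenses": [f["lens"]],
            }
            continue
        entry = merged[key]
        if f["lens"] not in entry["lenses"]:
            entry["lenses"].append(f["lens"])
        # When lenses disagree, keep the more severe rating.
        if SEVERITY_ORDER.index(severity) < SEVERITY_ORDER.index(entry["severity"]):
            entry["severity"] = severity
    # sorted() is stable, so findings keep reviewer order within a severity.
    ranked = sorted(merged.values(), key=lambda m: SEVERITY_ORDER.index(m["severity"]))
    critical_count = sum(1 for m in ranked if m["severity"] == "Critical")
    kept = ranked[: max(cap, critical_count)]  # never drop a Critical
    dropped_medium = sum(1 for m in ranked[len(kept):] if m["severity"] == "Medium")
    return {"kept": kept, "dropped_medium": dropped_medium}
```

Real deduplication needs judgment about whether two findings share a root issue, so the string key here only stands in for that decision.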
173
+ ## Phase 5: Adjudicate
174
+
175
+ For each finding, YOU (the coordinator) evaluate and propose a disposition:
176
+
177
+ | Disposition | When to use |
178
+ |-------------|-------------|
179
+ | **Accept** | Valid flaw. Plan should be updated. |
180
+ | **Reject** | False positive, acceptable risk, or already handled elsewhere. |
181
+
182
+ Include a one-sentence rationale for each disposition. Be honest: don't reject valid findings to be nice, and don't accept trivial findings to pad the list.
183
+
184
+ ## Phase 6: Present to User
185
+
186
+ Show adjudicated findings using the reviewer output format plus Disposition and Rationale fields.
187
+
188
+ Then present the decision using the `AskUserQuestion` tool:
189
+
190
+ ```json
191
+ {
192
+ "questions": [
193
+ {
194
+ "question": "How to proceed with N accepted findings? RECOMMENDATION: Choose A if mostly Medium fixes, B if any Critical/High findings.",
195
+ "header": "Apply Findings",
196
+ "multiSelect": false,
197
+ "options": [
198
+ {"label": "A) Apply all accepted — bulk-apply all fixes at once | (human: ~30m / CC: ~10m) | Completeness: 8/10 | Trade-off: fast vs. no per-finding control"},
199
+ {"label": "B) Review each — walk through one by one, accept/reject/modify | (human: ~1h / CC: ~20m) | Completeness: 10/10 | Trade-off: precise control vs. slower"}
200
+ ]
201
+ }
202
+ ]
203
+ }
204
+ ```
205
+
206
+ Choosing the recommendation: if any accepted finding is Critical or High, recommend B. If the accepted findings are mostly Medium with clear fixes, recommend A.
207
+
208
+ If user picks B: for each finding, use `AskUserQuestion`. Append `(Recommended)` to option A if the Phase 5 adjudication = Accept, or to option C if adjudication = Reject:
209
+
210
+ ```json
211
+ {
212
+ "questions": [
213
+ {
214
+ "question": "Finding [C-1]: <title>\n<flaw summary>\nRECOMMENDATION: Choose A — <adjudication rationale>.",
215
+ "header": "Finding C-1",
216
+ "multiSelect": false,
217
+ "options": [
218
+ {"label": "A) Accept — apply the suggested fix (Recommended)"},
219
+ {"label": "B) Modify — accept with changes (describe your modification)"},
220
+ {"label": "C) Reject — skip this finding"}
221
+ ]
222
+ }
223
+ ]
224
+ }
225
+ ```
226
+
227
+ *(Example above shows adjudication = Accept. If adjudication = Reject, move `(Recommended)` to option C instead.)*
228
+
229
+ ## Phase 7: Apply
230
+
231
+ For each accepted finding:
232
+ 1. Edit the target file at the exact location cited
233
+ 2. Apply the fix (or user's modified version)
234
+ 3. Surgical edits only — do NOT rewrite surrounding sections
235
+
236
+ After all edits, show summary:
237
+ ```
238
+ Challenge complete.
239
+ Reviewers: N lenses
240
+ Findings: X total → Y accepted, Z rejected
241
+ Severity: N Critical, N High, N Medium
242
+ Files modified: [list]
243
+ Next: /sp-build to implement, or /sp-plan to regenerate if major changes.
244
+ ```
245
+
246
+ If a reviewer returns > 7 findings, take only top 7 by severity. If a reviewer fails, proceed with remaining reviewers.
247
+
248
+ ## Rules — Non-Negotiable
249
+
250
+ 1. **Spawn reviewers in parallel.** Don't run lenses in your own context.
251
+ 2. **Reviewers read files directly.** Pass paths, not content.
252
+ 3. **Be hostile.** No praise. Not in reviewers, not in adjudication.
253
+ 4. **Quote the plan.** Every finding needs a direct quote in Evidence.
254
+ 5. **Don't manufacture findings.** 3 honest findings > 15 padded ones.
255
+ 6. **Skip style/formatting.** Substance only: logic, security, assumptions, scope.
@@ -0,0 +1,174 @@
1
+ ---
2
+ description: |
3
+ Stage changes, scan for secrets and debug code, generate conventional commit message.
4
+ Use when asked to "commit", "commit changes", "tạo commit", "viết commit message",
5
+ "stage and commit", or "git commit".
6
+ Proactively invoke this skill (do NOT run git commit directly) when the user
7
+ asks to commit changes, because the secret scan and debug-code detection cover
8
+ checks that a raw git commit skips.
9
+ Output: conventional commit format — type(scope): description (feat, fix, docs,
10
+ refactor, test, chore, perf, build, ci).
11
+ allowed-tools: Bash, AskUserQuestion
12
+ ---
13
+ Stage, scan secrets, generate conventional commit message.
14
+
15
+ ## Step 1 — Analyze (single compound command)
16
+
17
+ ```bash
18
+ echo "=== STATUS ===" && \
19
+ git status --short 2>/dev/null && \
20
+ echo "=== DIFF STAT ===" && \
21
+ git diff --stat 2>/dev/null && \
22
+ git diff --cached --stat 2>/dev/null && \
23
+ echo "=== METRICS ===" && \
24
+ { git diff --shortstat 2>/dev/null; git diff --cached --shortstat 2>/dev/null; } && \
25
+ echo "=== SECRETS ===" && \
26
+ (git diff 2>/dev/null; git diff --cached 2>/dev/null) | grep -ciE "(api[_-]?key|token|password|secret|private[_-]?key|credential|auth[_-]?token)" || true && \
27
+ echo "=== DEBUG ===" && \
28
+ (git diff 2>/dev/null; git diff --cached 2>/dev/null) | grep -ciE "(console\.log|debugger|print\(|TODO:.*remove|HACK:|FIXME:.*temp|binding\.pry|var_dump)" || true
29
+ ```
30
+
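Step 2 asks for the matched lines to be shown, while the command above prints only counts. A minimal sketch of extracting the matching lines from diff text follows. The patterns are copied from the command; the `added_lines_matching` helper is hypothetical, and restricting the scan to added lines is an assumption of this sketch (the `grep` above counts matches on any diff line, including removed and context lines).

```python
import re
from typing import List, Pattern

# Patterns copied from the grep expressions in Step 1.
SECRET_PATTERN = re.compile(
    r"(api[_-]?key|token|password|secret|private[_-]?key|credential|auth[_-]?token)",
    re.IGNORECASE,
)
DEBUG_PATTERN = re.compile(
    r"(console\.log|debugger|print\(|TODO:.*remove|HACK:|FIXME:.*temp|binding\.pry|var_dump)",
    re.IGNORECASE,
)


def added_lines_matching(diff_text: str, pattern: Pattern[str]) -> List[str]:
    """Return the added lines of a unified diff that match the pattern."""
    hits = []
    for line in diff_text.splitlines():
        # Added lines start with "+"; "+++ b/file" is a file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if pattern.search(line):
            hits.append(line[1:].strip())
    return hits
```

The diff text could come from `git diff` and `git diff --cached`. A non-empty secrets result would trigger the hard block, and a non-empty debug result the soft warning.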
31
+ ---
32
+
33
+ ## Step 2 — Safety checks
34
+
35
+ **Secrets (hard block):** If count > 0, show matched lines and STOP. Do not commit.
36
+
37
+ **Debug code (soft warn):** If count > 0, show matched lines. Use `AskUserQuestion` to confirm:
38
+
39
+ ```json
40
+ {
41
+ "questions": [
42
+ {
43
+ "question": "Found <N> debug statements (console.log, debugger, etc.) in the diff. Are these intentional?",
44
+ "header": "Debug Code",
45
+ "multiSelect": false,
46
+ "options": [
47
+ {"label": "Yes, intentional — proceed with commit"},
48
+ {"label": "No, remove them first"}
49
+ ]
50
+ }
51
+ ]
52
+ }
53
+ ```
54
+
55
+ **Large diff:** If > 10 files or > 300 lines, use `AskUserQuestion` to confirm:
56
+
57
+ ```json
58
+ {
59
+ "questions": [
60
+ {
61
+ "question": "Large commit detected (<N> files, <M> lines). Large commits are harder to review and revert.",
62
+ "header": "Large Commit",
63
+ "multiSelect": false,
64
+ "options": [
65
+ {"label": "Proceed — commit everything as one"},
66
+ {"label": "Split — I'll stage specific files myself"}
67
+ ]
68
+ }
69
+ ]
70
+ }
71
+ ```
72
+
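The large-diff threshold can be checked against the `--shortstat` output collected in Step 1. The sketch below is one possible reading: `parse_shortstat` and `is_large_commit` are hypothetical helpers, counting "lines" as insertions plus deletions is an assumption, and summing the unstaged and staged summaries may count a file twice if it appears in both.

```python
import re
from typing import Tuple


def parse_shortstat(text: str) -> Tuple[int, int]:
    """Sum files and changed lines over one or more --shortstat summaries."""
    files = sum(int(n) for n in re.findall(r"(\d+) files? changed", text))
    lines = sum(
        int(n)
        for n in re.findall(r"(\d+) (?:insertions?\(\+\)|deletions?\(-\))", text)
    )
    return files, lines


def is_large_commit(files: int, lines: int,
                    max_files: int = 10, max_lines: int = 300) -> bool:
    """Apply the Step 2 rule: more than 10 files or more than 300 lines."""
    return files > max_files or lines > max_lines
```

For example, the two summaries ` 3 files changed, 120 insertions(+), 45 deletions(-)` and ` 1 file changed, 200 insertions(+)` parse to 4 files and 365 lines, which exceeds the line threshold.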
73
+ ---
74
+
75
+ ## Step 3 — Stage files
76
+
77
+ Prefer staging specific files by name. Do NOT use `git add -A`.
78
+
79
+ Never stage: `.env`, credentials, build artifacts, generated files, binaries > 1MB.
80
+
81
+ ---
82
+
83
+ ## Step 4 — Generate commit message
84
+
85
+ **Format:** `type(scope): description`
86
+
87
+ | Type | When |
88
+ |------|------|
89
+ | `feat` | New feature |
90
+ | `fix` | Bug fix |
91
+ | `docs` | Documentation only |
92
+ | `test` | Tests only |
93
+ | `refactor` | Code change, no behavior change |
94
+ | `chore` | Maintenance, deps, config |
95
+ | `perf` | Performance improvement |
96
+ | `build` | Build system |
97
+ | `ci` | CI/CD changes |
98
+
99
+ **Breaking changes:** If diff removes/renames a public function, export, or API endpoint → use `feat!` or `fix!` type, or add `BREAKING CHANGE:` footer.
100
+
101
+ **Story link (optional — only when the commit implements a spec story):** If the change maps to an `S-NNN` story (a `docs/specs/<feature>/.build-progress` exists, or the context/`$ARGUMENTS` names a story), add a footer line `Story: S-NNN`. This lets `/sp-build` find the commit on resume (`git log --grep`). Omit it entirely for ordinary commits — most commits have no story, and their format is unchanged.
102
+
103
+ ```
104
+ feat(auth): add refresh-token rotation
105
+
106
+ Story: S-003
107
+ ```
108
+
109
+ **Rules:** Subject under 72 chars (the `Story:` footer does not count toward the subject). Imperative mood ("add" not "added"). No period. WHAT+WHY, not HOW.
110
+
111
+ **Bad examples — avoid:**
112
+ - ❌ `Updated some files` — not descriptive
113
+ - ❌ `feat(auth): added login validation using bcrypt with salt rounds of 12` — too long, describes HOW
114
+ - ❌ `Fix bug` — not specific
115
+ - ❌ `WIP` — never commit unfinished work
116
+
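The mechanical parts of these rules (format, type list, length, trailing period) can be checked with a short validator. This is a sketch under stated assumptions: `check_subject` is a hypothetical helper, the scope is treated as optional, and imperative mood and WHAT+WHY are not checked because they need human or model judgment.

```python
import re
from typing import List

# Types from the table in Step 4.
TYPES = ("feat", "fix", "docs", "test", "refactor", "chore", "perf", "build", "ci")

SUBJECT_RE = re.compile(
    r"^(?P<type>[a-z]+)"          # commit type
    r"(?:\((?P<scope>[^)]+)\))?"  # optional (scope); optional is an assumption
    r"(?P<breaking>!)?"           # optional breaking-change marker
    r": (?P<description>.+)$"
)


def check_subject(subject: str) -> List[str]:
    """Return a list of rule violations; an empty list means it passes."""
    match = SUBJECT_RE.match(subject)
    if not match or match.group("type") not in TYPES:
        return ["not in type(scope): description format"]
    problems = []
    if len(subject) >= 72:
        problems.append("subject is not under 72 characters")
    if subject.endswith("."):
        problems.append("subject ends with a period")
    return problems
```

With this sketch, `feat(auth): add refresh-token rotation` passes, while `Updated some files`, `Fix bug`, and `WIP` all fail the format check.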
117
+ ---
118
+
119
+ ## Step 5 — Commit
120
+
121
+ ```bash
122
+ git commit -m "type(scope): description"
123
+ ```
124
+
125
+ ---
126
+
127
+ ## Step 6 — Push?
128
+
129
+ Check if a remote exists:
130
+
131
+ ```bash
132
+ git remote
133
+ ```
134
+
135
+ If no remote → skip this step entirely.
136
+
137
+ If remote exists, use `AskUserQuestion`:
138
+
139
+ ```json
140
+ {
141
+ "questions": [
142
+ {
143
+ "question": "Commit successful. Push to remote now?",
144
+ "header": "Push",
145
+ "multiSelect": false,
146
+ "options": [
147
+ {"label": "Yes — push now (git push, or git push -u origin <branch> if no upstream)"},
148
+ {"label": "No — push later"}
149
+ ]
150
+ }
151
+ ]
152
+ }
153
+ ```
154
+
155
+ If user chooses Yes → run `git push` (or `git push -u origin <branch>` if upstream not set).
156
+
157
+ ---
158
+
159
+ ## Output
160
+
161
+ ```
162
+ staged: N files (+X/-Y lines)
163
+ checks: secrets ✓ | debug ✓
164
+ commit: abc1234 type(scope): description
165
+ pushed: yes → origin/<branch> (or "no")
166
+ ```
167
+
168
+ Keep under 5 lines. No explanations.
169
+
170
+ ## Rules
171
+ 1. **Specific files, not `git add -A`.** Stage intentionally.
172
+ 2. **Secrets = hard block.** No exceptions.
173
+ 3. **Ask before pushing.** Push only if user confirms in Step 6.
174
+ 4. **One concern per commit.** Mixed features → suggest separate commits.