bigpowers 2.34.2 → 2.36.0

package/.pi/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "bigpowers",
3
- "version": "2.34.2",
4
- "description": "70 skills — 70 agent skills for spec-driven, test-first software development by solo developers",
3
+ "version": "2.36.0",
4
+ "description": "71 skills — 70 agent skills for spec-driven, test-first software development by solo developers",
5
5
  "keywords": [
6
6
  "pi-package"
7
7
  ],
@@ -13,10 +13,11 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
13
13
  >
14
14
  > **HARD GATE** — Not on `main`/`master` before step 3 (kickoff-branch).
15
15
 
16
- ## Eight steps (`epic_cycle` in state.yaml)
16
+ ## Nine steps (`epic_cycle` in state.yaml)
17
17
 
18
18
  | Step | Skill / action |
19
19
  |------|----------------|
20
+ | 0 | `security-review` — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
20
21
  | 1 | `survey-context` — confirm epic + story |
21
22
  | 2 | `plan-work` — flesh out story `tasks[]` in `specs/epics/eNN-slug/epic.yaml` |
22
23
  | 3 | `kickoff-branch` — feature branch + clean baseline |
@@ -24,17 +25,18 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
24
25
  | 5 | `verify-work` — UAT + mechanical gates |
25
26
  | 6 | `audit-code` — **non-optional gate** (pass/fail; fail → loop back to step 4) |
26
27
  | 7 | `commit-message` — Conventional Commits draft |
27
- | 8 | `release-branch` — PR or solo land (supports `--squash-state`) |
28
+ | 8 | `release-branch` — PR or solo land (supports `--squash-state`) | |
28
29
 
29
30
  ## Process
30
31
 
31
32
  1. Read `specs/state.yaml`, `specs/execution-status.yaml`, `specs/release-plan.yaml`, active `specs/epics/eNN-slug/epic.yaml`.
32
- 2. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate — require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
33
- 3. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
34
- 3. If `epic_cycle.step` missing, set to `1`.
35
- 4. Run **only the current step** (resume mode) unless user asked for full auto-run.
36
- 5. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
37
- 6. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.
33
+ 2. **Step 0 Threat Model:** Run `security-review` against the epic's scope (read from the epic capsule). Output `specs/security/epics/<epic-id>/THREAT_MODEL.md` with surface area, vulnerability categories, risk level, and mitigation guidance.
34
+ 3. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate: require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
35
+ 4. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
36
+ 5. If `epic_cycle.step` missing, set to `1`.
37
+ 6. Run **only the current step** (resume mode) unless user asked for full auto-run.
38
+ 7. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
39
+ 8. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.
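The resume logic in the Process list above can be sketched roughly as follows. This is a minimal sketch, assuming the parsed `state.yaml` is already available as a plain dict; `advance_step` and `LAST_STEP` are hypothetical names, and capping the counter at the last step is an assumption rather than documented behavior.

```python
# Hypothetical sketch of the resume logic; not the skill's actual implementation.
LAST_STEP = 8  # `release-branch` is the final row in the steps table


def advance_step(state: dict, verify_passed: bool) -> dict:
    """Return a copy of `state` with `epic_cycle.step` defaulted and, if the
    current step's verify passed, incremented by one (capped at LAST_STEP)."""
    cycle = dict(state.get("epic_cycle", {}))
    # Process item: if `epic_cycle.step` is missing, set it to 1.
    cycle.setdefault("step", 1)
    # Process item: increment only after the step's verify passes.
    if verify_passed and cycle["step"] < LAST_STEP:
        cycle["step"] += 1
    return {**state, "epic_cycle": cycle}
```

The function returns a new dict instead of mutating its input, so a failed verify leaves the caller's state untouched.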
38
40
 
39
41
  ### Step 6 — audit-code gate (non-optional)
40
42
 
@@ -0,0 +1,323 @@
1
+ ---
2
+ description: > AI-powered security analysis of code changes — traces data flow, detects injection, auth bypass, secrets exposure, and unsafe deserialization across files. Use when reviewing pending changes, before release-branch, during verify-work Phase 5, during build-epic Step 0 threat modeling, or when the user says "security review" or "scan for vulns".
3
+ ---
4
+
5
+
6
+ # Security Review
7
+
8
+ > **HARD GATE** — Requires git context (branch with merge-base or diff). Never
9
+ > writes files outside `specs/security/`. Findings below confidence 8/10 are
10
+ > suppressed. **→ verify:** `git rev-parse HEAD >/dev/null 2>&1 && echo "ok" || echo "BLOCKED"`
11
+
12
+ ## 5-phase scan
13
+
14
+ | # | Phase | What |
15
+ |---|-------|------|
16
+ | 1 | **Scope Resolution** | Detect diff via `git diff --merge-base origin/HEAD`; resolve languages/frameworks from dependency files |
17
+ | 2 | **Context Research** | Identify existing security patterns, sanitization, auth model in the codebase |
18
+ | 3 | **Vulnerability Assessment** | Trace user input → sink; check auth boundaries, crypto, deserialization, path ops |
19
+ | 4 | **False-Positive Filtering** | Cross-check each finding against exclusion rules; reject confidence < 8 |
20
+ | 5 | **Report Generation** | Output structured markdown: file:line, severity, category, exploit scenario, fix |
21
+
22
+ ## Categories
23
+
24
+ Covered: SQLi, XSS, SSRF, command injection, auth bypass, unsafe deserialization, path traversal, IDOR, crypto flaws, secrets exposure, template injection, NoSQLi
25
+
26
+ ## Integration points
27
+
28
+ | Skill | Touchpoint |
29
+ |-------|------------|
30
+ | `build-epic` | Step 0 — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
31
+ | `plan-work` | `security:` field (none/low/medium/high) on story tasks |
32
+ | `plan-release` | +2 WSJF risk boost for HIGH+ risk epics |
33
+ | `audit-code` | Checklist: "diff scanned — no unaddressed HIGH findings" |
34
+ | `request-review` | Inject threat model categories + false-positive rules into reviewer prompt |
35
+ | `investigate-bug` | Security-impact assessment in RCA (NONE→CRITICAL) |
36
+ | `validate-fix` | Recurrence hardening check for security bugs |
37
+ | `verify-work` | Phase 5 — blocks on HIGH findings ≥ 8 confidence |
38
+ | `release-branch` | Hard gate — blocks merge if unresolved HIGH findings |
39
+
40
+ ## Report format
41
+
42
+ Each finding: **`File:Line` — Severity — Category**
43
+ - Description: how the vulnerability manifests
44
+ - Exploit scenario: concrete attack path
45
+ - Recommendation: fix with code example
46
+
47
+ ## Reference files
48
+
49
+ - [Vuln categories](REFERENCE-vuln-categories.md) — detection guidance per vuln type
50
+ - [False positives](REFERENCE-false-positives.md) — hard exclusions + precedent
51
+ - [Confidence rubric](REFERENCE-confidence-rubric.md) — scoring methodology (1–10)
52
+
53
+ ## Verify
54
+
55
+ ```bash
56
+ test -d specs/security && echo "OK: specs/security/ exists" || mkdir -p specs/security
57
+ grep -q "Merge-base\|merge.base\|git diff" SKILL.md && echo "OK: git context verified"
58
+ ```
59
+
60
+ ---
61
+
62
+ # Confidence Scoring Rubric
63
+
64
+ Every finding that survives Phase 4 false-positive filtering receives a confidence
65
+ score from 1 (speculative) to 10 (certain). Only findings ≥ 8 are reported.
66
+
67
+ ## Score 9–10: Certain Exploit Path
68
+
69
+ **Criteria:**
70
+ - Concrete, testable exploit with clear reproduction steps
71
+ - No assumptions about uncommon configurations
72
+ - No chain of multiple unlikely conditions
73
+ - Attacker has full control over the input vector
74
+
75
+ **Examples:**
76
+ - User-supplied SQL in a `SELECT` statement with no parameterization
77
+ - `os.system(f"rm {user_path}")` where user controls the path
78
+ - Pickle deserialization of user-supplied data without any wrapping
79
+
80
+ **Severity:** HIGH
81
+
82
+ ## Score 8: Clear Vulnerability Pattern
83
+
84
+ **Criteria:**
85
+ - Well-known vulnerability pattern with standard exploitation method
86
+ - Requires specific conditions but conditions are commonly met
87
+ - Exploitability is well-documented in OWASP / CVE databases
88
+
89
+ **Examples:**
90
+ - JWT without signature verification in authentication middleware
91
+ - SSRF where attacker controls the full URL including host
92
+ - Hardcoded AWS secret key in source code
93
+
94
+ **Severity:** HIGH or MEDIUM
95
+
96
+ ## Score 7: Suspicious Pattern
97
+
98
+ **Criteria:**
99
+ - Unusual code that may indicate a vulnerability
100
+ - Requires specific conditions that may not be present
101
+ - Alternative secure interpretation is equally likely
102
+ - Defense-in-depth concern rather than direct exploit
103
+
104
+ **Examples:**
105
+ - A function accepting user input that passes through multiple layers before reaching a sink (unclear if sanitized)
106
+ - Custom encryption implementation (likely weak, but may not process sensitive data)
107
+ - Path construction that looks safe but has a subtle bypass
108
+
109
+ **Severity:** LOW or suppress
110
+
111
+ ## Score < 7: Do Not Report
112
+
113
+ **Criteria:**
114
+ - Theoretical concern without exploit path
115
+ - Requires unrealistic attacker capabilities
116
+ - Violates one or more hard exclusion rules
117
+ - Better handled by separate tooling (dependency scanner, SAST, secret scanner)
118
+ - Purely stylistic or best-practice concern without security impact
119
+
120
+ **Examples:**
121
+ - "This function doesn't validate all inputs" without proving the unvalidated input is the attack surface
122
+ - "This uses MD5" where the hash is not used for security (e.g., cache key)
123
+ - "This function could consume too much memory" (DOS exclusion)
124
+
125
+ **Action:** Suppress entirely. Do not include in report.
126
+
127
+ ## Severity Mapping
128
+
129
+ Once confidence ≥ 8 is confirmed, map to severity:
130
+
131
+ | Severity | Impact | Examples |
132
+ |----------|--------|---------|
133
+ | **CRITICAL** | Remote compromise, full data breach | RCE, auth bypass with admin escalation, SQLi with data exfiltration |
134
+ | **HIGH** | Significant security boundary crossed | SSRF to internal services, hardcoded cloud credentials, insecure deserialization |
135
+ | **MEDIUM** | Limited impact or requires conditions | Stored XSS behind auth, IDOR on non-sensitive data, weak but not broken crypto |
136
+ | **LOW** | Defense-in-depth, minimal blast radius | Missing security header, verbose error messages in non-production |
137
+
138
+ ## Quality Gate
139
+
140
+ The confidence rubric double-checks each finding against three lenses:
141
+
142
+ | Lens | Question |
143
+ |------|----------|
144
+ | **Exploitability** | Can a real attacker trigger this from a trust boundary? |
145
+ | **Actionability** | Would a security engineer accept a fix recommendation for this? |
146
+ | **Precedent** | Has this type of finding passed/failed human review before? |
147
+
148
+ ---
149
+
150
+ # False-Positive Exclusion Rules
151
+
152
+ Applied during Phase 4 of the scan. Findings matching any hard exclusion are
153
+ automatically suppressed. Precedents from prior reviews guide borderline cases.
154
+
155
+ ## Hard Exclusions
156
+
157
+ Automatically exclude findings matching these patterns:
158
+
159
+ | # | Rule | Rationale |
160
+ |---|------|-----------|
161
+ | 1 | **Denial of Service (DOS)** — resource exhaustion, CPU/memory attacks | Handled separately; not actionable in code review |
162
+ | 2 | **Secrets on disk** if otherwise secured | Secrets management is a separate concern |
163
+ | 3 | **Rate limiting** concerns | Operational, not a code vulnerability |
164
+ | 4 | **Memory consumption / CPU exhaustion** | Not actionable in diff review |
165
+ | 5 | **Input validation on non-security-critical fields** without proven exploit path | Theoretical, not concrete |
166
+ | 6 | **GitHub Actions input sanitization** unless clearly triggerable via untrusted input | Most workflow vulns are not exploitable |
167
+ | 7 | **Lack of hardening measures** | Code is not expected to implement all best practices |
168
+ | 8 | **Race conditions / timing attacks** that are theoretical | Only report if concretely problematic |
169
+ | 9 | **Outdated third-party libraries** | Managed separately by dependency scanners |
170
+ | 10 | **Memory safety** in Rust or other memory-safe languages | Impossible by language guarantees |
171
+ | 11 | **Unit test files only** | Not production risk |
172
+ | 12 | **Log spoofing** | Outputting unsanitized input to logs is not a vuln |
173
+ | 13 | **SSRF that only controls path** | Only host/protocol control is exploitable |
174
+ | 14 | **User-controlled content in AI system prompts** | Not a security vulnerability |
175
+ | 15 | **Regex injection** | Injecting untrusted content into regex is not a vuln |
176
+ | 16 | **Regex DOS** | Excluded alongside general DOS |
177
+ | 17 | **Documentation files** (.md, .txt) | Insecure docs are not code vulnerabilities |
178
+ | 18 | **Lack of audit logs** | Not a vulnerability |
179
+
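Two of the hard exclusions above (rule 11, unit test files, and rule 17, documentation files) depend only on the file path, so they can be checked before any deeper analysis. The sketch below is illustrative only: `is_hard_excluded` is a hypothetical helper, and the `test_` prefix and `tests` directory conventions are assumptions about project layout, not something the rules specify.

```python
from pathlib import PurePosixPath

DOC_SUFFIXES = {".md", ".txt"}  # rule 17: documentation files


def is_hard_excluded(path: str) -> bool:
    """Return True if a finding at `path` falls under rule 11 or rule 17."""
    p = PurePosixPath(path)
    if p.suffix.lower() in DOC_SUFFIXES:
        return True
    # Rule 11: unit test files. The naming convention here is an assumption.
    return p.name.startswith("test_") or "tests" in p.parts
```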
180
+ ## Precedent Rules
181
+
182
+ These guide borderline cases based on prior human review decisions:
183
+
184
+ | # | Precedent | Reasoning |
185
+ |---|-----------|-----------|
186
+ | 1 | **Logging high-value secrets in plaintext IS a vuln.** Logging URLs is safe. | Secrets in logs = credential exposure; URLs are not secrets |
187
+ | 2 | **UUIDs are unguessable** — no validation needed | Cryptographic property of UUID v4/v7 |
188
+ | 3 | **Environment variables and CLI flags are trusted values** | Attackers cannot modify these in secure environments |
189
+ | 4 | **Resource management issues** (memory leaks, fd leaks) are NOT valid | Operational, not security |
190
+ | 5 | **Tabnabbing, XS-Leaks, prototype pollution, open redirects** — do NOT report unless extremely high confidence | Subtle, low-impact, high false-positive rate |
191
+ | 6 | **React/Angular XSS** — safe unless `dangerouslySetInnerHTML`, `bypassSecurityTrustHtml`, etc. | Framework auto-escapes |
192
+ | 7 | **GitHub Action workflow vulns** — verify concrete attack path before reporting | Most are theoretical |
193
+ | 8 | **Client-side JS/TS auth checks** — not a vuln; server is authoritative | Client code is untrusted |
194
+ | 9 | **IPython notebook vulns** — only report if concrete untrusted-input trigger | Most are not exploitable |
195
+ | 10 | **Logging non-PII data** — not a vuln even if sensitive. Only PII/secrets/passwords. | Intent: operational logging vs credential exposure |
196
+ | 11 | **Shell script command injection** — only report if concrete untrusted-input path | Most shell scripts don't process untrusted input |
197
+
198
+ ## Confidence Scoring
199
+
200
+ Findings that survive exclusions get a confidence score (1–10):
201
+
202
+ | Range | Meaning | Action |
203
+ |-------|---------|--------|
204
+ | 9–10 | Certain exploit path, testable | Report as HIGH |
205
+ | 8 | Clear vulnerability pattern | Report as HIGH/MEDIUM |
206
+ | 7 | Suspicious, needs conditions | Report as LOW or suppress |
207
+ | <7 | Too speculative | **Do not report** |
208
+
209
+ **Hard threshold:** Only report findings with confidence ≥ 8.
210
+
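The hard threshold can be expressed as a small filter. This is a sketch under stated assumptions: the `Finding` shape and the `reportable` helper are hypothetical and only mirror the fields named in the report format.

```python
from dataclasses import dataclass

REPORT_THRESHOLD = 8  # only findings with confidence >= 8 are reported


@dataclass(frozen=True)
class Finding:
    location: str    # e.g. "app/db.py:42"
    category: str    # e.g. "SQLi"
    severity: str    # "CRITICAL", "HIGH", "MEDIUM", or "LOW"
    confidence: int  # 1 (speculative) to 10 (certain)


def reportable(findings: list[Finding]) -> list[Finding]:
    """Keep findings at or above the hard threshold, highest confidence first."""
    kept = [f for f in findings if f.confidence >= REPORT_THRESHOLD]
    return sorted(kept, key=lambda f: f.confidence, reverse=True)
```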
211
+ ## Signal Quality Criteria
212
+
213
+ For remaining findings, assess:
214
+ 1. Is there a concrete, exploitable vulnerability with a clear attack path?
215
+ 2. Does this represent a real security risk (vs theoretical best practice)?
216
+ 3. Are there specific code locations and reproduction steps?
217
+ 4. Would this finding be actionable for a security team?
218
+
219
+ ---
220
+
221
+ # Vulnerability Categories — Detection Guidance
222
+
223
+ Each category: vulnerable pattern → safe pattern → code example.
224
+
225
+ ## SQL Injection
226
+
227
+ | Aspect | Detail |
228
+ |--------|--------|
229
+ | **Vulnerable** | String interpolation in SQL queries: `f"SELECT * FROM users WHERE id = {uid}"` |
230
+ | **Safe** | Parameterized queries / ORM: `cursor.execute("SELECT * FROM users WHERE id = %s", (uid,))` |
231
+ | **Look for** | f-strings, `+` concatenation, `format()` in query builders; raw SQL in ORM `.raw()` / `.execute()` |
232
+ | **False-positive guard** | Not a FP if the input is user-controlled (HTTP param, file). Env vars and CLI flags are trusted (see exclusion rules). |
233
+
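The safe pattern in the table can be tried end to end with the standard-library `sqlite3` driver. This is a minimal sketch: the `users` table and `find_user` helper are made up for illustration, and note that `sqlite3` uses `?` placeholders where the table's example driver uses `%s`.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, uid: object) -> list[tuple]:
    # The driver binds `uid` as data, so it is never parsed as SQL.
    return conn.execute("SELECT id, name FROM users WHERE id = ?", (uid,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "grace")])

legit = find_user(conn, 1)            # matches one row
attack = find_user(conn, "1 OR 1=1")  # treated as a literal string, matches nothing
```

With string interpolation the second call would return every row; with a bound parameter the payload is compared as an ordinary value.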
234
+ ## Cross-Site Scripting (XSS)
235
+
236
+ | Aspect | Detail |
237
+ |--------|--------|
238
+ | **Vulnerable** | `element.innerHTML = userInput`, `dangerouslySetInnerHTML={{__html: userInput}}` |
239
+ | **Safe** | `element.textContent = userInput`, React JSX (auto-escaped), template engines with auto-escaping |
240
+ | **Look for** | `.innerHTML`, `document.write()`, `dangerouslySetInnerHTML`, `v-html` (Vue), `bypassSecurityTrustHtml` (Angular) |
241
+ | **False-positive guard** | React/Angular components without unsafe methods are NOT vulnerable (see exclusion rules). |
242
+
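For server-rendered HTML outside an auto-escaping framework, the same idea applies: escape user input before it reaches markup. A minimal sketch using the standard-library `html.escape`; `render_comment` is a hypothetical helper.

```python
import html


def render_comment(user_input: str) -> str:
    """Wrap user text in a paragraph, escaping HTML metacharacters first."""
    return f"<p>{html.escape(user_input)}</p>"
```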
243
+ ## Server-Side Request Forgery (SSRF)
244
+
245
+ | Aspect | Detail |
246
+ |--------|--------|
247
+ | **Vulnerable** | User-controlled URL passed to server-side HTTP client: `requests.get(user_url)` |
248
+ | **Safe** | URL allowlist validation, internal-network blocking, protocol/host restriction |
249
+ | **Look for** | User input → `fetch`, `requests.get`, `axios.get`, `urllib`, `curl`, `http.get`; host control only (path-only is excluded) |
250
+
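A host allowlist, one of the safe patterns listed above, might look like the following. This is a sketch, assuming a fixed set of permitted hosts; `ALLOWED_HOSTS` and `is_allowed_url` are hypothetical names, and the check does not cover redirects or DNS rebinding.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = {"api.example.com", "cdn.example.com"}


def is_allowed_url(url: str) -> bool:
    """Accept only https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:  # raised for malformed URLs such as bad IPv6 literals
        return False
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Checking `hostname` rather than searching the raw string matters: in `https://api.example.com@evil.test/` the allowed name is only the userinfo part, and the real host is `evil.test`.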
251
+ ## Command Injection
252
+
253
+ | Aspect | Detail |
254
+ |--------|--------|
255
+ | **Vulnerable** | User input in shell commands: `os.system(f"ping {host}")`, `subprocess.run(f"grep {pattern} file", shell=True)` |
256
+ | **Safe** | `subprocess.run(["ping", host])` with arguments as list; `shlex.quote()` |
257
+ | **Look for** | `shell=True`, `os.system`, `os.popen`, `exec()`, `eval()`, `$()`, backticks |
258
+ | **False-positive guard** | Shell scripts without untrusted user input are generally not exploitable. |
259
+
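The list-argument form can be demonstrated without relying on any particular system binary by launching the current Python interpreter as the child process. In this sketch `echo_argument` is a hypothetical helper; the point is that the payload arrives as a single argv entry and no shell ever parses it.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass `value` to a child process as one argv entry, with no shell involved."""
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, value],  # list form: no shell parsing
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `echo_argument("localhost; echo INJECTED")` returns the string unchanged, whereas the same value interpolated into a `shell=True` command would run a second command.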
260
+ ## Authentication/Authorization Bypass
261
+
262
+ | Aspect | Detail |
263
+ |--------|--------|
264
+ | **Vulnerable** | Missing auth check on protected endpoint; JWT without signature verification; hardcoded admin tokens |
265
+ | **Safe** | Consistent auth middleware; JWT with `RS256`/`HS256` verification; role-based access control |
266
+ | **Look for** | Routes without auth decorators; `@login_required` / `@require_auth` missing; JWT without `.verify()`; client-side auth checks only |
267
+
268
+ ## Unsafe Deserialization
269
+
270
+ | Aspect | Detail |
271
+ |--------|--------|
272
+ | **Vulnerable** | `pickle.load(user_data)`, `yaml.load(user_input)`, `JSON.parse()` on untrusted tokens, `eval(input())` |
273
+ | **Safe** | `yaml.safe_load()`, `json.loads()` (safe for JSON), `torch.load(..., weights_only=True)` (PyTorch), schema validation |
274
+ | **Look for** | `pickle.load`, `yaml.load` (not safe_load), `torch.load(weights_only=False)`, `eval`, `marshal.load`, `node-serialize` |
275
+
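As an illustration of the safe column, untrusted input can be parsed as JSON and then checked against an expected shape. This is a sketch only: the `name`/`age` profile schema and `load_profile` are invented for the example.

```python
import json


def load_profile(raw: str) -> dict:
    """Parse untrusted text as JSON and check its shape before use."""
    data = json.loads(raw)  # JSON parsing cannot execute code, unlike pickle
    if not isinstance(data, dict):
        raise ValueError("profile must be a JSON object")
    name, age = data.get("name"), data.get("age")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(name, str) or not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("profile needs a string 'name' and an integer 'age'")
    return {"name": name, "age": age}  # unknown keys are dropped
```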
276
+ ## Path Traversal
277
+
278
+ | Aspect | Detail |
279
+ |--------|--------|
280
+ | **Vulnerable** | User input in file paths: `open(f"/data/{filename}")`, `path.join(base, user_path)` |
281
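+ | **Safe** | Path normalization + containment check: `os.path.commonpath([BASE_DIR, os.path.realpath(path)]) == BASE_DIR` (a bare string-prefix check also accepts sibling directories such as `/data-old` for `/data`); allowlist of valid filenames |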
282
+ | **Look for** | `open()`, `read_file()`, `os.path.join` with user input; `../` traversal without normalization |
283
+
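A containment check along the lines of the safe pattern might be written as below. This is a minimal sketch; `resolve_inside` is a hypothetical helper, and it uses `os.path.commonpath` so that the comparison works on whole path components.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Return the real path of `user_path` under `base_dir`, or raise ValueError."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    # commonpath compares whole path components, so "/data-old" does not pass
    # as being inside "/data" the way a bare startswith() check would allow.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return target
```

Resolving with `realpath` before the comparison also covers symlinks that point outside the base directory.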
284
+ ## Insecure Direct Object Reference (IDOR)
285
+
286
+ | Aspect | Detail |
287
+ |--------|--------|
288
+ | **Vulnerable** | API endpoint uses user-supplied ID without ownership check: `GET /api/order/{order_id}` — returns any user's order |
289
+ | **Safe** | Ownership verification: verify `order.user_id == current_user.id` before returning data |
290
+ | **Look for** | CRUD endpoints that accept IDs without authorization; horizontal/vertical privilege checks missing |
291
+
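The ownership check from the safe row can be sketched as follows. The `Order` type, the in-memory `ORDERS` store, and `get_order` are all hypothetical stand-ins for a real data layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    id: int
    user_id: int
    total_cents: int


# Hypothetical in-memory store standing in for a database table.
ORDERS = {
    1: Order(1, user_id=10, total_cents=500),
    2: Order(2, user_id=20, total_cents=900),
}


def get_order(order_id: int, current_user_id: int) -> Order:
    """Return the order only if it belongs to the requesting user."""
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours" so callers cannot probe for valid IDs.
    if order is None or order.user_id != current_user_id:
        raise PermissionError("order not found")
    return order
```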
292
+ ## Weak Cryptography
293
+
294
+ | Aspect | Detail |
295
+ |--------|--------|
296
+ | **Vulnerable** | MD5/SHA1 for passwords; ECB mode; hardcoded keys; `random` module (not `secrets`); short key lengths |
297
+ | **Safe** | `bcrypt`/`argon2` for passwords; AES-GCM; `secrets` module; RSA 2048+; proper IV generation |
298
+ | **Look for** | `md5`, `sha1`, `DES`, `ECB`, `PKCS1_v1_5`, `random` for crypto, hardcoded `key=`, `Crypto.Cipher` without AEAD |
299
+
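The safe column names `bcrypt`/`argon2`, which need third-party packages. To stay within the standard library, the sketch below substitutes PBKDF2-HMAC-SHA256 from `hashlib`; that swap, the round count, and the helper names are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

PBKDF2_ROUNDS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) using a fresh random salt."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, PBKDF2_ROUNDS)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key and compare it in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, PBKDF2_ROUNDS)
    return hmac.compare_digest(key, expected)


reset_token = secrets.token_urlsafe(32)  # CSPRNG-backed, unlike the `random` module
```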
300
+ ## Secrets Exposure
301
+
302
+ | Aspect | Detail |
303
+ |--------|--------|
304
+ | **Vulnerable** | Hardcoded API keys, passwords, tokens in source code; secrets in logs; secrets in client-side code |
305
+ | **Safe** | Environment variables; secret manager (AWS Secrets Manager, HashiCorp Vault); `.env` excluded from VCS |
306
+ | **Look for** | `API_KEY=`, `password=`, `secret=`, `token=` in code; AWS keys, GitHub tokens, Stripe keys, JWTs in source |
307
+ | **False-positive guard** | Secrets stored on disk but otherwise secured ARE excluded. Logging high-value secrets IS a vuln. Logging URLs is safe. |
308
+
309
+ ## Template Injection (SSTI)
310
+
311
+ | Aspect | Detail |
312
+ |--------|--------|
313
+ | **Vulnerable** | User input in template rendering: `Template(user_input).render()`, `render_template_string(user_input)` |
314
+ | **Safe** | Static templates; input passed as context variable, not template string |
315
+ | **Look for** | `render_template_string`, `Template()` with user string; `eval` in template context; `${user_input}` in JS template literals on server |
316
+
317
+ ## NoSQL Injection
318
+
319
+ | Aspect | Detail |
320
+ |--------|--------|
321
+ | **Vulnerable** | User input in MongoDB queries: `db.users.find({username: user_input})` where input is `{"$gt": ""}` |
322
+ | **Safe** | Schema validation; type checking on query params; ORM sanitization |
323
+ | **Look for** | MongoDB `$where`, `$gt`, `$regex` from user input; raw mongo queries without type coercion |
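The type-checking mitigation can be sketched without a database driver, since the problem is in how the filter document is built. `build_user_query` is a hypothetical helper; the returned dict is the kind of filter one would pass to a MongoDB `find` call.

```python
def build_user_query(username: object) -> dict:
    """Build a MongoDB-style filter, accepting only a plain string username."""
    # A JSON body can make `username` a dict such as {"$gt": ""}; a type check rejects it.
    if not isinstance(username, str):
        raise TypeError("username must be a string")
    # `$eq` keeps the value in a literal position inside the filter.
    return {"username": {"$eq": username}}
```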
@@ -15,10 +15,11 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
15
15
  >
16
16
  > **HARD GATE** — Not on `main`/`master` before step 3 (kickoff-branch).
17
17
 
18
- ## Eight steps (`epic_cycle` in state.yaml)
18
+ ## Nine steps (`epic_cycle` in state.yaml)
19
19
 
20
20
  | Step | Skill / action |
21
21
  |------|----------------|
22
+ | 0 | `security-review` — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
22
23
  | 1 | `survey-context` — confirm epic + story |
23
24
  | 2 | `plan-work` — flesh out story `tasks[]` in `specs/epics/eNN-slug/epic.yaml` |
24
25
  | 3 | `kickoff-branch` — feature branch + clean baseline |
@@ -26,17 +27,18 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
26
27
  | 5 | `verify-work` — UAT + mechanical gates |
27
28
  | 6 | `audit-code` — **non-optional gate** (pass/fail; fail → loop back to step 4) |
28
29
  | 7 | `commit-message` — Conventional Commits draft |
29
- | 8 | `release-branch` — PR or solo land (supports `--squash-state`) |
30
+ | 8 | `release-branch` — PR or solo land (supports `--squash-state`) | |
30
31
 
31
32
  ## Process
32
33
 
33
34
  1. Read `specs/state.yaml`, `specs/execution-status.yaml`, `specs/release-plan.yaml`, active `specs/epics/eNN-slug/epic.yaml`.
34
- 2. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate — require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
35
- 3. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
36
- 3. If `epic_cycle.step` missing, set to `1`.
37
- 4. Run **only the current step** (resume mode) unless user asked for full auto-run.
38
- 5. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
39
- 6. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.
35
+ 2. **Step 0 Threat Model:** Run `security-review` against the epic's scope (read from the epic capsule). Output `specs/security/epics/<epic-id>/THREAT_MODEL.md` with surface area, vulnerability categories, risk level, and mitigation guidance.
36
+ 3. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate: require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
37
+ 4. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
38
+ 5. If `epic_cycle.step` missing, set to `1`.
39
+ 6. Run **only the current step** (resume mode) unless user asked for full auto-run.
40
+ 7. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
41
+ 8. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.
40
42
 
41
43
  ### Step 6 — audit-code gate (non-optional)
42
44
 
@@ -0,0 +1,324 @@
1
+ ---
2
+ name: security-review
3
+ description: "> AI-powered security analysis of code changes — traces data flow, detects injection, auth bypass, secrets exposure, and unsafe deserialization across files. Use when reviewing pending changes, before release-branch, during verify-work Phase 5, during build-epic Step 0 threat modeling, or when the user says \"security review\" or \"scan for vulns\"."
4
+ ---
5
+
6
+
7
+ # Security Review
8
+
9
+ > **HARD GATE** — Requires git context (branch with merge-base or diff). Never
10
+ > writes files outside `specs/security/`. Findings below confidence 8/10 are
11
+ > suppressed. **→ verify:** `git rev-parse HEAD >/dev/null 2>&1 && echo "ok" || echo "BLOCKED"`
12
+
13
+ ## 5-phase scan
14
+
15
+ | # | Phase | What |
16
+ |---|-------|------|
17
+ | 1 | **Scope Resolution** | Detect diff via `git diff --merge-base origin/HEAD`; resolve languages/frameworks from dependency files |
18
+ | 2 | **Context Research** | Identify existing security patterns, sanitization, auth model in the codebase |
19
+ | 3 | **Vulnerability Assessment** | Trace user input → sink; check auth boundaries, crypto, deserialization, path ops |
20
+ | 4 | **False-Positive Filtering** | Cross-check each finding against exclusion rules; reject confidence < 8 |
21
+ | 5 | **Report Generation** | Output structured markdown: file:line, severity, category, exploit scenario, fix |
22
+
23
+ ## Categories
24
+
25
+ Covered: SQLi, XSS, SSRF, command injection, auth bypass, unsafe deserialization, path traversal, IDOR, crypto flaws, secrets exposure, template injection, NoSQLi
26
+
27
+ ## Integration points
28
+
29
+ | Skill | Touchpoint |
30
+ |-------|------------|
31
+ | `build-epic` | Step 0 — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
32
+ | `plan-work` | `security:` field (none/low/medium/high) on story tasks |
33
+ | `plan-release` | +2 WSJF risk boost for HIGH+ risk epics |
34
+ | `audit-code` | Checklist: "diff scanned — no unaddressed HIGH findings" |
35
+ | `request-review` | Inject threat model categories + false-positive rules into reviewer prompt |
36
+ | `investigate-bug` | Security-impact assessment in RCA (NONE→CRITICAL) |
37
+ | `validate-fix` | Recurrence hardening check for security bugs |
38
+ | `verify-work` | Phase 5 — blocks on HIGH findings ≥ 8 confidence |
39
+ | `release-branch` | Hard gate — blocks merge if unresolved HIGH findings |
40
+
41
+ ## Report format
42
+
43
+ Each finding: **`File:Line` — Severity — Category**
44
+ - Description: how the vulnerability manifests
45
+ - Exploit scenario: concrete attack path
46
+ - Recommendation: fix with code example
47
+
48
+ ## Reference files
49
+
50
+ - [Vuln categories](REFERENCE-vuln-categories.md) — detection guidance per vuln type
51
+ - [False positives](REFERENCE-false-positives.md) — hard exclusions + precedent
52
+ - [Confidence rubric](REFERENCE-confidence-rubric.md) — scoring methodology (1–10)
53
+
54
+ ## Verify
55
+
56
+ ```bash
57
+ test -d specs/security && echo "OK: specs/security/ exists" || mkdir -p specs/security
58
+ grep -q "Merge-base\|merge.base\|git diff" SKILL.md && echo "OK: git context verified"
59
+ ```
60
+
61
+ ---
62
+
63
+ # Confidence Scoring Rubric
64
+
65
+ Every finding that survives Phase 4 false-positive filtering receives a confidence
66
+ score from 1 (speculative) to 10 (certain). Only findings ≥ 8 are reported.
67
+
68
+ ## Score 9–10: Certain Exploit Path
69
+
70
+ **Criteria:**
71
+ - Concrete, testable exploit with clear reproduction steps
72
+ - No assumptions about uncommon configurations
73
+ - No chain of multiple unlikely conditions
74
+ - Attacker has full control over the input vector
75
+
76
+ **Examples:**
77
+ - User-supplied SQL in a `SELECT` statement with no parameterization
78
+ - `os.system(f"rm {user_path}")` where user controls the path
79
+ - Pickle deserialization of user-supplied data without any wrapping
80
+
81
+ **Severity:** HIGH
82
+
83
+ ## Score 8: Clear Vulnerability Pattern
84
+
85
+ **Criteria:**
86
+ - Well-known vulnerability pattern with standard exploitation method
87
+ - Requires specific conditions but conditions are commonly met
88
+ - Exploitability is well-documented in OWASP / CVE databases
89
+
90
+ **Examples:**
91
+ - JWT without signature verification in authentication middleware
92
+ - SSRF where attacker controls the full URL including host
93
+ - Hardcoded AWS secret key in source code
94
+
95
+ **Severity:** HIGH or MEDIUM
96
+
97
+ ## Score 7: Suspicious Pattern
98
+
99
+ **Criteria:**
100
+ - Unusual code that may indicate a vulnerability
101
+ - Requires specific conditions that may not be present
102
+ - Alternative secure interpretation is equally likely
103
+ - Defense-in-depth concern rather than direct exploit
104
+
105
+ **Examples:**
106
+ - A function accepting user input that passes through multiple layers before reaching a sink (unclear if sanitized)
107
+ - Custom encryption implementation (likely weak, but may not process sensitive data)
108
+ - Path construction that looks safe but has a subtle bypass
109
+
110
+ **Severity:** LOW or suppress
111
+
112
+ ## Score < 7: Do Not Report
113
+
114
+ **Criteria:**
115
+ - Theoretical concern without exploit path
116
+ - Requires unrealistic attacker capabilities
117
+ - Violates one or more hard exclusion rules
118
+ - Better handled by separate tooling (dependency scanner, SAST, secret scanner)
119
+ - Purely stylistic or best-practice concern without security impact
120
+
121
+ **Examples:**
122
+ - "This function doesn't validate all inputs" without proving the unvalidated input is the attack surface
123
+ - "This uses MD5" where the hash is not used for security (e.g., cache key)
124
+ - "This function could consume too much memory" (DOS exclusion)
125
+
126
+ **Action:** Suppress entirely. Do not include in report.
127
+
128
+ ## Severity Mapping
129
+
130
+ Once confidence ≥ 8 is confirmed, map to severity:
131
+
132
+ | Severity | Impact | Examples |
133
+ |----------|--------|---------|
134
+ | **CRITICAL** | Remote compromise, full data breach | RCE, auth bypass with admin escalation, SQLi with data exfiltration |
135
+ | **HIGH** | Significant security boundary crossed | SSRF to internal services, hardcoded cloud credentials, insecure deserialization |
136
+ | **MEDIUM** | Limited impact or requires conditions | Stored XSS behind auth, IDOR on non-sensitive data, weak but not broken crypto |
137
+ | **LOW** | Defense-in-depth, minimal blast radius | Missing security header, verbose error messages in non-production |
138
+
139
+ ## Quality Gate
140
+
141
+ The confidence rubric double-checks each finding against three lenses:
142
+
143
+ | Lens | Question |
144
+ |------|----------|
145
+ | **Exploitability** | Can a real attacker trigger this from a trust boundary? |
146
+ | **Actionability** | Would a security engineer accept a fix recommendation for this? |
147
+ | **Precedent** | Has this type of finding passed/failed human review before? |
148
+
149
+ ---
150
+
151
+ # False-Positive Exclusion Rules
152
+
153
+ Applied during Phase 4 of the scan. Findings matching any hard exclusion are
154
+ automatically suppressed. Precedents from prior reviews guide borderline cases.
155
+
156
+ ## Hard Exclusions
157
+
158
+ Automatically exclude findings matching these patterns:
159
+
160
+ | # | Rule | Rationale |
161
+ |---|------|-----------|
162
+ | 1 | **Denial of Service (DOS)** — resource exhaustion, CPU/memory attacks | Handled separately; not actionable in code review |
163
+ | 2 | **Secrets on disk** if otherwise secured | Secrets management is a separate concern |
164
+ | 3 | **Rate limiting** concerns | Operational, not a code vulnerability |
165
+ | 4 | **Memory consumption / CPU exhaustion** | Not actionable in diff review |
166
+ | 5 | **Input validation on non-security-critical fields** without proven exploit path | Theoretical, not concrete |
167
+ | 6 | **GitHub Actions input sanitization** unless clearly triggerable via untrusted input | Most workflow vulns are not exploitable |
168
+ | 7 | **Lack of hardening measures** | Code is not expected to implement all best practices |
169
+ | 8 | **Race conditions / timing attacks** that are theoretical | Only report if concretely problematic |
170
+ | 9 | **Outdated third-party libraries** | Managed separately by dependency scanners |
171
+ | 10 | **Memory safety** in Rust or other memory-safe languages | Ruled out by language guarantees outside `unsafe` code |
172
+ | 11 | **Unit test files only** | Not production risk |
173
+ | 12 | **Log spoofing** | Outputting unsanitized input to logs is not a vuln |
174
+ | 13 | **SSRF that only controls path** | Only host/protocol control is exploitable |
175
+ | 14 | **User-controlled content in AI system prompts** | Not a security vulnerability |
176
+ | 15 | **Regex injection** | Injecting untrusted content into regex is not a vuln |
177
+ | 16 | **Regex DOS** | Excluded alongside general DOS |
178
+ | 17 | **Documentation files** (.md, .txt) | Insecure docs are not code vulnerabilities |
179
+ | 18 | **Lack of audit logs** | Not a vulnerability |
180
+
181
+ ## Precedent Rules
182
+
183
+ These guide borderline cases based on prior human review decisions:
184
+
185
+ | # | Precedent | Reasoning |
186
+ |---|-----------|-----------|
187
+ | 1 | **Logging high-value secrets in plaintext IS a vuln.** Logging URLs is safe. | Secrets in logs = credential exposure; URLs are not secrets |
188
+ | 2 | **UUIDs are unguessable** — no validation needed | Cryptographic property of UUID v4/v7 |
189
+ | 3 | **Environment variables and CLI flags are trusted values** | Attackers cannot modify these in secure environments |
190
+ | 4 | **Resource management issues** (memory leaks, fd leaks) are NOT valid | Operational, not security |
191
+ | 5 | **Tabnabbing, XS-Leaks, prototype pollution, open redirects** — do NOT report unless extremely high confidence | Subtle, low-impact, high false-positive rate |
192
+ | 6 | **React/Angular XSS** — safe unless `dangerouslySetInnerHTML`, `bypassSecurityTrustHtml`, etc. | Framework auto-escapes |
193
+ | 7 | **GitHub Action workflow vulns** — verify concrete attack path before reporting | Most are theoretical |
194
+ | 8 | **Client-side JS/TS auth checks** — not a vuln; server is authoritative | Client code is untrusted |
195
+ | 9 | **IPython notebook vulns** — only report if concrete untrusted-input trigger | Most are not exploitable |
196
+ | 10 | **Logging non-PII data** — not a vuln even if sensitive. Only PII/secrets/passwords. | Intent: operational logging vs credential exposure |
197
+ | 11 | **Shell script command injection** — only report if concrete untrusted-input path | Most shell scripts don't process untrusted input |
198
+
199
+ ## Confidence Scoring
200
+
201
+ Findings that survive exclusions get a confidence score (1–10):
202
+
203
+ | Range | Meaning | Action |
204
+ |-------|---------|--------|
205
+ | 9–10 | Certain exploit path, testable | Report as HIGH |
206
+ | 8 | Clear vulnerability pattern | Report as HIGH/MEDIUM |
207
+ | 7 | Suspicious, needs conditions | Report as LOW or suppress |
208
+ | <7 | Too speculative | **Do not report** |
209
+
210
+ **Hard threshold:** Only report findings with confidence ≥ 8.
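
The table and the hard threshold above can be read as a small decision function. The sketch below is only an illustration: `triage`, `REPORT_THRESHOLD`, and the returned labels are hypothetical names, not part of the skill. It assumes the hard threshold wins, so a score of 7 is suppressed even though the table lists it as "LOW or suppress".

```python
REPORT_THRESHOLD = 8  # mirrors the "Hard threshold" line above


def triage(confidence: int) -> str:
    """Map a 1-10 confidence score to an action label (labels are hypothetical)."""
    if not 1 <= confidence <= 10:
        raise ValueError(f"confidence must be in 1..10, got {confidence}")
    if confidence >= 9:
        return "report:HIGH"          # certain exploit path
    if confidence >= REPORT_THRESHOLD:
        return "report:HIGH/MEDIUM"   # clear vulnerability pattern
    return "suppress"                 # 7 and below never reach the report


# Example: a suspicious-but-unproven finding is dropped.
decision = triage(7)
```
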
211
+
212
+ ## Signal Quality Criteria
213
+
214
+ For remaining findings, assess:
215
+ 1. Is there a concrete, exploitable vulnerability with a clear attack path?
216
+ 2. Does this represent a real security risk (vs theoretical best practice)?
217
+ 3. Are there specific code locations and reproduction steps?
218
+ 4. Would this finding be actionable for a security team?
219
+
220
+ ---
221
+
222
+ # Vulnerability Categories — Detection Guidance
223
+
224
+ Each category: vulnerable pattern → safe pattern → code example.
225
+
226
+ ## SQL Injection
227
+
228
+ | Aspect | Detail |
229
+ |--------|--------|
230
+ | **Vulnerable** | String interpolation in SQL queries: `f"SELECT * FROM users WHERE id = {uid}"` |
231
+ | **Safe** | Parameterized queries / ORM: `cursor.execute("SELECT * FROM users WHERE id = %s", (uid,))` |
232
+ | **Look for** | f-strings, `+` concatenation, `format()` in query builders; raw SQL in ORM `.raw()` / `.execute()` |
233
+ | **False-positive guard** | Not a FP if the input is user-controlled (HTTP param, file). Env vars and CLI flags are trusted (see exclusion rules). |
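
As a minimal sketch of the safe pattern in the table, the example below uses Python's standard `sqlite3` module with an in-memory database. The `users` table, its rows, and `find_user` are assumptions made for illustration. Note that `sqlite3` uses `?` placeholders, while drivers such as psycopg2 use the `%s` style shown in the table.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str):
    # The driver binds `name` as data; it is never spliced into the SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "bob")])

rows_ok = find_user(conn, "ada")
# A classic injection payload is compared as a literal string and matches nothing.
rows_attack = find_user(conn, "ada' OR '1'='1")
```
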
234
+
235
+ ## Cross-Site Scripting (XSS)
236
+
237
+ | Aspect | Detail |
238
+ |--------|--------|
239
+ | **Vulnerable** | `element.innerHTML = userInput`, `dangerouslySetInnerHTML={{__html: userInput}}` |
240
+ | **Safe** | `element.textContent = userInput`, React JSX (auto-escaped), template engines with auto-escaping |
241
+ | **Look for** | `.innerHTML`, `document.write()`, `dangerouslySetInnerHTML`, `v-html` (Vue), `bypassSecurityTrustHtml` (Angular) |
242
+ | **False-positive guard** | React/Angular components without unsafe methods are NOT vulnerable (see exclusion rules). |
243
+
244
+ ## Server-Side Request Forgery (SSRF)
245
+
246
+ | Aspect | Detail |
247
+ |--------|--------|
248
+ | **Vulnerable** | User-controlled URL passed to server-side HTTP client: `requests.get(user_url)` |
249
+ | **Safe** | URL allowlist validation, internal-network blocking, protocol/host restriction |
250
+ | **Look for** | User input → `fetch`, `requests.get`, `axios.get`, `urllib`, `curl`, `http.get`; host control only (path-only is excluded) |
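
The "URL allowlist validation" and "protocol/host restriction" entries can be sketched with the standard library alone. `ALLOWED_HOSTS` and `is_allowed_url` are hypothetical names, and the sketch deliberately ignores redirects and DNS rebinding, which a production check would also need to consider.

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = frozenset({"api.example.com"})  # hypothetical allowlist


def is_allowed_url(url: str) -> bool:
    """Accept only https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lowercased; userinfo such as "user@" is stripped
    except ValueError:
        return False  # malformed URL, e.g. an unbalanced IPv6 bracket
    return parts.scheme == "https" and host in ALLOWED_HOSTS
```

Checking `hostname` rather than the raw string matters: `https://api.example.com@evil.test/` contains the allowed name but resolves to `evil.test`.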
251
+
252
+ ## Command Injection
253
+
254
+ | Aspect | Detail |
255
+ |--------|--------|
256
+ | **Vulnerable** | User input in shell commands: `os.system(f"ping {host}")`, `subprocess.run(f"grep {pattern} file", shell=True)` |
257
+ | **Safe** | `subprocess.run(["ping", host])` with arguments as list; `shlex.quote()` |
258
+ | **Look for** | `shell=True`, `os.system`, `os.popen`, `exec()`, `eval()`, `$()`, backticks |
259
+ | **False-positive guard** | Shell scripts without untrusted user input are generally not exploitable. |
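
To make the list-argument pattern concrete, the sketch below passes a hostile string to a child process without a shell. It launches the current Python interpreter instead of `ping` so that it runs anywhere; `echo_arg` is a hypothetical helper written for this illustration.

```python
import shlex
import subprocess
import sys


def echo_arg(value: str) -> str:
    # Arguments are passed as a list and no shell is involved, so characters
    # such as ";" or "$()" in `value` are never interpreted as shell syntax.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


payload = "example.com; echo INJECTED"
echoed = echo_arg(payload)      # the child receives the payload as one argument
quoted = shlex.quote(payload)   # fallback when a POSIX shell string is unavoidable
```
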
260
+
261
+ ## Authentication/Authorization Bypass
262
+
263
+ | Aspect | Detail |
264
+ |--------|--------|
265
+ | **Vulnerable** | Missing auth check on protected endpoint; JWT without signature verification; hardcoded admin tokens |
266
+ | **Safe** | Consistent auth middleware; JWT with `RS256`/`HS256` verification; role-based access control |
267
+ | **Look for** | Routes without auth decorators; `@login_required` / `@require_auth` missing; JWT without `.verify()`; client-side auth checks only |
268
+
269
+ ## Unsafe Deserialization
270
+
271
+ | Aspect | Detail |
272
+ |--------|--------|
273
+ | **Vulnerable** | `pickle.load(user_data)`, `yaml.load(user_input)`, `JSON.parse()` on untrusted tokens, `eval(input())` |
274
+ | **Safe** | `yaml.safe_load()`, `json.loads()` (safe for JSON), `torch.load(weights_only=True)` (PyTorch), schema validation |
275
+ | **Look for** | `pickle.load`, `yaml.load` (not safe_load), `torch.load(weights_only=False)`, `eval`, `marshal.load`, `node-serialize` |
276
+
277
+ ## Path Traversal
278
+
279
+ | Aspect | Detail |
280
+ |--------|--------|
281
+ | **Vulnerable** | User input in file paths: `open(f"/data/{filename}")`, `path.join(base, user_path)` |
282
+ | **Safe** | Path normalization + prefix check: `os.path.realpath(path).startswith(BASE_DIR + os.sep)`; allowlist of valid filenames |
283
+ | **Look for** | `open()`, `read_file()`, `os.path.join` with user input; `../` traversal without normalization |
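
The normalization-plus-containment check can be sketched as follows. A bare string-prefix test without a trailing separator would accept a sibling directory such as `/data-evil`, so this sketch swaps in `os.path.commonpath`, which compares whole path components. `resolve_under_base` is a hypothetical helper, not part of the skill.

```python
import os


def resolve_under_base(base_dir: str, user_path: str) -> str:
    """Return the normalized path, or raise ValueError if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses "../" segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Note that an absolute `user_path` makes `os.path.join` discard the base entirely, which is why the containment check runs on the joined and normalized result rather than on the raw input.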
284
+
285
+ ## Insecure Direct Object Reference (IDOR)
286
+
287
+ | Aspect | Detail |
288
+ |--------|--------|
289
+ | **Vulnerable** | API endpoint uses user-supplied ID without ownership check: `GET /api/order/{order_id}` — returns any user's order |
290
+ | **Safe** | Ownership verification: verify `order.user_id == current_user.id` before returning data |
291
+ | **Look for** | CRUD endpoints that accept IDs without authorization; horizontal/vertical privilege checks missing |
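
The ownership check in the "Safe" row might look like the sketch below. `Order`, `ORDERS`, `Forbidden`, and `get_order` are all hypothetical stand-ins for a real data layer and web framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    id: int
    user_id: int
    total_cents: int


# Stand-in for a database table.
ORDERS = {
    1: Order(1, user_id=10, total_cents=4200),
    2: Order(2, user_id=20, total_cents=999),
}


class Forbidden(Exception):
    """Raised when the caller may not see the requested object."""


def get_order(order_id: int, current_user_id: int) -> Order:
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours", so IDs cannot be enumerated.
    if order is None or order.user_id != current_user_id:
        raise Forbidden(f"order {order_id} not available")
    return order
```
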
292
+
293
+ ## Weak Cryptography
294
+
295
+ | Aspect | Detail |
296
+ |--------|--------|
297
+ | **Vulnerable** | MD5/SHA1 for passwords; ECB mode; hardcoded keys; `random` module (not `secrets`); short key lengths |
298
+ | **Safe** | `bcrypt`/`argon2` for passwords; AES-GCM; `secrets` module; RSA 2048+; proper IV generation |
299
+ | **Look for** | `md5`, `sha1`, `DES`, `ECB`, `PKCS1_v1_5`, `random` for crypto, hardcoded `key=`, `Crypto.Cipher` without AEAD |
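
A standard-library sketch of the safe side of this table follows. It swaps in `hashlib.scrypt` for the `bcrypt`/`argon2` libraries named above, purely so that the example has no third-party dependency; the cost parameters and helper names are assumptions, not recommendations from the skill.

```python
import hashlib
import hmac
import secrets

# Assumed scrypt cost parameters for illustration; tune for the deployment.
_SCRYPT = {"n": 2**14, "r": 8, "p": 1}


def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = secrets.token_bytes(16)  # CSPRNG, not the `random` module
    digest = hashlib.scrypt(password.encode(), salt=salt, **_SCRYPT)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.scrypt(password.encode(), salt=salt, **_SCRYPT)
    return hmac.compare_digest(candidate, expected)  # constant-time comparison


session_token = secrets.token_urlsafe(32)  # unpredictable token for sessions or resets
```
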
300
+
301
+ ## Secrets Exposure
302
+
303
+ | Aspect | Detail |
304
+ |--------|--------|
305
+ | **Vulnerable** | Hardcoded API keys, passwords, tokens in source code; secrets in logs; secrets in client-side code |
306
+ | **Safe** | Environment variables; secret manager (AWS Secrets Manager, HashiCorp Vault); `.env` excluded from VCS |
307
+ | **Look for** | `API_KEY=`, `password=`, `secret=`, `token=` in code; AWS keys, GitHub tokens, Stripe keys, JWTs in source |
308
+ | **False-positive guard** | Secrets stored on disk but otherwise secured ARE excluded. Logging high-value secrets IS a vuln. Logging URLs is safe. |
309
+
310
+ ## Template Injection (SSTI)
311
+
312
+ | Aspect | Detail |
313
+ |--------|--------|
314
+ | **Vulnerable** | User input in template rendering: `Template(user_input).render()`, `render_template_string(user_input)` |
315
+ | **Safe** | Static templates; input passed as context variable, not template string |
316
+ | **Look for** | `render_template_string`, `Template()` with user string; `eval` in template context; `${user_input}` in JS template literals on server |
317
+
318
+ ## NoSQL Injection
319
+
320
+ | Aspect | Detail |
321
+ |--------|--------|
322
+ | **Vulnerable** | User input in MongoDB queries: `db.users.find({username: user_input})` where input is `{"$gt": ""}` |
323
+ | **Safe** | Schema validation; type checking on query params; ORM sanitization |
324
+ | **Look for** | MongoDB `$where`, `$gt`, `$regex` from user input; raw mongo queries without type coercion |
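
The "type checking on query params" guard can be sketched without a database driver. `username_filter` is a hypothetical helper; the dictionary it returns is what would be handed to a call such as `db.users.find(...)`.

```python
def username_filter(raw) -> dict:
    """Build a MongoDB-style filter, rejecting operator objects such as {"$gt": ""}."""
    # JSON bodies can deliver dicts or lists where a string is expected;
    # only a plain string is allowed to reach the query.
    if not isinstance(raw, str):
        raise TypeError(f"username must be a string, got {type(raw).__name__}")
    return {"username": raw}
```
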
package/CHANGELOG.md CHANGED
@@ -1,3 +1,17 @@
1
+ # [2.36.0](https://github.com/danielvm-git/bigpowers/compare/v2.35.0...v2.36.0) (2026-06-27)
2
+
3
+
4
+ ### Features
5
+
6
+ * **build-epic:** add Step 0 threat model to epic cycle ([9977f3e](https://github.com/danielvm-git/bigpowers/commit/9977f3e491b9cdefef644e145fa2331fe0ce1154))
7
+
8
+ # [2.35.0](https://github.com/danielvm-git/bigpowers/compare/v2.34.2...v2.35.0) (2026-06-27)
9
+
10
+
11
+ ### Features
12
+
13
+ * **security-review:** add security-review skill with lifecycle integration ([932171a](https://github.com/danielvm-git/bigpowers/commit/932171a6526c4f9465c0d3768877dd2ad7775917))
14
+
1
15
  ## [2.34.2](https://github.com/danielvm-git/bigpowers/compare/v2.34.1...v2.34.2) (2026-06-27)
2
16
 
3
17
 
package/SKILL-INDEX.md CHANGED
@@ -3,8 +3,8 @@
3
3
  > **DO NOT EDIT** — This file is auto-generated by `scripts/generate-skill-index.sh`.
4
4
  > Edit `SKILL.md` source files or `skills-lock.json` instead. Run `bash scripts/sync-skills.sh` to regenerate.
5
5
 
6
- **Generated:** 2026-06-27T00:46:04Z
7
- **Skills:** 70
6
+ **Generated:** 2026-06-27T16:37:32Z
7
+ **Skills:** 71
8
8
 
9
9
  ---
10
10
 
@@ -14,10 +14,11 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
14
14
  >
15
15
  > **HARD GATE** — Not on `main`/`master` before step 3 (kickoff-branch).
16
16
 
17
- ## Eight steps (`epic_cycle` in state.yaml)
17
+ ## Nine steps (`epic_cycle` in state.yaml)
18
18
 
19
19
  | Step | Skill / action |
20
20
  |------|----------------|
21
+ | 0 | `security-review` — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
21
22
  | 1 | `survey-context` — confirm epic + story |
22
23
  | 2 | `plan-work` — flesh out story `tasks[]` in `specs/epics/eNN-slug/epic.yaml` |
23
24
  | 3 | `kickoff-branch` — feature branch + clean baseline |
@@ -25,17 +26,18 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
25
26
  | 5 | `verify-work` — UAT + mechanical gates |
26
27
  | 6 | `audit-code` — **non-optional gate** (pass/fail; fail → loop back to step 4) |
27
28
  | 7 | `commit-message` — Conventional Commits draft |
28
- | 8 | `release-branch` — PR or solo land (supports `--squash-state`) |
29
+ | 8 | `release-branch` — PR or solo land (supports `--squash-state`) | |
29
30
 
30
31
  ## Process
31
32
 
32
33
  1. Read `specs/state.yaml`, `specs/execution-status.yaml`, `specs/release-plan.yaml`, active `specs/epics/eNN-slug/epic.yaml`.
33
- 2. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate — require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
34
- 3. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
35
- 3. If `epic_cycle.step` missing, set to `1`.
36
- 4. Run **only the current step** (resume mode) unless user asked for full auto-run.
37
- 5. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
38
- 6. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.
34
+ 2. **Step 0 Threat Model:** Run `security-review` against the epic's scope (read from the epic capsule). Output `specs/security/epics/<epic-id>/THREAT_MODEL.md` with surface area, vulnerability categories, risk level, and mitigation guidance.
35
+ 3. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate: require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
36
+ 4. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
37
+ 5. If `epic_cycle.step` missing, set to `1`.
38
+ 6. Run **only the current step** (resume mode) unless user asked for full auto-run.
39
+ 7. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
40
+ 8. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.
39
41
 
40
42
  ### Step 6 — audit-code gate (non-optional)
41
43
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "bigpowers",
3
- "version": "2.34.2",
3
+ "version": "2.36.0",
4
4
  "description": "70 agent skills for spec-driven, test-first software development by solo developers",
5
5
  "main": "index.js",
6
6
  "scripts": {
@@ -0,0 +1,85 @@
1
+ # Confidence Scoring Rubric
2
+
3
+ Every finding that survives Phase 4 false-positive filtering receives a confidence
4
+ score from 1 (speculative) to 10 (certain). Only findings ≥ 8 are reported.
5
+
6
+ ## Score 9–10: Certain Exploit Path
7
+
8
+ **Criteria:**
9
+ - Concrete, testable exploit with clear reproduction steps
10
+ - No assumptions about uncommon configurations
11
+ - No chain of multiple unlikely conditions
12
+ - Attacker has full control over the input vector
13
+
14
+ **Examples:**
15
+ - User-supplied SQL in a `SELECT` statement with no parameterization
16
+ - `os.system(f"rm {user_path}")` where user controls the path
17
+ - Pickle deserialization of user-supplied data without any wrapping
18
+
19
+ **Severity:** HIGH
20
+
21
+ ## Score 8: Clear Vulnerability Pattern
22
+
23
+ **Criteria:**
24
+ - Well-known vulnerability pattern with standard exploitation method
25
+ - Requires specific conditions but conditions are commonly met
26
+ - Exploitability is well-documented in OWASP / CVE databases
27
+
28
+ **Examples:**
29
+ - JWT without signature verification in authentication middleware
30
+ - SSRF where attacker controls the full URL including host
31
+ - Hardcoded AWS secret key in source code
32
+
33
+ **Severity:** HIGH or MEDIUM
34
+
35
+ ## Score 7: Suspicious Pattern
36
+
37
+ **Criteria:**
38
+ - Unusual code that may indicate a vulnerability
39
+ - Requires specific conditions that may not be present
40
+ - Alternative secure interpretation is equally likely
41
+ - Defense-in-depth concern rather than direct exploit
42
+
43
+ **Examples:**
44
+ - A function accepting user input that passes through multiple layers before reaching a sink (unclear if sanitized)
45
+ - Custom encryption implementation (likely weak, but may not process sensitive data)
46
+ - Path construction that looks safe but has a subtle bypass
47
+
48
+ **Severity:** LOW or suppress
49
+
50
+ ## Score < 7: Do Not Report
51
+
52
+ **Criteria:**
53
+ - Theoretical concern without exploit path
54
+ - Requires unrealistic attacker capabilities
55
+ - Violates one or more hard exclusion rules
56
+ - Better handled by separate tooling (dependency scanner, SAST, secret scanner)
57
+ - Purely stylistic or best-practice concern without security impact
58
+
59
+ **Examples:**
60
+ - "This function doesn't validate all inputs" without proving the unvalidated input is the attack surface
61
+ - "This uses MD5" where the hash is not used for security (e.g., cache key)
62
+ - "This function could consume too much memory" (DOS exclusion)
63
+
64
+ **Action:** Suppress entirely. Do not include in report.
65
+
66
+ ## Severity Mapping
67
+
68
+ Once confidence ≥ 8 is confirmed, map to severity:
69
+
70
+ | Severity | Impact | Examples |
71
+ |----------|--------|---------|
72
+ | **CRITICAL** | Remote compromise, full data breach | RCE, auth bypass with admin escalation, SQLi with data exfiltration |
73
+ | **HIGH** | Significant security boundary crossed | SSRF to internal services, hardcoded cloud credentials, insecure deserialization |
74
+ | **MEDIUM** | Limited impact or requires conditions | Stored XSS behind auth, IDOR on non-sensitive data, weak but not broken crypto |
75
+ | **LOW** | Defense-in-depth, minimal blast radius | Missing security header, verbose error messages in non-production |
76
+
77
+ ## Quality Gate
78
+
79
+ The confidence rubric double-checks each finding against three lenses:
80
+
81
+ | Lens | Question |
82
+ |------|----------|
83
+ | **Exploitability** | Can a real attacker trigger this from a trust boundary? |
84
+ | **Actionability** | Would a security engineer accept a fix recommendation for this? |
85
+ | **Precedent** | Has this type of finding passed/failed human review before? |
@@ -0,0 +1,68 @@
1
+ # False-Positive Exclusion Rules
2
+
3
+ Applied during Phase 4 of the scan. Findings matching any hard exclusion are
4
+ automatically suppressed. Precedents from prior reviews guide borderline cases.
5
+
6
+ ## Hard Exclusions
7
+
8
+ Automatically exclude findings matching these patterns:
9
+
10
+ | # | Rule | Rationale |
11
+ |---|------|-----------|
12
+ | 1 | **Denial of Service (DOS)** — resource exhaustion, CPU/memory attacks | Handled separately; not actionable in code review |
13
+ | 2 | **Secrets on disk** if otherwise secured | Secrets management is a separate concern |
14
+ | 3 | **Rate limiting** concerns | Operational, not a code vulnerability |
15
+ | 4 | **Memory consumption / CPU exhaustion** | Not actionable in diff review |
16
+ | 5 | **Input validation on non-security-critical fields** without proven exploit path | Theoretical, not concrete |
17
+ | 6 | **GitHub Actions input sanitization** unless clearly triggerable via untrusted input | Most workflow vulns are not exploitable |
18
+ | 7 | **Lack of hardening measures** | Code is not expected to implement all best practices |
19
+ | 8 | **Race conditions / timing attacks** that are theoretical | Only report if concretely problematic |
20
+ | 9 | **Outdated third-party libraries** | Managed separately by dependency scanners |
21
+ | 10 | **Memory safety** in Rust or other memory-safe languages | Ruled out by language guarantees outside `unsafe` code |
22
+ | 11 | **Unit test files only** | Not production risk |
23
+ | 12 | **Log spoofing** | Outputting unsanitized input to logs is not a vuln |
24
+ | 13 | **SSRF that only controls path** | Only host/protocol control is exploitable |
25
+ | 14 | **User-controlled content in AI system prompts** | Not a security vulnerability |
26
+ | 15 | **Regex injection** | Injecting untrusted content into regex is not a vuln |
27
+ | 16 | **Regex DOS** | Excluded alongside general DOS |
28
+ | 17 | **Documentation files** (.md, .txt) | Insecure docs are not code vulnerabilities |
29
+ | 18 | **Lack of audit logs** | Not a vulnerability |
30
+
31
+ ## Precedent Rules
32
+
33
+ These guide borderline cases based on prior human review decisions:
34
+
35
+ | # | Precedent | Reasoning |
36
+ |---|-----------|-----------|
37
+ | 1 | **Logging high-value secrets in plaintext IS a vuln.** Logging URLs is safe. | Secrets in logs = credential exposure; URLs are not secrets |
38
+ | 2 | **UUIDs are unguessable** — no validation needed | Cryptographic property of UUID v4/v7 |
39
+ | 3 | **Environment variables and CLI flags are trusted values** | Attackers cannot modify these in secure environments |
40
+ | 4 | **Resource management issues** (memory leaks, fd leaks) are NOT valid | Operational, not security |
41
+ | 5 | **Tabnabbing, XS-Leaks, prototype pollution, open redirects** — do NOT report unless extremely high confidence | Subtle, low-impact, high false-positive rate |
42
+ | 6 | **React/Angular XSS** — safe unless `dangerouslySetInnerHTML`, `bypassSecurityTrustHtml`, etc. | Framework auto-escapes |
43
+ | 7 | **GitHub Action workflow vulns** — verify concrete attack path before reporting | Most are theoretical |
44
+ | 8 | **Client-side JS/TS auth checks** — not a vuln; server is authoritative | Client code is untrusted |
45
+ | 9 | **IPython notebook vulns** — only report if concrete untrusted-input trigger | Most are not exploitable |
46
+ | 10 | **Logging non-PII data** — not a vuln even if sensitive. Only PII/secrets/passwords. | Intent: operational logging vs credential exposure |
47
+ | 11 | **Shell script command injection** — only report if concrete untrusted-input path | Most shell scripts don't process untrusted input |
48
+
49
+ ## Confidence Scoring
50
+
51
+ Findings that survive exclusions get a confidence score (1–10):
52
+
53
+ | Range | Meaning | Action |
54
+ |-------|---------|--------|
55
+ | 9–10 | Certain exploit path, testable | Report as HIGH |
56
+ | 8 | Clear vulnerability pattern | Report as HIGH/MEDIUM |
57
+ | 7 | Suspicious, needs conditions | Report as LOW or suppress |
58
+ | <7 | Too speculative | **Do not report** |
59
+
60
+ **Hard threshold:** Only report findings with confidence ≥ 8.
61
+
62
+ ## Signal Quality Criteria
63
+
64
+ For remaining findings, assess:
65
+ 1. Is there a concrete, exploitable vulnerability with a clear attack path?
66
+ 2. Does this represent a real security risk (vs theoretical best practice)?
67
+ 3. Are there specific code locations and reproduction steps?
68
+ 4. Would this finding be actionable for a security team?
@@ -0,0 +1,103 @@
1
+ # Vulnerability Categories — Detection Guidance
2
+
3
+ Each category: vulnerable pattern → safe pattern → code example.
4
+
5
+ ## SQL Injection
6
+
7
+ | Aspect | Detail |
8
+ |--------|--------|
9
+ | **Vulnerable** | String interpolation in SQL queries: `f"SELECT * FROM users WHERE id = {uid}"` |
10
+ | **Safe** | Parameterized queries / ORM: `cursor.execute("SELECT * FROM users WHERE id = %s", (uid,))` |
11
+ | **Look for** | f-strings, `+` concatenation, `format()` in query builders; raw SQL in ORM `.raw()` / `.execute()` |
12
+ | **False-positive guard** | Not a FP if the input is user-controlled (HTTP param, file). Env vars and CLI flags are trusted (see exclusion rules). |
13
+
14
+ ## Cross-Site Scripting (XSS)
15
+
16
+ | Aspect | Detail |
17
+ |--------|--------|
18
+ | **Vulnerable** | `element.innerHTML = userInput`, `dangerouslySetInnerHTML={{__html: userInput}}` |
19
+ | **Safe** | `element.textContent = userInput`, React JSX (auto-escaped), template engines with auto-escaping |
20
+ | **Look for** | `.innerHTML`, `document.write()`, `dangerouslySetInnerHTML`, `v-html` (Vue), `bypassSecurityTrustHtml` (Angular) |
21
+ | **False-positive guard** | React/Angular components without unsafe methods are NOT vulnerable (see exclusion rules). |
22
+
23
+ ## Server-Side Request Forgery (SSRF)
24
+
25
+ | Aspect | Detail |
26
+ |--------|--------|
27
+ | **Vulnerable** | User-controlled URL passed to server-side HTTP client: `requests.get(user_url)` |
28
+ | **Safe** | URL allowlist validation, internal-network blocking, protocol/host restriction |
29
+ | **Look for** | User input → `fetch`, `requests.get`, `axios.get`, `urllib`, `curl`, `http.get`; host control only (path-only is excluded) |
30
+
31
+ ## Command Injection
32
+
33
+ | Aspect | Detail |
34
+ |--------|--------|
35
+ | **Vulnerable** | User input in shell commands: `os.system(f"ping {host}")`, `subprocess.run(f"grep {pattern} file", shell=True)` |
36
+ | **Safe** | `subprocess.run(["ping", host])` with arguments as list; `shlex.quote()` |
37
+ | **Look for** | `shell=True`, `os.system`, `os.popen`, `exec()`, `eval()`, `$()`, backticks |
38
+ | **False-positive guard** | Shell scripts without untrusted user input are generally not exploitable. |
39
+
40
+ ## Authentication/Authorization Bypass
41
+
42
+ | Aspect | Detail |
43
+ |--------|--------|
44
+ | **Vulnerable** | Missing auth check on protected endpoint; JWT without signature verification; hardcoded admin tokens |
45
+ | **Safe** | Consistent auth middleware; JWT with `RS256`/`HS256` verification; role-based access control |
46
+ | **Look for** | Routes without auth decorators; `@login_required` / `@require_auth` missing; JWT without `.verify()`; client-side auth checks only |
47
+
48
+ ## Unsafe Deserialization
49
+
50
+ | Aspect | Detail |
51
+ |--------|--------|
52
+ | **Vulnerable** | `pickle.load(user_data)`, `yaml.load(user_input)`, `JSON.parse()` on untrusted tokens, `eval(input())` |
53
+ | **Safe** | `yaml.safe_load()`, `json.loads()` (safe for JSON), `torch.load(weights_only=True)` (PyTorch), schema validation |
54
+ | **Look for** | `pickle.load`, `yaml.load` (not safe_load), `torch.load(weights_only=False)`, `eval`, `marshal.load`, `node-serialize` |
55
+
56
+ ## Path Traversal
57
+
58
+ | Aspect | Detail |
59
+ |--------|--------|
60
+ | **Vulnerable** | User input in file paths: `open(f"/data/{filename}")`, `path.join(base, user_path)` |
61
+ | **Safe** | Path normalization + prefix check: `os.path.realpath(path).startswith(BASE_DIR + os.sep)`; allowlist of valid filenames |
62
+ | **Look for** | `open()`, `read_file()`, `os.path.join` with user input; `../` traversal without normalization |
63
+
64
+ ## Insecure Direct Object Reference (IDOR)
65
+
66
+ | Aspect | Detail |
67
+ |--------|--------|
68
+ | **Vulnerable** | API endpoint uses user-supplied ID without ownership check: `GET /api/order/{order_id}` — returns any user's order |
69
+ | **Safe** | Ownership verification: verify `order.user_id == current_user.id` before returning data |
70
+ | **Look for** | CRUD endpoints that accept IDs without authorization; horizontal/vertical privilege checks missing |
71
+
72
+ ## Weak Cryptography
73
+
74
+ | Aspect | Detail |
75
+ |--------|--------|
76
+ | **Vulnerable** | MD5/SHA1 for passwords; ECB mode; hardcoded keys; `random` module (not `secrets`); short key lengths |
77
+ | **Safe** | `bcrypt`/`argon2` for passwords; AES-GCM; `secrets` module; RSA 2048+; proper IV generation |
78
+ | **Look for** | `md5`, `sha1`, `DES`, `ECB`, `PKCS1_v1_5`, `random` for crypto, hardcoded `key=`, `Crypto.Cipher` without AEAD |
79
+
80
+ ## Secrets Exposure
81
+
82
+ | Aspect | Detail |
83
+ |--------|--------|
84
+ | **Vulnerable** | Hardcoded API keys, passwords, tokens in source code; secrets in logs; secrets in client-side code |
85
+ | **Safe** | Environment variables; secret manager (AWS Secrets Manager, HashiCorp Vault); `.env` excluded from VCS |
86
+ | **Look for** | `API_KEY=`, `password=`, `secret=`, `token=` in code; AWS keys, GitHub tokens, Stripe keys, JWTs in source |
87
+ | **False-positive guard** | Secrets stored on disk but otherwise secured ARE excluded. Logging high-value secrets IS a vuln. Logging URLs is safe. |
88
+
89
+ ## Template Injection (SSTI)
90
+
91
+ | Aspect | Detail |
92
+ |--------|--------|
93
+ | **Vulnerable** | User input in template rendering: `Template(user_input).render()`, `render_template_string(user_input)` |
94
+ | **Safe** | Static templates; input passed as context variable, not template string |
95
+ | **Look for** | `render_template_string`, `Template()` with user string; `eval` in template context; `${user_input}` in JS template literals on server |
96
+
97
+ ## NoSQL Injection
98
+
99
+ | Aspect | Detail |
100
+ |--------|--------|
101
+ | **Vulnerable** | User input in MongoDB queries: `db.users.find({username: user_input})` where input is `{"$gt": ""}` |
102
+ | **Safe** | Schema validation; type checking on query params; ORM sanitization |
103
+ | **Look for** | MongoDB `$where`, `$gt`, `$regex` from user input; raw mongo queries without type coercion |
@@ -0,0 +1,63 @@
1
+ ---
2
+ name: security-review
3
+ description: >
4
+ AI-powered security analysis of code changes — traces data flow, detects
5
+ injection, auth bypass, secrets exposure, and unsafe deserialization across
6
+ files. Use when reviewing pending changes, before release-branch, during
7
+ verify-work Phase 5, during build-epic Step 0 threat modeling, or when
8
+ the user says "security review" or "scan for vulns".
9
+ ---
10
+
11
+ # Security Review
12
+
13
+ > **HARD GATE** — Requires git context (branch with merge-base or diff). Never
14
+ > writes files outside `specs/security/`. Findings below confidence 8/10 are
15
+ > suppressed. **→ verify:** `git rev-parse HEAD >/dev/null 2>&1 && echo "ok" || echo "BLOCKED"`
16
+
17
+ ## 5-phase scan
18
+
19
+ | # | Phase | What |
20
+ |---|-------|------|
21
+ | 1 | **Scope Resolution** | Detect diff via `git diff --merge-base origin/HEAD`; resolve languages/frameworks from dependency files |
22
+ | 2 | **Context Research** | Identify existing security patterns, sanitization, auth model in the codebase |
23
+ | 3 | **Vulnerability Assessment** | Trace user input → sink; check auth boundaries, crypto, deserialization, path ops |
24
+ | 4 | **False-Positive Filtering** | Cross-check each finding against exclusion rules; reject confidence < 8 |
25
+ | 5 | **Report Generation** | Output structured markdown: file:line, severity, category, exploit scenario, fix |
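
Phase 4's confidence cut-off can be sketched as below. The `Finding` shape and the `filter_findings` name are assumptions made for illustration; only the threshold of 8 comes from the skill text, and the exclusion-rule cross-check is not modelled here.

```python
from dataclasses import dataclass
from typing import List

# From the skill text: findings below confidence 8/10 are suppressed.
CONFIDENCE_THRESHOLD = 8


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; field names are illustrative."""
    location: str    # "file:line"
    severity: str    # e.g. "HIGH"
    category: str    # e.g. "SQLi"
    confidence: int  # 0-10 per the confidence rubric


def filter_findings(findings: List[Finding],
                    threshold: int = CONFIDENCE_THRESHOLD) -> List[Finding]:
    """Keep only findings whose confidence meets the threshold."""
    return [f for f in findings if f.confidence >= threshold]


if __name__ == "__main__":
    raw = [
        Finding("app/db.py:42", "HIGH", "SQLi", 9),
        Finding("app/views.py:10", "MEDIUM", "XSS", 6),
    ]
    for kept in filter_findings(raw):
        print(kept.location, kept.severity, kept.category)
```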
26
+
27
+ ## Categories
28
+
29
+ Covered: SQLi, XSS, SSRF, command injection, auth bypass, unsafe deserialization, path traversal, IDOR, crypto flaws, secrets exposure, template injection, NoSQLi
30
+
31
+ ## Integration points
32
+
33
+ | Skill | Touchpoint |
34
+ |-------|------------|
35
+ | `build-epic` | Step 0 — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
36
+ | `plan-work` | `security:` field (none/low/medium/high) on story tasks |
37
+ | `plan-release` | +2 WSJF risk boost for HIGH+ risk epics |
38
+ | `audit-code` | Checklist: "diff scanned — no unaddressed HIGH findings" |
39
+ | `request-review` | Inject threat model categories + false-positive rules into reviewer prompt |
40
+ | `investigate-bug` | Security-impact assessment in RCA (NONE→CRITICAL) |
41
+ | `validate-fix` | Recurrence hardening check for security bugs |
42
+ | `verify-work` | Phase 5 — blocks on HIGH findings ≥ 8 confidence |
43
+ | `release-branch` | Hard gate — blocks merge if unresolved HIGH findings |
44
+
45
+ ## Report format
46
+
47
+ Each finding: **`File:Line` — Severity — Category**
48
+ - Description: how the vulnerability manifests
49
+ - Exploit scenario: concrete attack path
50
+ - Recommendation: fix with code example
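
The report layout above could be rendered along these lines. `format_finding` and its parameter names are hypothetical; only the heading pattern and the three bullet labels are taken from the format described.

```python
def format_finding(location: str, severity: str, category: str,
                   description: str, exploit: str, recommendation: str) -> str:
    """Render one finding as a bold heading line followed by three bullets."""
    # Heading mirrors the documented pattern: `File:Line` — Severity — Category.
    lines = [
        f"**`{location}` — {severity} — {category}**",
        f"- Description: {description}",
        f"- Exploit scenario: {exploit}",
        f"- Recommendation: {recommendation}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_finding(
        "app/db.py:42", "HIGH", "SQLi",
        "query built by string concatenation",
        "attacker supplies a quote character to alter the WHERE clause",
        "use a parameterised query",
    ))
```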
51
+
52
+ ## Reference files
53
+
54
+ - [Vuln categories](REFERENCE-vuln-categories.md) — detection guidance per vuln type
55
+ - [False positives](REFERENCE-false-positives.md) — hard exclusions + precedent
56
+ - [Confidence rubric](REFERENCE-confidence-rubric.md) — scoring methodology (0–10)
57
+
58
+ ## Verify
59
+
60
+ ```bash
61
+ test -d specs/security && echo "OK: specs/security/ exists" || mkdir -p specs/security
62
+ grep -q "Merge-base\|merge.base\|git diff" SKILL.md && echo "OK: git context verified"
63
+ ```
package/skills-lock.json CHANGED
@@ -23,7 +23,7 @@
23
23
  },
24
24
  "build-epic": {
25
25
  "description": "Eight-step epic build cycle — reads state.yaml, execution-status.yaml, and one epic capsule; updates status via bp-yaml-set or direct edit. Resume mode runs one step per invocation. Use instead of ad-hoc execute-plan for release work.",
26
- "sha256": "7a376ef092fde9cc",
26
+ "sha256": "565d8396889dd9c9",
27
27
  "path": "build-epic/SKILL.md"
28
28
  },
29
29
  "change-request": {
@@ -256,6 +256,11 @@
256
256
  "sha256": "34df830694a6459c",
257
257
  "path": "search-skills/SKILL.md"
258
258
  },
259
+ "security-review": {
260
+ "description": "> AI-powered security analysis of code changes — traces data flow, detects injection, auth bypass, secrets exposure, and unsafe deserialization across files. Use when reviewing pending changes, before release-branch, during verify-work Phase 5, during build-epic Step 0 threat modeling, or when the user says \"security review\" or \"scan for vulns\".",
261
+ "sha256": "24aeeed072282d0c",
262
+ "path": "security-review/SKILL.md"
263
+ },
259
264
  "seed-conventions": {
260
265
  "description": "Generate CLAUDE.md and CONVENTIONS.md for a brand-new project through a brief interview, and create the specs/ directory with evolved bigpowers structure (product/, tech-architecture/, verifications/, epics/archive/). Entry point for greenfield projects. Use when starting a new project from scratch, when user asks to set up AI agent conventions, or when there is no CLAUDE.md yet.",
261
266
  "sha256": "cd3a7fc52d1b0035",