bigpowers 2.34.2 → 2.36.0
- package/.pi/package.json +2 -2
- package/.pi/prompts/build-epic.md +10 -8
- package/.pi/prompts/security-review.md +323 -0
- package/.pi/skills/build-epic/SKILL.md +10 -8
- package/.pi/skills/security-review/SKILL.md +324 -0
- package/CHANGELOG.md +14 -0
- package/SKILL-INDEX.md +2 -2
- package/build-epic/SKILL.md +10 -8
- package/package.json +1 -1
- package/security-review/REFERENCE-confidence-rubric.md +85 -0
- package/security-review/REFERENCE-false-positives.md +68 -0
- package/security-review/REFERENCE-vuln-categories.md +103 -0
- package/security-review/SKILL.md +63 -0
- package/skills-lock.json +6 -1
package/.pi/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "name": "bigpowers",
-"version": "2.
-"description": "
+"version": "2.36.0",
+"description": "71 skills — 70 agent skills for spec-driven, test-first software development by solo developers",
 "keywords": [
 "pi-package"
 ],
package/.pi/prompts/build-epic.md
CHANGED
@@ -13,10 +13,11 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
 >
 > **HARD GATE** — Not on `main`/`master` before step 3 (kickoff-branch).

-##
+## Nine steps (`epic_cycle` in state.yaml)

 | Step | Skill / action |
 |------|----------------|
+| 0 | `security-review` — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
 | 1 | `survey-context` — confirm epic + story |
 | 2 | `plan-work` — flesh out story `tasks[]` in `specs/epics/eNN-slug/epic.yaml` |
 | 3 | `kickoff-branch` — feature branch + clean baseline |
@@ -24,17 +25,18 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
 | 5 | `verify-work` — UAT + mechanical gates |
 | 6 | `audit-code` — **non-optional gate** (pass/fail; fail → loop back to step 4) |
 | 7 | `commit-message` — Conventional Commits draft |
-| 8 | `release-branch` — PR or solo land (supports `--squash-state`) |
+| 8 | `release-branch` — PR or solo land (supports `--squash-state`) | |

 ## Process

 1. Read `specs/state.yaml`, `specs/execution-status.yaml`, `specs/release-plan.yaml`, active `specs/epics/eNN-slug/epic.yaml`.
-2. **
-3. **
-
-
-
-
+2. **Step 0 — Threat Model:** Run `security-review` against the epic's scope (read from the epic capsule). Output `specs/security/epics/<epic-id>/THREAT_MODEL.md` with surface area, vulnerability categories, risk level, and mitigation guidance.
+3. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate — require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
+4. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
+5. If `epic_cycle.step` missing, set to `1`.
+6. Run **only the current step** (resume mode) unless user asked for full auto-run.
+7. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
+8. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.

 ### Step 6 — audit-code gate (non-optional)

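The resume logic added to the Process list in this hunk (default a missing `epic_cycle.step` to 1, advance only after the step's verify passes, gate on a risk score above 7) can be sketched roughly as follows. This assumes `state.yaml` has already been parsed into a plain dict; the function names `advance_epic_cycle` and `needs_grill_me` are illustrative and not part of bigpowers.

```python
def advance_epic_cycle(state: dict, verify_passed: bool) -> dict:
    """Return a copy of state with epic_cycle.step defaulted and, if verified, advanced."""
    cycle = dict(state.get("epic_cycle") or {})
    # Process item 5: a missing step defaults to 1.
    cycle.setdefault("step", 1)
    # Process item 7: increment only after the current step's verify passes.
    if verify_passed:
        cycle["step"] += 1
    return {**state, "epic_cycle": cycle}


def needs_grill_me(risk_score: float) -> bool:
    """Process item 3: a risk score above 7 requires a `grill-me` session."""
    return risk_score > 7
```

The sketch returns a new dict rather than mutating its input, so a failed verify leaves the caller's state untouched.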
package/.pi/prompts/security-review.md
@@ -0,0 +1,323 @@
+---
+description: > AI-powered security analysis of code changes — traces data flow, detects injection, auth bypass, secrets exposure, and unsafe deserialization across files. Use when reviewing pending changes, before release-branch, during verify-work Phase 5, during build-epic Step 0 threat modeling, or when the user says "security review" or "scan for vulns".
+---
+
+
+# Security Review
+
+> **HARD GATE** — Requires git context (branch with merge-base or diff). Never
+> writes files outside `specs/security/`. Findings below confidence 8/10 are
+> suppressed. **→ verify:** `git rev-parse HEAD >/dev/null 2>&1 && echo "ok" || echo "BLOCKED"`
+
+## 5-phase scan
+
+| # | Phase | What |
+|---|-------|------|
+| 1 | **Scope Resolution** | Detect diff via `git diff --merge-base origin/HEAD`; resolve languages/frameworks from dependency files |
+| 2 | **Context Research** | Identify existing security patterns, sanitization, auth model in the codebase |
+| 3 | **Vulnerability Assessment** | Trace user input → sink; check auth boundaries, crypto, deserialization, path ops |
+| 4 | **False-Positive Filtering** | Cross-check each finding against exclusion rules; reject confidence < 8 |
+| 5 | **Report Generation** | Output structured markdown: file:line, severity, category, exploit scenario, fix |
+
+## Categories
+
+Covered: SQLi, XSS, SSRF, command injection, auth bypass, unsafe deserialization, path traversal, IDOR, crypto flaws, secrets exposure, template injection, NoSQLi
+
+## Integration points
+
+| Skill | Touchpoint |
+|-------|------------|
+| `build-epic` | Step 0 — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
+| `plan-work` | `security:` field (none/low/medium/high) on story tasks |
+| `plan-release` | +2 WSJF risk boost for HIGH+ risk epics |
+| `audit-code` | Checklist: "diff scanned — no unaddressed HIGH findings" |
+| `request-review` | Inject threat model categories + false-positive rules into reviewer prompt |
+| `investigate-bug` | Security-impact assessment in RCA (NONE→CRITICAL) |
+| `validate-fix` | Recurrence hardening check for security bugs |
+| `verify-work` | Phase 5 — blocks on HIGH findings ≥ 8 confidence |
+| `release-branch` | Hard gate — blocks merge if unresolved HIGH findings |
+
+## Report format
+
+Each finding: **`File:Line` — Severity — Category**
+- Description: how the vulnerability manifests
+- Exploit scenario: concrete attack path
+- Recommendation: fix with code example
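To make the report layout concrete, here is a minimal sketch of a renderer for a single finding. The `Finding` dataclass and `render_finding` function are hypothetical names chosen for illustration; the added file only specifies the header line and the three bullets.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    file: str
    line: int
    severity: str
    category: str
    description: str
    exploit_scenario: str
    recommendation: str


def render_finding(f: Finding) -> str:
    """Render one finding as the bold header line followed by three bullets."""
    dash = "\u2014"  # the separator used in the header format above
    header = f"**`{f.file}:{f.line}` {dash} {f.severity} {dash} {f.category}**"
    return "\n".join([
        header,
        f"- Description: {f.description}",
        f"- Exploit scenario: {f.exploit_scenario}",
        f"- Recommendation: {f.recommendation}",
    ])
```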
+
+## Reference files
+
+- [Vuln categories](REFERENCE-vuln-categories.md) — detection guidance per vuln type
+- [False positives](REFERENCE-false-positives.md) — hard exclusions + precedent
+- [Confidence rubric](REFERENCE-confidence-rubric.md) — scoring methodology (0–10)
+
+## Verify
+
+```bash
+test -d specs/security && echo "OK: specs/security/ exists" || mkdir -p specs/security
+grep -q "Merge-base\|merge.base\|git diff" SKILL.md && echo "OK: git context verified"
+```
+
+---
+
+# Confidence Scoring Rubric
+
+Every finding that survives Phase 4 false-positive filtering receives a confidence
+score from 1 (speculative) to 10 (certain). Only findings ≥ 8 are reported.
+
+## Score 9–10: Certain Exploit Path
+
+**Criteria:**
+- Concrete, testable exploit with clear reproduction steps
+- No assumptions about uncommon configurations
+- No chain of multiple unlikely conditions
+- Attacker has full control over the input vector
+
+**Examples:**
+- User-supplied SQL in a `SELECT` statement with no parameterization
+- `os.system(f"rm {user_path}")` where user controls the path
+- Pickle deserialization of user-supplied data without any wrapping
+
+**Severity:** HIGH
+
+## Score 8: Clear Vulnerability Pattern
+
+**Criteria:**
+- Well-known vulnerability pattern with standard exploitation method
+- Requires specific conditions but conditions are commonly met
+- Exploitability is well-documented in OWASP / CVE databases
+
+**Examples:**
+- JWT without signature verification in authentication middleware
+- SSRF where attacker controls the full URL including host
+- Hardcoded AWS secret key in source code
+
+**Severity:** HIGH or MEDIUM
+
+## Score 7: Suspicious Pattern
+
+**Criteria:**
+- Unusual code that may indicate a vulnerability
+- Requires specific conditions that may not be present
+- Alternative secure interpretation is equally likely
+- Defense-in-depth concern rather than direct exploit
+
+**Examples:**
+- A function accepting user input that passes through multiple layers before reaching a sink (unclear if sanitized)
+- Custom encryption implementation (likely weak, but may not process sensitive data)
+- Path construction that looks safe but has a subtle bypass
+
+**Severity:** LOW or suppress
+
+## Score < 7: Do Not Report
+
+**Criteria:**
+- Theoretical concern without exploit path
+- Requires unrealistic attacker capabilities
+- Violates one or more hard exclusion rules
+- Better handled by separate tooling (dependency scanner, SAST, secret scanner)
+- Purely stylistic or best-practice concern without security impact
+
+**Examples:**
+- "This function doesn't validate all inputs" without proving the validated input is the attack surface
+- "This uses MD5" where the hash is not used for security (e.g., cache key)
+- "This function could consume too much memory" (DOS exclusion)
+
+**Action:** Suppress entirely. Do not include in report.
+
+## Severity Mapping
+
+Once confidence ≥ 8 is confirmed, map to severity:
+
+| Severity | Impact | Examples |
+|----------|--------|---------|
+| **CRITICAL** | Remote compromise, full data breach | RCE, auth bypass with admin escalation, SQLi with data exfiltration |
+| **HIGH** | Significant security boundary crossed | SSRF to internal services, hardcoded cloud credentials, insecure deserialization |
+| **MEDIUM** | Limited impact or requires conditions | Stored XSS behind auth, IDOR on non-sensitive data, weak but not broken crypto |
+| **LOW** | Defense-in-depth, minimal blast radius | Missing security header, verbose error messages in non-production |
+
+## Quality Gate
+
+The confidence rubric double-checks each finding against three lenses:
+
+| Lens | Question |
+|------|----------|
+| **Exploitability** | Can a real attacker trigger this from a trust boundary? |
+| **Actionability** | Would a security engineer accept a fix recommendation for this? |
+| **Precedent** | Has this type of finding passed/failed human review before? |
+
+---
+
+# False-Positive Exclusion Rules
+
+Applied during Phase 4 of the scan. Findings matching any hard exclusion are
+automatically suppressed. Precedents from prior reviews guide borderline cases.
+
+## Hard Exclusions
+
+Automatically exclude findings matching these patterns:
+
+| # | Rule | Rationale |
+|---|------|-----------|
+| 1 | **Denial of Service (DOS)** — resource exhaustion, CPU/memory attacks | Handled separately; not actionable in code review |
+| 2 | **Secrets on disk** if otherwise secured | Secrets management is a separate concern |
+| 3 | **Rate limiting** concerns | Operational, not a code vulnerability |
+| 4 | **Memory consumption / CPU exhaustion** | Not actionable in diff review |
+| 5 | **Input validation on non-security-critical fields** without proven exploit path | Theoretical, not concrete |
+| 6 | **GitHub Actions input sanitization** unless clearly triggerable via untrusted input | Most workflow vulns are not exploitable |
+| 7 | **Lack of hardening measures** | Code is not expected to implement all best practices |
+| 8 | **Race conditions / timing attacks** that are theoretical | Only report if concretely problematic |
+| 9 | **Outdated third-party libraries** | Managed separately by dependency scanners |
+| 10 | **Memory safety** in Rust or other memory-safe languages | Impossible by language guarantees |
+| 11 | **Unit test files only** | Not production risk |
+| 12 | **Log spoofing** | Outputting unsanitized input to logs is not a vuln |
+| 13 | **SSRF that only controls path** | Only host/protocol control is exploitable |
+| 14 | **User-controlled content in AI system prompts** | Not a security vulnerability |
+| 15 | **Regex injection** | Injecting untrusted content into regex is not a vuln |
+| 16 | **Regex DOS** | Excluded alongside general DOS |
+| 17 | **Documentation files** (.md, .txt) | Insecure docs are not code vulnerabilities |
+| 18 | **Lack of audit logs** | Not a vulnerability |
+
+## Precedent Rules
+
+These guide borderline cases based on prior human review decisions:
+
+| # | Precedent | Reasoning |
+|---|-----------|-----------|
+| 1 | **Logging high-value secrets in plaintext IS a vuln.** Logging URLs is safe. | Secrets in logs = credential exposure; URLs are not secrets |
+| 2 | **UUIDs are unguessable** — no validation needed | Cryptographic property of UUID v4/v7 |
+| 3 | **Environment variables and CLI flags are trusted values** | Attackers cannot modify these in secure environments |
+| 4 | **Resource management issues** (memory leaks, fd leaks) are NOT valid | Operational, not security |
+| 5 | **Tabnabbing, XS-Leaks, prototype pollution, open redirects** — do NOT report unless extremely high confidence | Subtle, low-impact, high false-positive rate |
+| 6 | **React/Angular XSS** — safe unless `dangerouslySetInnerHTML`, `bypassSecurityTrustHtml`, etc. | Framework auto-escapes |
+| 7 | **GitHub Action workflow vulns** — verify concrete attack path before reporting | Most are theoretical |
+| 8 | **Client-side JS/TS auth checks** — not a vuln; server is authoritative | Client code is untrusted |
+| 9 | **IPython notebook vulns** — only report if concrete untrusted-input trigger | Most are not exploitable |
+| 10 | **Logging non-PII data** — not a vuln even if sensitive. Only PII/secrets/passwords. | Intent: operational logging vs credential exposure |
+| 11 | **Shell script command injection** — only report if concrete untrusted-input path | Most shell scripts don't process untrusted input |
+
+## Confidence Scoring
+
+Findings that survive exclusions get a confidence score (1–10):
+
+| Range | Meaning | Action |
+|-------|---------|--------|
+| 9–10 | Certain exploit path, testable | Report as HIGH |
+| 8 | Clear vulnerability pattern | Report as HIGH/MEDIUM |
+| 7 | Suspicious, needs conditions | Report as LOW or suppress |
+| <7 | Too speculative | **Do not report** |
+
+**Hard threshold:** Only report findings with confidence ≥ 8.
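The range table and the hard threshold can be read together as a small triage function. The sketch below is one possible reading, not the package's implementation: the table allows a score of 7 to be reported as LOW, while the hard threshold says only scores of 8 or more are reported, and this sketch assumes the hard threshold wins. The names `REPORT_THRESHOLD` and `triage` are illustrative.

```python
REPORT_THRESHOLD = 8  # the hard threshold stated above


def triage(confidence: int) -> str:
    """Map a 1-10 confidence score to the action from the range table."""
    if not 1 <= confidence <= 10:
        raise ValueError("confidence must be between 1 and 10")
    if confidence < REPORT_THRESHOLD:
        # Assumption: the hard threshold overrides "LOW or suppress" for a score of 7.
        return "suppress"
    if confidence >= 9:
        return "report: HIGH"
    return "report: HIGH/MEDIUM"
```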
+
+## Signal Quality Criteria
+
+For remaining findings, assess:
+1. Is there a concrete, exploitable vulnerability with a clear attack path?
+2. Does this represent a real security risk (vs theoretical best practice)?
+3. Are there specific code locations and reproduction steps?
+4. Would this finding be actionable for a security team?
+
+---
+
+# Vulnerability Categories — Detection Guidance
+
+Each category: vulnerable pattern → safe pattern → code example.
+
+## SQL Injection
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | String interpolation in SQL queries: `f"SELECT * FROM users WHERE id = {uid}"` |
+| **Safe** | Parameterized queries / ORM: `cursor.execute("SELECT * FROM users WHERE id = %s", (uid,))` |
+| **Look for** | f-strings, `+` concatenation, `format()` in query builders; raw SQL in ORM `.raw()` / `.execute()` |
+| **False-positive guard** | Not a FP if the input is user-controlled (HTTP param, file, env var, CLI arg). Env vars are trusted (see exclusion rules). |
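The difference between the vulnerable and safe rows can be demonstrated with the standard-library `sqlite3` module. This is a self-contained sketch with a made-up `users` table; note that `sqlite3` uses `?` placeholders, whereas the `%s` form in the table belongs to drivers such as psycopg.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, uid):
    # The driver binds uid as data, so it is never parsed as SQL.
    return conn.execute("SELECT id, name FROM users WHERE id = ?", (uid,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])

payload = "1 OR 1=1"
# Vulnerable pattern: interpolation lets the payload rewrite the WHERE clause.
leaked = conn.execute(f"SELECT id, name FROM users WHERE id = {payload}").fetchall()
# Safe pattern: the same payload is compared as one value and matches no row.
safe = find_user(conn, payload)
```

With the interpolated query every row comes back; with the bound parameter the payload is just a string that equals no `id`.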
+
+## Cross-Site Scripting (XSS)
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | `element.innerHTML = userInput`, `dangerouslySetInnerHTML={{__html: userInput}}` |
+| **Safe** | `element.textContent = userInput`, React JSX (auto-escaped), template engines with auto-escaping |
+| **Look for** | `.innerHTML`, `document.write()`, `dangerouslySetInnerHTML`, `v-html` (Vue), `bypassSecurityTrustHtml` (Angular) |
+| **False-positive guard** | React/Angular components without unsafe methods are NOT vulnerable (see exclusion rules). |
+
+## Server-Side Request Forgery (SSRF)
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | User-controlled URL passed to server-side HTTP client: `requests.get(user_url)` |
+| **Safe** | URL allowlist validation, internal-network blocking, protocol/host restriction |
+| **Look for** | User input → `fetch`, `requests.get`, `axios.get`, `urllib`, `curl`, `http.get`; host control only (path-only is excluded) |
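As an illustration of the "URL allowlist validation" and "protocol/host restriction" entries, here is a minimal sketch using only `urllib.parse`. The allowlist contents and the function name `is_allowed_url` are assumptions for the example. It checks the parsed scheme and hostname only; it does not address redirects or DNS rebinding, which would need handling in the HTTP client itself.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist


def is_allowed_url(url: str) -> bool:
    """Accept only URLs whose scheme and host are both on the allowlist."""
    parts = urlsplit(url)
    # parts.hostname is the host after any userinfo, lowercased, without the port.
    return parts.scheme in ALLOWED_SCHEMES and parts.hostname in ALLOWED_HOSTS
```

Comparing the parsed `hostname` rather than searching the raw string is what rejects tricks such as `https://api.example.com@evil.test/`.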
+
+## Command Injection
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | User input in shell commands: `os.system(f"ping {host}")`, `subprocess.run(f"grep {pattern} file", shell=True)` |
+| **Safe** | `subprocess.run(["ping", host])` with arguments as list; `shlex.quote()` |
+| **Look for** | `shell=True`, `os.system`, `os.popen`, `exec()`, `eval()`, `$()`, backticks |
+| **False-positive guard** | Shell scripts without untrusted user input are generally not exploitable. |
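The "arguments as list" row can be shown without depending on `ping` being installed. In this sketch the current Python interpreter stands in for the external command (an assumption made purely for portability), and the helper name `echo_via_list` is illustrative.

```python
import shlex
import subprocess
import sys


def echo_via_list(value: str) -> str:
    """Pass value as one argv entry; no shell ever parses it."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# When a shell string is unavoidable, shlex.quote turns the value into a single POSIX-shell token.
quoted = shlex.quote("x; id")
```

A payload such as `example.com; echo INJECTED` comes back verbatim from `echo_via_list`, because the semicolon is just a character inside one argument.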
+
+## Authentication/Authorization Bypass
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | Missing auth check on protected endpoint; JWT without signature verification; hardcoded admin tokens |
+| **Safe** | Consistent auth middleware; JWT with `RS256`/`HS256` verification; role-based access control |
+| **Look for** | Routes without auth decorators; `@login_required` / `@require_auth` missing; JWT without `.verify()`; client-side auth checks only |
+
+## Unsafe Deserialization
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | `pickle.load(user_data)`, `yaml.load(user_input)`, `JSON.parse()` on untrusted tokens, `eval(input())` |
+| **Safe** | `yaml.safe_load()`, `json.loads()` (safe for JSON), `pickle.load(weights_only=True)` (PyTorch), schema validation |
+| **Look for** | `pickle.load`, `yaml.load` (not safe_load), `torch.load(weights_only=False)`, `eval`, `marshal.load`, `node-serialize` |
+
+## Path Traversal
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | User input in file paths: `open(f"/data/{filename}")`, `path.join(base, user_path)` |
+| **Safe** | Path normalization + prefix check: `os.path.realpath(path).startswith(BASE_DIR)`; allowlist of valid filenames |
+| **Look for** | `open()`, `read_file()`, `os.path.join` with user input; `../` traversal without normalization |
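A sketch of the "normalization + prefix check" row follows. One caveat with a plain `startswith` comparison is that a base of `/data` also prefixes a sibling such as `/database`, so this example swaps in `os.path.commonpath` for the containment test. `BASE_DIR` and `resolve_inside_base` are hypothetical names, and POSIX-style paths are assumed.

```python
import os

BASE_DIR = os.path.realpath("/data")  # hypothetical base directory


def resolve_inside_base(user_path: str, base_dir: str = BASE_DIR) -> str:
    """Return the resolved path, or raise ValueError if it escapes base_dir."""
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base_dir, user_path))
    if os.path.commonpath([candidate, base_dir]) != base_dir:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

`base_dir` is expected to be an already-resolved absolute path, as `BASE_DIR` is here; an absolute `user_path` replaces the base inside `os.path.join` and is then rejected by the containment check.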
+
+## Insecure Direct Object Reference (IDOR)
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | API endpoint uses user-supplied ID without ownership check: `GET /api/order/{order_id}` — returns any user's order |
+| **Safe** | Ownership verification: verify `order.user_id == current_user.id` before returning data |
+| **Look for** | CRUD endpoints that accept IDs without authorization; horizontal/vertical privilege checks missing |
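The ownership check in the "Safe" row might look like the sketch below. The in-memory `ORDERS` store, the `Order` dataclass, and `get_order` are all hypothetical stand-ins for a real data layer and request handler.

```python
from dataclasses import dataclass


@dataclass
class Order:
    id: int
    user_id: int
    total: float


# Hypothetical in-memory store standing in for a database.
ORDERS = {
    1: Order(id=1, user_id=10, total=25.0),
    2: Order(id=2, user_id=20, total=99.0),
}


def get_order(order_id: int, current_user_id: int) -> Order:
    """Return the order only if it belongs to the requesting user."""
    order = ORDERS.get(order_id)
    # Missing and not-owned are treated alike so that IDs cannot be probed.
    if order is None or order.user_id != current_user_id:
        raise PermissionError("order not found")
    return order
```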
+
+## Weak Cryptography
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | MD5/SHA1 for passwords; ECB mode; hardcoded keys; `random` module (not `secrets`); short key lengths |
+| **Safe** | `bcrypt`/`argon2` for passwords; AES-GCM; `secrets` module; RSA 2048+; proper IV generation |
+| **Look for** | `md5`, `sha1`, `DES`, `ECB`, `PKCS1_v1_5`, `random` for crypto, hardcoded `key=`, `Crypto.Cipher` without AEAD |
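For the password and randomness rows, a standard-library sketch is shown below. Because `bcrypt` and `argon2` need third-party packages, PBKDF2-HMAC-SHA256 from `hashlib` is swapped in here; that substitution, the iteration count, and the function names are assumptions of this example rather than recommendations from the added file.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

PBKDF2_ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a slow, salted hash instead of a bare MD5/SHA1 digest."""
    # secrets draws from the OS CSPRNG, unlike the random module.
    salt = salt if salt is not None else secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, PBKDF2_ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)
```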
+
+## Secrets Exposure
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | Hardcoded API keys, passwords, tokens in source code; secrets in logs; secrets in client-side code |
+| **Safe** | Environment variables; secret manager (AWS Secrets Manager, HashiCorp Vault); `.env` excluded from VCS |
+| **Look for** | `API_KEY=`, `password=`, `secret=`, `token=` in code; AWS keys, GitHub tokens, Stripe keys, JWTs in source |
+| **False-positive guard** | Secrets stored on disk but otherwise secured ARE excluded. Logging high-value secrets IS a vuln. Logging URLs is safe. |
+
+## Template Injection (SSTI)
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | User input in template rendering: `Template(user_input).render()`, `render_template_string(user_input)` |
+| **Safe** | Static templates; input passed as context variable, not template string |
+| **Look for** | `render_template_string`, `Template()()` with user string; `eval` in template context; `${user_input}` in JS template literals on server |
+
+## NoSQL Injection
+
+| Aspect | Detail |
+|--------|--------|
+| **Vulnerable** | User input in MongoDB queries: `db.users.find({username: user_input})` where input is `{"$gt": ""}` |
+| **Safe** | Schema validation; type checking on query params; ORM sanitization |
+| **Look for** | MongoDB `$where`, `$gt`, `$regex` from user input; raw mongo queries without type coercion |
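The "type checking on query params" entry can be sketched without a database driver: the filter is built only when the decoded input is a plain string, so an operator object such as `{"$gt": ""}` never reaches the query. The function name `safe_username_filter` is illustrative.

```python
def safe_username_filter(raw) -> dict:
    """Build a MongoDB-style filter, rejecting anything that is not a plain string."""
    if not isinstance(raw, str):
        # A decoded JSON object here would be interpreted as query operators.
        raise TypeError("username must be a string")
    return {"username": raw}
```

A JSON body of `{"username": {"$gt": ""}}` decodes to a dict for `raw` and is rejected, while a string that merely contains `$gt` is still treated as a literal value.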
package/.pi/skills/build-epic/SKILL.md
CHANGED
@@ -15,10 +15,11 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
 >
 > **HARD GATE** — Not on `main`/`master` before step 3 (kickoff-branch).

-##
+## Nine steps (`epic_cycle` in state.yaml)

 | Step | Skill / action |
 |------|----------------|
+| 0 | `security-review` — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
 | 1 | `survey-context` — confirm epic + story |
 | 2 | `plan-work` — flesh out story `tasks[]` in `specs/epics/eNN-slug/epic.yaml` |
 | 3 | `kickoff-branch` — feature branch + clean baseline |
@@ -26,17 +27,18 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
 | 5 | `verify-work` — UAT + mechanical gates |
 | 6 | `audit-code` — **non-optional gate** (pass/fail; fail → loop back to step 4) |
 | 7 | `commit-message` — Conventional Commits draft |
-| 8 | `release-branch` — PR or solo land (supports `--squash-state`) |
+| 8 | `release-branch` — PR or solo land (supports `--squash-state`) | |

 ## Process

 1. Read `specs/state.yaml`, `specs/execution-status.yaml`, `specs/release-plan.yaml`, active `specs/epics/eNN-slug/epic.yaml`.
-2. **
-3. **
-
-
-
-
+2. **Step 0 — Threat Model:** Run `security-review` against the epic's scope (read from the epic capsule). Output `specs/security/epics/<epic-id>/THREAT_MODEL.md` with surface area, vulnerability categories, risk level, and mitigation guidance.
+3. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate — require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
+4. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
+5. If `epic_cycle.step` missing, set to `1`.
+6. Run **only the current step** (resume mode) unless user asked for full auto-run.
+7. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
+8. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.

 ### Step 6 — audit-code gate (non-optional)

@@ -0,0 +1,324 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: security-review
|
|
3
|
+
description: "> AI-powered security analysis of code changes — traces data flow, detects injection, auth bypass, secrets exposure, and unsafe deserialization across files. Use when reviewing pending changes, before release-branch, during verify-work Phase 5, during build-epic Step 0 threat modeling, or when the user says \"security review\" or \"scan for vulns\"."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
|
|
7
|
+
# Security Review
|
|
8
|
+
|
|
9
|
+
> **HARD GATE** — Requires git context (branch with merge-base or diff). Never
|
|
10
|
+
> writes files outside `specs/security/`. Findings below confidence 8/10 are
|
|
11
|
+
> suppressed. **→ verify:** `git rev-parse HEAD >/dev/null 2>&1 && echo "ok" || echo "BLOCKED"`
|
|
12
|
+
|
|
13
|
+
## 5-phase scan
|
|
14
|
+
|
|
15
|
+
| # | Phase | What |
|
|
16
|
+
|---|-------|------|
|
|
17
|
+
| 1 | **Scope Resolution** | Detect diff via `git diff --merge-base origin/HEAD`; resolve languages/frameworks from dependency files |
|
|
18
|
+
| 2 | **Context Research** | Identify existing security patterns, sanitization, auth model in the codebase |
|
|
19
|
+
| 3 | **Vulnerability Assessment** | Trace user input → sink; check auth boundaries, crypto, deserialization, path ops |
|
|
20
|
+
| 4 | **False-Positive Filtering** | Cross-check each finding against exclusion rules; reject confidence < 8 |
|
|
21
|
+
| 5 | **Report Generation** | Output structured markdown: file:line, severity, category, exploit scenario, fix |
|
|
22
|
+
|
|
23
|
+
## Categories
|
|
24
|
+
|
|
25
|
+
Covered: SQLi, XSS, SSRF, command injection, auth bypass, unsafe deserialization, path traversal, IDOR, crypto flaws, secrets exposure, template injection, NoSQLi
|
|
26
|
+
|
|
27
|
+
## Integration points
|
|
28
|
+
|
|
29
|
+
| Skill | Touchpoint |
|
|
30
|
+
|-------|------------|
|
|
31
|
+
| `build-epic` | Step 0 — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
|
|
32
|
+
| `plan-work` | `security:` field (none/low/medium/high) on story tasks |
|
|
33
|
+
| `plan-release` | +2 WSJF risk boost for HIGH+ risk epics |
|
|
34
|
+
| `audit-code` | Checklist: "diff scanned — no unaddressed HIGH findings" |
|
|
35
|
+
| `request-review` | Inject threat model categories + false-positive rules into reviewer prompt |
|
|
36
|
+
| `investigate-bug` | Security-impact assessment in RCA (NONE→CRITICAL) |
|
|
37
|
+
| `validate-fix` | Recurrence hardening check for security bugs |
|
|
38
|
+
| `verify-work` | Phase 5 — blocks on HIGH findings ≥ 8 confidence |
|
|
39
|
+
| `release-branch` | Hard gate — blocks merge if unresolved HIGH findings |
|
|
40
|
+
|
|
41
|
+
## Report format
|
|
42
|
+
|
|
43
|
+
Each finding: **`File:Line` — Severity — Category**
|
|
44
|
+
- Description: how the vulnerability manifests
|
|
45
|
+
- Exploit scenario: concrete attack path
|
|
46
|
+
- Recommendation: fix with code example
|
|
47
|
+
|
|
48
|
+
## Reference files
|
|
49
|
+
|
|
50
|
+
- [Vuln categories](REFERENCE-vuln-categories.md) — detection guidance per vuln type
|
|
51
|
+
- [False positives](REFERENCE-false-positives.md) — hard exclusions + precedent
|
|
52
|
+
- [Confidence rubric](REFERENCE-confidence-rubric.md) — scoring methodology (0–10)
|
|
53
|
+
|
|
54
|
+
## Verify
|
|
55
|
+
|
|
56
|
+
```bash
|
|
57
|
+
test -d specs/security && echo "OK: specs/security/ exists" || mkdir -p specs/security
|
|
58
|
+
grep -q "Merge-base\|merge.base\|git diff" SKILL.md && echo "OK: git context verified"
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
---
|
|
62
|
+
|
|
63
|
+
# Confidence Scoring Rubric
|
|
64
|
+
|
|
65
|
+
Every finding that survives Phase 4 false-positive filtering receives a confidence
|
|
66
|
+
score from 1 (speculative) to 10 (certain). Only findings ≥ 8 are reported.
|
|
67
|
+
|
|
68
|
+
## Score 9–10: Certain Exploit Path
|
|
69
|
+
|
|
70
|
+
**Criteria:**
|
|
71
|
+
- Concrete, testable exploit with clear reproduction steps
|
|
72
|
+
- No assumptions about uncommon configurations
|
|
73
|
+
- No chain of multiple unlikely conditions
|
|
74
|
+
- Attacker has full control over the input vector
|
|
75
|
+
|
|
76
|
+
**Examples:**
|
|
77
|
+
- User-supplied SQL in a `SELECT` statement with no parameterization
|
|
78
|
+
- `os.system(f"rm {user_path}")` where user controls the path
|
|
79
|
+
- Pickle deserialization of user-supplied data without any wrapping
|
|
80
|
+
|
|
81
|
+
**Severity:** HIGH
|
|
82
|
+
|
|
83
|
+
## Score 8: Clear Vulnerability Pattern
|
|
84
|
+
|
|
85
|
+
**Criteria:**
|
|
86
|
+
- Well-known vulnerability pattern with standard exploitation method
|
|
87
|
+
- Requires specific conditions but conditions are commonly met
|
|
88
|
+
- Exploitability is well-documented in OWASP / CVE databases
|
|
89
|
+
|
|
90
|
+
**Examples:**
|
|
91
|
+
- JWT without signature verification in authentication middleware
|
|
92
|
+
- SSRF where attacker controls the full URL including host
|
|
93
|
+
- Hardcoded AWS secret key in source code
|
|
94
|
+
|
|
95
|
+
**Severity:** HIGH or MEDIUM
|
|
96
|
+
|
|
97
|
+
## Score 7: Suspicious Pattern
|
|
98
|
+
|
|
99
|
+
**Criteria:**
|
|
100
|
+
- Unusual code that may indicate a vulnerability
|
|
101
|
+
- Requires specific conditions that may not be present
|
|
102
|
+
- Alternative secure interpretation is equally likely
|
|
103
|
+
- Defense-in-depth concern rather than direct exploit
|
|
104
|
+
|
|
105
|
+
**Examples:**
|
|
106
|
+
- A function accepting user input that passes through multiple layers before reaching a sink (unclear if sanitized)
|
|
107
|
+
- Custom encryption implementation (likely weak, but may not process sensitive data)
|
|
108
|
+
- Path construction that looks safe but has a subtle bypass
|
|
109
|
+
|
|
110
|
+
**Severity:** LOW or suppress
|
|
111
|
+
|
|
112
|
+
## Score < 7: Do Not Report
|
|
113
|
+
|
|
114
|
+
**Criteria:**
|
|
115
|
+
- Theoretical concern without exploit path
|
|
116
|
+
- Requires unrealistic attacker capabilities
|
|
117
|
+
- Violates one or more hard exclusion rules
|
|
118
|
+
- Better handled by separate tooling (dependency scanner, SAST, secret scanner)
|
|
119
|
+
- Purely stylistic or best-practice concern without security impact
|
|
120
|
+
|
|
121
|
+
**Examples:**
|
|
122
|
+
- "This function doesn't validate all inputs" without proving the unvalidated input is the attack surface
|
|
123
|
+
- "This uses MD5" where the hash is not used for security (e.g., cache key)
|
|
124
|
+
- "This function could consume too much memory" (DOS exclusion)
|
|
125
|
+
|
|
126
|
+
**Action:** Suppress entirely. Do not include in report.
|
|
127
|
+
|
|
128
|
+
## Severity Mapping
|
|
129
|
+
|
|
130
|
+
Once confidence ≥ 8 is confirmed, map to severity:
|
|
131
|
+
|
|
132
|
+
| Severity | Impact | Examples |
|
|
133
|
+
|----------|--------|---------|
|
|
134
|
+
| **CRITICAL** | Remote compromise, full data breach | RCE, auth bypass with admin escalation, SQLi with data exfiltration |
|
|
135
|
+
| **HIGH** | Significant security boundary crossed | SSRF to internal services, hardcoded cloud credentials, insecure deserialization |
|
|
136
|
+
| **MEDIUM** | Limited impact or requires conditions | Stored XSS behind auth, IDOR on non-sensitive data, weak but not broken crypto |
|
|
137
|
+
| **LOW** | Defense-in-depth, minimal blast radius | Missing security header, verbose error messages in non-production |
|
|
138
|
+
|
|
139
|
+
## Quality Gate
|
|
140
|
+
|
|
141
|
+
The confidence rubric double-checks each finding against three lenses:
|
|
142
|
+
|
|
143
|
+
| Lens | Question |
|
|
144
|
+
|------|----------|
|
|
145
|
+
| **Exploitability** | Can a real attacker trigger this from a trust boundary? |
|
|
146
|
+
| **Actionability** | Would a security engineer accept a fix recommendation for this? |
|
|
147
|
+
| **Precedent** | Has this type of finding passed/failed human review before? |
|
|
148
|
+
|
|
149
|
+
---
|
|
150
|
+
|
|
151
|
+
# False-Positive Exclusion Rules
|
|
152
|
+
|
|
153
|
+
Applied during Phase 4 of the scan. Findings matching any hard exclusion are
|
|
154
|
+
automatically suppressed. Precedents from prior reviews guide borderline cases.
|
|
155
|
+
|
|
156
|
+
## Hard Exclusions
|
|
157
|
+
|
|
158
|
+
Automatically exclude findings matching these patterns:
|
|
159
|
+
|
|
160
|
+
| # | Rule | Rationale |
|
|
161
|
+
|---|------|-----------|
|
|
162
|
+
| 1 | **Denial of Service (DOS)** — resource exhaustion, CPU/memory attacks | Handled separately; not actionable in code review |
|
|
163
|
+
| 2 | **Secrets on disk** if otherwise secured | Secrets management is a separate concern |
|
|
164
|
+
| 3 | **Rate limiting** concerns | Operational, not a code vulnerability |
|
|
165
|
+
| 4 | **Memory consumption / CPU exhaustion** | Not actionable in diff review |
|
|
166
|
+
| 5 | **Input validation on non-security-critical fields** without proven exploit path | Theoretical, not concrete |
|
|
167
|
+
| 6 | **GitHub Actions input sanitization** unless clearly triggerable via untrusted input | Most workflow vulns are not exploitable |
|
|
168
|
+
| 7 | **Lack of hardening measures** | Code is not expected to implement all best practices |
|
|
169
|
+
| 8 | **Race conditions / timing attacks** that are theoretical | Only report if concretely problematic |
|
|
170
|
+
| 9 | **Outdated third-party libraries** | Managed separately by dependency scanners |
|
|
171
|
+
| 10 | **Memory safety** in Rust or other memory-safe languages | Impossible by language guarantees |
|
|
172
|
+
| 11 | **Unit test files only** | Not production risk |
|
|
173
|
+
| 12 | **Log spoofing** | Outputting unsanitized input to logs is not a vuln |
|
|
174
|
+
| 13 | **SSRF that only controls path** | Only host/protocol control is exploitable |
|
|
175
|
+
| 14 | **User-controlled content in AI system prompts** | Not a security vulnerability |
|
|
176
|
+
| 15 | **Regex injection** | Injecting untrusted content into regex is not a vuln |
|
|
177
|
+
| 16 | **Regex DOS** | Excluded alongside general DOS |
|
|
178
|
+
| 17 | **Documentation files** (.md, .txt) | Insecure docs are not code vulnerabilities |
|
|
179
|
+
| 18 | **Lack of audit logs** | Not a vulnerability |
|
|
180
|
+
|
|
181
|
+
## Precedent Rules
|
|
182
|
+
|
|
183
|
+
These guide borderline cases based on prior human review decisions:
|
|
184
|
+
|
|
185
|
+
| # | Precedent | Reasoning |
|
|
186
|
+
|---|-----------|-----------|
|
|
187
|
+
| 1 | **Logging high-value secrets in plaintext IS a vuln.** Logging URLs is safe. | Secrets in logs = credential exposure; URLs are not secrets |
|
|
188
|
+
| 2 | **UUIDs are unguessable** — no validation needed | Cryptographic property of UUID v4/v7 |
|
|
189
|
+
| 3 | **Environment variables and CLI flags are trusted values** | Attackers cannot modify these in secure environments |
|
|
190
|
+
| 4 | **Resource management issues** (memory leaks, fd leaks) are NOT valid | Operational, not security |
|
|
191
|
+
| 5 | **Tabnabbing, XS-Leaks, prototype pollution, open redirects** — do NOT report unless extremely high confidence | Subtle, low-impact, high false-positive rate |
|
|
192
|
+
| 6 | **React/Angular XSS** — safe unless `dangerouslySetInnerHTML`, `bypassSecurityTrustHtml`, etc. | Framework auto-escapes |
|
|
193
|
+
| 7 | **GitHub Action workflow vulns** — verify concrete attack path before reporting | Most are theoretical |
|
|
194
|
+
| 8 | **Client-side JS/TS auth checks** — not a vuln; server is authoritative | Client code is untrusted |
|
|
195
|
+
| 9 | **IPython notebook vulns** — only report if concrete untrusted-input trigger | Most are not exploitable |
|
|
196
|
+
| 10 | **Logging non-PII data** — not a vuln even if sensitive. Only PII/secrets/passwords. | Intent: operational logging vs credential exposure |
|
|
197
|
+
| 11 | **Shell script command injection** — only report if concrete untrusted-input path | Most shell scripts don't process untrusted input |
|
|
198
|
+
|
|
199
|
+
## Confidence Scoring
|
|
200
|
+
|
|
201
|
+
Findings that survive exclusions get a confidence score (1–10):
|
|
202
|
+
|
|
203
|
+
| Range | Meaning | Action |
|
|
204
|
+
|-------|---------|--------|
|
|
205
|
+
| 9–10 | Certain exploit path, testable | Report as HIGH |
|
|
206
|
+
| 8 | Clear vulnerability pattern | Report as HIGH/MEDIUM |
|
|
207
|
+
| 7 | Suspicious, needs conditions | Report as LOW or suppress |
|
|
208
|
+
| <7 | Too speculative | **Do not report** |
|
|
209
|
+
|
|
210
|
+
**Hard threshold:** Only report findings with confidence ≥ 8.
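
The score-to-action mapping above can be sketched as a small triage function. This is a minimal illustration, not the skill's implementation: the `Finding` record and the returned strings are hypothetical, and it follows the hard threshold, so a score of 7 is suppressed even though the table allows "LOW or suppress".

```python
from dataclasses import dataclass

REPORT_THRESHOLD = 8  # hard threshold stated in the rubric


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; the field names are illustrative."""
    title: str
    confidence: int  # 1 (speculative) to 10 (certain)


def triage(finding: Finding) -> str:
    """Return the rubric action for a finding's confidence score."""
    if not 1 <= finding.confidence <= 10:
        raise ValueError("confidence must be between 1 and 10")
    if finding.confidence >= 9:
        return "report: HIGH"
    if finding.confidence == REPORT_THRESHOLD:
        return "report: HIGH or MEDIUM"
    # Anything below the hard threshold (7 and under) is suppressed.
    return "suppress"
```

Keeping the threshold in one named constant makes it harder for the table and the reporting code to drift apart.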
|
|
211
|
+
|
|
212
|
+
## Signal Quality Criteria
|
|
213
|
+
|
|
214
|
+
For remaining findings, assess:
|
|
215
|
+
1. Is there a concrete, exploitable vulnerability with a clear attack path?
|
|
216
|
+
2. Does this represent a real security risk (vs theoretical best practice)?
|
|
217
|
+
3. Are there specific code locations and reproduction steps?
|
|
218
|
+
4. Would this finding be actionable for a security team?
|
|
219
|
+
|
|
220
|
+
---
|
|
221
|
+
|
|
222
|
+
# Vulnerability Categories — Detection Guidance
|
|
223
|
+
|
|
224
|
+
Each category: vulnerable pattern → safe pattern → code example.
|
|
225
|
+
|
|
226
|
+
## SQL Injection
|
|
227
|
+
|
|
228
|
+
| Aspect | Detail |
|
|
229
|
+
|--------|--------|
|
|
230
|
+
| **Vulnerable** | String interpolation in SQL queries: `f"SELECT * FROM users WHERE id = {uid}"` |
|
|
231
|
+
| **Safe** | Parameterized queries / ORM: `cursor.execute("SELECT * FROM users WHERE id = %s", (uid,))` |
|
|
232
|
+
| **Look for** | f-strings, `+` concatenation, `format()` in query builders; raw SQL in ORM `.raw()` / `.execute()` |
|
|
233
|
+
| **False-positive guard** | Not a FP if the input is user-controlled (HTTP param, file contents). Env vars and CLI flags are trusted (see exclusion rules). |
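
The contrast between the vulnerable and safe rows can be sketched with the standard-library `sqlite3` driver, which uses `?` placeholders rather than `%s`. The `users` table, its rows, and the `find_user` helper are hypothetical and exist only for the demonstration.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, uid) -> list:
    """Look up a user with a bound parameter; the driver never parses uid as SQL."""
    cur = conn.execute("SELECT id, name FROM users WHERE id = ?", (uid,))
    return cur.fetchall()


# Hypothetical in-memory database used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],
)

payload = "1 OR 1=1"

# Vulnerable pattern: the payload becomes part of the SQL text and matches every row.
unsafe_rows = conn.execute(
    f"SELECT id, name FROM users WHERE id = {payload}"
).fetchall()

# Safe pattern: the payload is compared as a plain value and matches nothing.
safe_rows = find_user(conn, payload)
```

With the interpolated query the payload returns both rows; with the bound parameter it returns none.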
|
|
234
|
+
|
|
235
|
+
## Cross-Site Scripting (XSS)
|
|
236
|
+
|
|
237
|
+
| Aspect | Detail |
|
|
238
|
+
|--------|--------|
|
|
239
|
+
| **Vulnerable** | `element.innerHTML = userInput`, `dangerouslySetInnerHTML={{__html: userInput}}` |
|
|
240
|
+
| **Safe** | `element.textContent = userInput`, React JSX (auto-escaped), template engines with auto-escaping |
|
|
241
|
+
| **Look for** | `.innerHTML`, `document.write()`, `dangerouslySetInnerHTML`, `v-html` (Vue), `bypassSecurityTrustHtml` (Angular) |
|
|
242
|
+
| **False-positive guard** | React/Angular components without unsafe methods are NOT vulnerable (see exclusion rules). |
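
For server-rendered HTML outside an auto-escaping framework, the safe pattern amounts to escaping at the point of output. A minimal sketch using the standard-library `html.escape`; the `render_comment` helper is hypothetical.

```python
import html


def render_comment(user_input: str) -> str:
    """Wrap user text in a paragraph, escaping HTML metacharacters first."""
    # html.escape converts &, <, > and (by default) both quote characters.
    return f"<p>{html.escape(user_input)}</p>"
```

This covers HTML element content only; values placed in URLs, inline scripts, or CSS need context-specific encoding.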
|
|
243
|
+
|
|
244
|
+
## Server-Side Request Forgery (SSRF)
|
|
245
|
+
|
|
246
|
+
| Aspect | Detail |
|
|
247
|
+
|--------|--------|
|
|
248
|
+
| **Vulnerable** | User-controlled URL passed to server-side HTTP client: `requests.get(user_url)` |
|
|
249
|
+
| **Safe** | URL allowlist validation, internal-network blocking, protocol/host restriction |
|
|
250
|
+
| **Look for** | User input → `fetch`, `requests.get`, `axios.get`, `urllib`, `curl`, `http.get`; host control only (path-only is excluded) |
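
The allowlist and protocol restriction in the Safe row might look like the following sketch. The allowlisted host and the `is_allowed_url` helper are assumptions for illustration; the check does not cover redirects or DNS rebinding, which need handling in the HTTP client itself.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"api.example.com"}  # hypothetical allowlist


def is_allowed_url(url: str) -> bool:
    """Accept a URL only if both its scheme and its host are allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    # hostname strips userinfo and port, so "user@evil.test" tricks do not pass.
    return parts.scheme in ALLOWED_SCHEMES and parts.hostname in ALLOWED_HOSTS
```

Because the decision is made on the parsed host, a caller who controls only the path cannot change where the request goes, which matches the path-only exclusion.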
|
|
251
|
+
|
|
252
|
+
## Command Injection
|
|
253
|
+
|
|
254
|
+
| Aspect | Detail |
|
|
255
|
+
|--------|--------|
|
|
256
|
+
| **Vulnerable** | User input in shell commands: `os.system(f"ping {host}")`, `subprocess.run(f"grep {pattern} file", shell=True)` |
|
|
257
|
+
| **Safe** | `subprocess.run(["ping", host])` with arguments as list; `shlex.quote()` |
|
|
258
|
+
| **Look for** | `shell=True`, `os.system`, `os.popen`, `exec()`, `eval()`, `$()`, backticks |
|
|
259
|
+
| **False-positive guard** | Shell scripts without untrusted user input are generally not exploitable. |
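
The list-argument form in the Safe row can be demonstrated without a real `ping`. In this sketch the child process is the current Python interpreter, which simply echoes its argument back; `echo_argument` is a hypothetical helper.

```python
import subprocess
import sys


def echo_argument(value: str) -> str:
    """Pass value to a child process as one argument, with no shell involved."""
    result = subprocess.run(
        # Each list element becomes exactly one argv entry in the child.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")
```

A value such as `x; echo injected` comes back verbatim because no shell ever interprets the `;`.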
|
|
260
|
+
|
|
261
|
+
## Authentication/Authorization Bypass
|
|
262
|
+
|
|
263
|
+
| Aspect | Detail |
|
|
264
|
+
|--------|--------|
|
|
265
|
+
| **Vulnerable** | Missing auth check on protected endpoint; JWT without signature verification; hardcoded admin tokens |
|
|
266
|
+
| **Safe** | Consistent auth middleware; JWT with `RS256`/`HS256` verification; role-based access control |
|
|
267
|
+
| **Look for** | Routes without auth decorators; `@login_required` / `@require_auth` missing; JWT without `.verify()`; client-side auth checks only |
|
|
268
|
+
|
|
269
|
+
## Unsafe Deserialization
|
|
270
|
+
|
|
271
|
+
| Aspect | Detail |
|
|
272
|
+
|--------|--------|
|
|
273
|
+
| **Vulnerable** | `pickle.load(user_data)`, `yaml.load(user_input)`, `JSON.parse()` on untrusted tokens, `eval(input())` |
|
|
274
|
+
| **Safe** | `yaml.safe_load()`, `json.loads()` (safe for JSON), `torch.load(..., weights_only=True)` (PyTorch), schema validation |
|
|
275
|
+
| **Look for** | `pickle.load`, `yaml.load` (not safe_load), `torch.load(weights_only=False)`, `eval`, `marshal.load`, `node-serialize` |
|
|
276
|
+
|
|
277
|
+
## Path Traversal
|
|
278
|
+
|
|
279
|
+
| Aspect | Detail |
|
|
280
|
+
|--------|--------|
|
|
281
|
+
| **Vulnerable** | User input in file paths: `open(f"/data/{filename}")`, `path.join(base, user_path)` |
|
|
282
|
+
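| **Safe** | Path normalization + containment check: `os.path.commonpath([BASE_DIR, os.path.realpath(path)]) == BASE_DIR`, with `BASE_DIR` already resolved (a bare `startswith(BASE_DIR)` also matches sibling directories such as `BASE_DIR + "-old"`); allowlist of valid filenames |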
|
|
283
|
+
| **Look for** | `open()`, `read_file()`, `os.path.join` with user input; `../` traversal without normalization |
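
The normalization and containment check can be sketched as below. It compares path components with `os.path.commonpath` rather than string prefixes, since a plain prefix test would also accept a sibling such as `/srv/data-evil`. The `resolve_under` helper and the paths are hypothetical.

```python
import os


def resolve_under(base_dir: str, user_path: str) -> str:
    """Resolve user_path inside base_dir, or raise ValueError if it escapes."""
    base = os.path.realpath(base_dir)
    # realpath collapses "..", resolves symlinks, and honors absolute user paths.
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory")
    return candidate
```

Callers would open the returned path, never the raw user input.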
|
|
284
|
+
|
|
285
|
+
## Insecure Direct Object Reference (IDOR)
|
|
286
|
+
|
|
287
|
+
| Aspect | Detail |
|
|
288
|
+
|--------|--------|
|
|
289
|
+
| **Vulnerable** | API endpoint uses user-supplied ID without ownership check: `GET /api/order/{order_id}` — returns any user's order |
|
|
290
|
+
| **Safe** | Ownership verification: verify `order.user_id == current_user.id` before returning data |
|
|
291
|
+
| **Look for** | CRUD endpoints that accept IDs without authorization; horizontal/vertical privilege checks missing |
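
The ownership verification in the Safe row can be sketched as follows. The `Order` record, the in-memory `ORDERS` store, and the `Forbidden` exception are hypothetical stand-ins for a real data layer and web framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    id: int
    user_id: int
    total_cents: int


class Forbidden(Exception):
    """Raised when the caller may not see the requested object."""


# Hypothetical store standing in for a database table.
ORDERS = {1: Order(1, 10, 2500), 2: Order(2, 20, 990)}


def get_order(order_id: int, current_user_id: int) -> Order:
    """Return the order only if it exists and belongs to the caller."""
    order = ORDERS.get(order_id)
    # Missing and foreign orders fail identically, so IDs cannot be probed.
    if order is None or order.user_id != current_user_id:
        raise Forbidden("order not found or not owned by caller")
    return order
```

Returning the same error for "missing" and "not yours" is a design choice that avoids confirming which IDs exist.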
|
|
292
|
+
|
|
293
|
+
## Weak Cryptography
|
|
294
|
+
|
|
295
|
+
| Aspect | Detail |
|
|
296
|
+
|--------|--------|
|
|
297
|
+
| **Vulnerable** | MD5/SHA1 for passwords; ECB mode; hardcoded keys; `random` module (not `secrets`); short key lengths |
|
|
298
|
+
| **Safe** | `bcrypt`/`argon2` for passwords; AES-GCM; `secrets` module; RSA 2048+; proper IV generation |
|
|
299
|
+
| **Look for** | `md5`, `sha1`, `DES`, `ECB`, `PKCS1_v1_5`, `random` for crypto, hardcoded `key=`, `Crypto.Cipher` without AEAD |
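
A standard-library sketch of the safe direction: a salted, slow key-derivation function for passwords and the `secrets` module for tokens. PBKDF2 is used here only because it ships with Python; it is a stand-in for the `bcrypt`/`argon2` options named in the table, which are third-party packages. The helper names and the iteration count are illustrative assumptions.

```python
import hashlib
import hmac
import secrets

PBKDF2_ITERATIONS = 600_000  # illustrative work factor; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a password using salted PBKDF2-HMAC-SHA256."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return hmac.compare_digest(digest, expected)


def new_session_token() -> str:
    """Generate an unpredictable token from the OS CSPRNG (not `random`)."""
    return secrets.token_urlsafe(32)
```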
|
|
300
|
+
|
|
301
|
+
## Secrets Exposure
|
|
302
|
+
|
|
303
|
+
| Aspect | Detail |
|
|
304
|
+
|--------|--------|
|
|
305
|
+
| **Vulnerable** | Hardcoded API keys, passwords, tokens in source code; secrets in logs; secrets in client-side code |
|
|
306
|
+
| **Safe** | Environment variables; secret manager (AWS Secrets Manager, HashiCorp Vault); `.env` excluded from VCS |
|
|
307
|
+
| **Look for** | `API_KEY=`, `password=`, `secret=`, `token=` in code; AWS keys, GitHub tokens, Stripe keys, JWTs in source |
|
|
308
|
+
| **False-positive guard** | Secrets stored on disk but otherwise secured ARE excluded. Logging high-value secrets IS a vuln. Logging URLs is safe. |
|
|
309
|
+
|
|
310
|
+
## Template Injection (SSTI)
|
|
311
|
+
|
|
312
|
+
| Aspect | Detail |
|
|
313
|
+
|--------|--------|
|
|
314
|
+
| **Vulnerable** | User input in template rendering: `Template(user_input).render()`, `render_template_string(user_input)` |
|
|
315
|
+
| **Safe** | Static templates; input passed as context variable, not template string |
|
|
316
|
+
| **Look for** | `render_template_string`, `Template()` with user string; `eval` in template context; `${user_input}` in JS template literals on server |
|
|
317
|
+
|
|
318
|
+
## NoSQL Injection
|
|
319
|
+
|
|
320
|
+
| Aspect | Detail |
|
|
321
|
+
|--------|--------|
|
|
322
|
+
| **Vulnerable** | User input in MongoDB queries: `db.users.find({username: user_input})` where input is `{"$gt": ""}` |
|
|
323
|
+
| **Safe** | Schema validation; type checking on query params; ORM sanitization |
|
|
324
|
+
| **Look for** | MongoDB `$where`, `$gt`, `$regex` from user input; raw mongo queries without type coercion |
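
The type check in the Safe row can be sketched without a database driver: the operator payload `{"$gt": ""}` only works if a dict reaches the query, so rejecting non-strings before building the filter closes that path. `build_username_filter` is a hypothetical helper.

```python
def build_username_filter(username: object) -> dict:
    """Build a MongoDB-style filter, accepting only plain strings."""
    if not isinstance(username, str):
        # A dict such as {"$gt": ""} would otherwise be read as a query operator.
        raise TypeError("username must be a string")
    return {"username": username}
```

The returned dict would then be passed to the driver's `find` call in place of the raw request value.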
|
package/CHANGELOG.md
CHANGED
|
@@ -1,3 +1,17 @@
|
|
|
1
|
+
# [2.36.0](https://github.com/danielvm-git/bigpowers/compare/v2.35.0...v2.36.0) (2026-06-27)
|
|
2
|
+
|
|
3
|
+
|
|
4
|
+
### Features
|
|
5
|
+
|
|
6
|
+
* **build-epic:** add Step 0 threat model to epic cycle ([9977f3e](https://github.com/danielvm-git/bigpowers/commit/9977f3e491b9cdefef644e145fa2331fe0ce1154))
|
|
7
|
+
|
|
8
|
+
# [2.35.0](https://github.com/danielvm-git/bigpowers/compare/v2.34.2...v2.35.0) (2026-06-27)
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
### Features
|
|
12
|
+
|
|
13
|
+
* **security-review:** add security-review skill with lifecycle integration ([932171a](https://github.com/danielvm-git/bigpowers/commit/932171a6526c4f9465c0d3768877dd2ad7775917))
|
|
14
|
+
|
|
1
15
|
## [2.34.2](https://github.com/danielvm-git/bigpowers/compare/v2.34.1...v2.34.2) (2026-06-27)
|
|
2
16
|
|
|
3
17
|
|
package/SKILL-INDEX.md
CHANGED
|
@@ -3,8 +3,8 @@
|
|
|
3
3
|
> **DO NOT EDIT** — This file is auto-generated by `scripts/generate-skill-index.sh`.
|
|
4
4
|
> Edit `SKILL.md` source files or `skills-lock.json` instead. Run `bash scripts/sync-skills.sh` to regenerate.
|
|
5
5
|
|
|
6
|
-
**Generated:** 2026-06-
|
|
7
|
-
**Skills:**
|
|
6
|
+
**Generated:** 2026-06-27T16:37:32Z
|
|
7
|
+
**Skills:** 71
|
|
8
8
|
|
|
9
9
|
---
|
|
10
10
|
|
package/build-epic/SKILL.md
CHANGED
|
@@ -14,10 +14,11 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
|
|
|
14
14
|
>
|
|
15
15
|
> **HARD GATE** — Not on `main`/`master` before step 3 (kickoff-branch).
|
|
16
16
|
|
|
17
|
-
##
|
|
17
|
+
## Nine steps (`epic_cycle` in state.yaml)
|
|
18
18
|
|
|
19
19
|
| Step | Skill / action |
|
|
20
20
|
|------|----------------|
|
|
21
|
+
| 0 | `security-review` — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
|
|
21
22
|
| 1 | `survey-context` — confirm epic + story |
|
|
22
23
|
| 2 | `plan-work` — flesh out story `tasks[]` in `specs/epics/eNN-slug/epic.yaml` |
|
|
23
24
|
| 3 | `kickoff-branch` — feature branch + clean baseline |
|
|
@@ -25,17 +26,18 @@ Orchestrates the **build** flow for a single epic: survey → plan tasks → kic
|
|
|
25
26
|
| 5 | `verify-work` — UAT + mechanical gates |
|
|
26
27
|
| 6 | `audit-code` — **non-optional gate** (pass/fail; fail → loop back to step 4) |
|
|
27
28
|
| 7 | `commit-message` — Conventional Commits draft |
|
|
28
|
-
| 8 | `release-branch` — PR or solo land (supports `--squash-state`) |
|
|
29
|
+
| 8 | `release-branch` — PR or solo land (supports `--squash-state`) | |
|
|
29
30
|
|
|
30
31
|
## Process
|
|
31
32
|
|
|
32
33
|
1. Read `specs/state.yaml`, `specs/execution-status.yaml`, `specs/release-plan.yaml`, active `specs/epics/eNN-slug/epic.yaml`.
|
|
33
|
-
2. **
|
|
34
|
-
3. **
|
|
35
|
-
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
34
|
+
2. **Step 0 — Threat Model:** Run `security-review` against the epic's scope (read from the epic capsule). Output `specs/security/epics/<epic-id>/THREAT_MODEL.md` with surface area, vulnerability categories, risk level, and mitigation guidance.
|
|
35
|
+
3. **Assess Impact (Step 2):** Before writing tasks, run `assess-impact --lightweight` on the proposed change. If the risk score exceeds 7, gate — require a `grill-me` session. Write the impact report to `specs/IMPACT-<epic>-<story>.md`. For net-new code with no existing dependents, skip.
|
|
36
|
+
4. **BCP Tracking (Step 2):** After `plan-work` completes, read the `bcps:` count (Business Complexity Points story size) from the epic capsule and carry it into `state.yaml` as `epic_cycle.story_bcps = N`.
|
|
37
|
+
5. If `epic_cycle.step` is missing, set it to `1`.
|
|
38
|
+
6. Run **only the current step** (resume mode) unless user asked for full auto-run.
|
|
39
|
+
7. After step verify passes, increment `epic_cycle.step` in `state.yaml` (or `bash scripts/bp-yaml-set.sh` if available).
|
|
40
|
+
8. On story complete, set `execution-status.yaml` story key to `done`; run `bash scripts/sync-status-from-epics.sh`.
|
|
39
41
|
|
|
40
42
|
### Step 6 — audit-code gate (non-optional)
|
|
41
43
|
|
package/package.json
CHANGED
package/security-review/REFERENCE-confidence-rubric.md
|
@@ -0,0 +1,85 @@
|
|
|
1
|
+
# Confidence Scoring Rubric
|
|
2
|
+
|
|
3
|
+
Every finding that survives Phase 4 false-positive filtering receives a confidence
|
|
4
|
+
score from 1 (speculative) to 10 (certain). Only findings ≥ 8 are reported.
|
|
5
|
+
|
|
6
|
+
## Score 9–10: Certain Exploit Path
|
|
7
|
+
|
|
8
|
+
**Criteria:**
|
|
9
|
+
- Concrete, testable exploit with clear reproduction steps
|
|
10
|
+
- No assumptions about uncommon configurations
|
|
11
|
+
- No chain of multiple unlikely conditions
|
|
12
|
+
- Attacker has full control over the input vector
|
|
13
|
+
|
|
14
|
+
**Examples:**
|
|
15
|
+
- User-supplied SQL in a `SELECT` statement with no parameterization
|
|
16
|
+
- `os.system(f"rm {user_path}")` where user controls the path
|
|
17
|
+
- Pickle deserialization of user-supplied data without any wrapping
|
|
18
|
+
|
|
19
|
+
**Severity:** HIGH
|
|
20
|
+
|
|
21
|
+
## Score 8: Clear Vulnerability Pattern
|
|
22
|
+
|
|
23
|
+
**Criteria:**
|
|
24
|
+
- Well-known vulnerability pattern with standard exploitation method
|
|
25
|
+
- Requires specific conditions but conditions are commonly met
|
|
26
|
+
- Exploitability is well-documented in OWASP / CVE databases
|
|
27
|
+
|
|
28
|
+
**Examples:**
|
|
29
|
+
- JWT without signature verification in authentication middleware
|
|
30
|
+
- SSRF where attacker controls the full URL including host
|
|
31
|
+
- Hardcoded AWS secret key in source code
|
|
32
|
+
|
|
33
|
+
**Severity:** HIGH or MEDIUM
|
|
34
|
+
|
|
35
|
+
## Score 7: Suspicious Pattern
|
|
36
|
+
|
|
37
|
+
**Criteria:**
|
|
38
|
+
- Unusual code that may indicate a vulnerability
|
|
39
|
+
- Requires specific conditions that may not be present
|
|
40
|
+
- Alternative secure interpretation is equally likely
|
|
41
|
+
- Defense-in-depth concern rather than direct exploit
|
|
42
|
+
|
|
43
|
+
**Examples:**
|
|
44
|
+
- A function accepting user input that passes through multiple layers before reaching a sink (unclear if sanitized)
|
|
45
|
+
- Custom encryption implementation (likely weak, but may not process sensitive data)
|
|
46
|
+
- Path construction that looks safe but has a subtle bypass
|
|
47
|
+
|
|
48
|
+
**Severity:** LOW or suppress
|
|
49
|
+
|
|
50
|
+
## Score < 7: Do Not Report
|
|
51
|
+
|
|
52
|
+
**Criteria:**
|
|
53
|
+
- Theoretical concern without exploit path
|
|
54
|
+
- Requires unrealistic attacker capabilities
|
|
55
|
+
- Violates one or more hard exclusion rules
|
|
56
|
+
- Better handled by separate tooling (dependency scanner, SAST, secret scanner)
|
|
57
|
+
- Purely stylistic or best-practice concern without security impact
|
|
58
|
+
|
|
59
|
+
**Examples:**
|
|
60
|
+
- "This function doesn't validate all inputs" without proving the unvalidated input is the attack surface
|
|
61
|
+
- "This uses MD5" where the hash is not used for security (e.g., cache key)
|
|
62
|
+
- "This function could consume too much memory" (DOS exclusion)
|
|
63
|
+
|
|
64
|
+
**Action:** Suppress entirely. Do not include in report.
|
|
65
|
+
|
|
66
|
+
## Severity Mapping
|
|
67
|
+
|
|
68
|
+
Once confidence ≥ 8 is confirmed, map to severity:
|
|
69
|
+
|
|
70
|
+
| Severity | Impact | Examples |
|
|
71
|
+
|----------|--------|---------|
|
|
72
|
+
| **CRITICAL** | Remote compromise, full data breach | RCE, auth bypass with admin escalation, SQLi with data exfiltration |
|
|
73
|
+
| **HIGH** | Significant security boundary crossed | SSRF to internal services, hardcoded cloud credentials, insecure deserialization |
|
|
74
|
+
| **MEDIUM** | Limited impact or requires conditions | Stored XSS behind auth, IDOR on non-sensitive data, weak but not broken crypto |
|
|
75
|
+
| **LOW** | Defense-in-depth, minimal blast radius | Missing security header, verbose error messages in non-production |
|
|
76
|
+
|
|
77
|
+
## Quality Gate
|
|
78
|
+
|
|
79
|
+
The confidence rubric double-checks each finding against three lenses:
|
|
80
|
+
|
|
81
|
+
| Lens | Question |
|
|
82
|
+
|------|----------|
|
|
83
|
+
| **Exploitability** | Can a real attacker trigger this from a trust boundary? |
|
|
84
|
+
| **Actionability** | Would a security engineer accept a fix recommendation for this? |
|
|
85
|
+
| **Precedent** | Has this type of finding passed/failed human review before? |
|
|
package/security-review/REFERENCE-false-positives.md
@@ -0,0 +1,68 @@
|
|
|
1
|
+
# False-Positive Exclusion Rules
|
|
2
|
+
|
|
3
|
+
Applied during Phase 4 of the scan. Findings matching any hard exclusion are
|
|
4
|
+
automatically suppressed. Precedents from prior reviews guide borderline cases.
|
|
5
|
+
|
|
6
|
+
## Hard Exclusions
|
|
7
|
+
|
|
8
|
+
Automatically exclude findings matching these patterns:
|
|
9
|
+
|
|
10
|
+
| # | Rule | Rationale |
|
|
11
|
+
|---|------|-----------|
|
|
12
|
+
| 1 | **Denial of Service (DOS)** — resource exhaustion, CPU/memory attacks | Handled separately; not actionable in code review |
|
|
13
|
+
| 2 | **Secrets on disk** if otherwise secured | Secrets management is a separate concern |
|
|
14
|
+
| 3 | **Rate limiting** concerns | Operational, not a code vulnerability |
|
|
15
|
+
| 4 | **Memory consumption / CPU exhaustion** | Not actionable in diff review |
|
|
16
|
+
| 5 | **Input validation on non-security-critical fields** without proven exploit path | Theoretical, not concrete |
|
|
17
|
+
| 6 | **GitHub Actions input sanitization** unless clearly triggerable via untrusted input | Most workflow vulns are not exploitable |
|
|
18
|
+
| 7 | **Lack of hardening measures** | Code is not expected to implement all best practices |
|
|
19
|
+
| 8 | **Race conditions / timing attacks** that are theoretical | Only report if concretely problematic |
|
|
20
|
+
| 9 | **Outdated third-party libraries** | Managed separately by dependency scanners |
|
|
21
|
+
| 10 | **Memory safety** in Rust or other memory-safe languages | Impossible by language guarantees |
|
|
22
|
+
| 11 | **Unit test files only** | Not production risk |
|
|
23
|
+
| 12 | **Log spoofing** | Outputting unsanitized input to logs is not a vuln |
|
|
24
|
+
| 13 | **SSRF that only controls path** | Only host/protocol control is exploitable |
|
|
25
|
+
| 14 | **User-controlled content in AI system prompts** | Not a security vulnerability |
|
|
26
|
+
| 15 | **Regex injection** | Injecting untrusted content into regex is not a vuln |
|
|
27
|
+
| 16 | **Regex DOS** | Excluded alongside general DOS |
|
|
28
|
+
| 17 | **Documentation files** (.md, .txt) | Insecure docs are not code vulnerabilities |
|
|
29
|
+
| 18 | **Lack of audit logs** | Not a vulnerability |
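
Two of the hard exclusions (rules 11 and 17) depend only on the file path, so they could be applied before any finding is scored. A rough sketch; the test-file naming patterns are assumptions about a typical Python layout, not something the rules specify.

```python
from pathlib import PurePosixPath

DOC_SUFFIXES = {".md", ".txt"}  # rule 17: documentation files


def is_excluded_path(path: str) -> bool:
    """Return True if a finding in this file is excluded by path alone."""
    p = PurePosixPath(path)
    if p.suffix.lower() in DOC_SUFFIXES:
        return True
    # Rule 11: unit test files (naming convention assumed, adjust per project).
    return (
        "tests" in p.parts
        or p.name.startswith("test_")
        or p.name.endswith("_test.py")
    )
```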
|
|
30
|
+
|
|
31
|
+
## Precedent Rules
|
|
32
|
+
|
|
33
|
+
These guide borderline cases based on prior human review decisions:
|
|
34
|
+
|
|
35
|
+
| # | Precedent | Reasoning |
|
|
36
|
+
|---|-----------|-----------|
|
|
37
|
+
| 1 | **Logging high-value secrets in plaintext IS a vuln.** Logging URLs is safe. | Secrets in logs = credential exposure; URLs are not secrets |
|
|
38
|
+
| 2 | **UUIDs are unguessable** — no validation needed | Cryptographic property of UUID v4/v7 |
|
|
39
|
+
| 3 | **Environment variables and CLI flags are trusted values** | Attackers cannot modify these in secure environments |
|
|
40
|
+
| 4 | **Resource management issues** (memory leaks, fd leaks) are NOT valid | Operational, not security |
|
|
41
|
+
| 5 | **Tabnabbing, XS-Leaks, prototype pollution, open redirects** — do NOT report unless extremely high confidence | Subtle, low-impact, high false-positive rate |
|
|
42
|
+
| 6 | **React/Angular XSS** — safe unless `dangerouslySetInnerHTML`, `bypassSecurityTrustHtml`, etc. | Framework auto-escapes |
|
|
43
|
+
| 7 | **GitHub Action workflow vulns** — verify concrete attack path before reporting | Most are theoretical |
|
|
44
|
+
| 8 | **Client-side JS/TS auth checks** — not a vuln; server is authoritative | Client code is untrusted |
|
|
45
|
+
| 9 | **IPython notebook vulns** — only report if concrete untrusted-input trigger | Most are not exploitable |
|
|
46
|
+
| 10 | **Logging non-PII data** — not a vuln even if sensitive. Only PII/secrets/passwords. | Intent: operational logging vs credential exposure |
|
|
47
|
+
| 11 | **Shell script command injection** — only report if concrete untrusted-input path | Most shell scripts don't process untrusted input |
|
|
48
|
+
|
|
49
|
+
## Confidence Scoring
|
|
50
|
+
|
|
51
|
+
Findings that survive exclusions get a confidence score (1–10):
|
|
52
|
+
|
|
53
|
+
| Range | Meaning | Action |
|
|
54
|
+
|-------|---------|--------|
|
|
55
|
+
| 9–10 | Certain exploit path, testable | Report as HIGH |
|
|
56
|
+
| 8 | Clear vulnerability pattern | Report as HIGH/MEDIUM |
|
|
57
|
+
| 7 | Suspicious, needs conditions | Report as LOW or suppress |
|
|
58
|
+
| <7 | Too speculative | **Do not report** |
|
|
59
|
+
|
|
60
|
+
**Hard threshold:** Only report findings with confidence ≥ 8.
|
|
61
|
+
|
|
62
|
+
## Signal Quality Criteria
|
|
63
|
+
|
|
64
|
+
For remaining findings, assess:
|
|
65
|
+
1. Is there a concrete, exploitable vulnerability with a clear attack path?
|
|
66
|
+
2. Does this represent a real security risk (vs theoretical best practice)?
|
|
67
|
+
3. Are there specific code locations and reproduction steps?
|
|
68
|
+
4. Would this finding be actionable for a security team?
|
|
package/security-review/REFERENCE-vuln-categories.md
@@ -0,0 +1,103 @@
|
|
|
1
|
+
# Vulnerability Categories — Detection Guidance
|
|
2
|
+
|
|
3
|
+
Each category: vulnerable pattern → safe pattern → code example.
|
|
4
|
+
|
|
5
|
+
## SQL Injection
|
|
6
|
+
|
|
7
|
+
| Aspect | Detail |
|
|
8
|
+
|--------|--------|
|
|
9
|
+
| **Vulnerable** | String interpolation in SQL queries: `f"SELECT * FROM users WHERE id = {uid}"` |
|
|
10
|
+
| **Safe** | Parameterized queries / ORM: `cursor.execute("SELECT * FROM users WHERE id = %s", (uid,))` |
|
|
11
|
+
| **Look for** | f-strings, `+` concatenation, `format()` in query builders; raw SQL in ORM `.raw()` / `.execute()` |
|
|
12
|
+
| **False-positive guard** | Not a FP if the input is user-controlled (HTTP param, file contents). Env vars and CLI flags are trusted (see exclusion rules). |
|
|
13
|
+
|
|
14
|
+
## Cross-Site Scripting (XSS)
|
|
15
|
+
|
|
16
|
+
| Aspect | Detail |
|
|
17
|
+
|--------|--------|
|
|
18
|
+
| **Vulnerable** | `element.innerHTML = userInput`, `dangerouslySetInnerHTML={{__html: userInput}}` |
|
|
19
|
+
| **Safe** | `element.textContent = userInput`, React JSX (auto-escaped), template engines with auto-escaping |
|
|
20
|
+
| **Look for** | `.innerHTML`, `document.write()`, `dangerouslySetInnerHTML`, `v-html` (Vue), `bypassSecurityTrustHtml` (Angular) |
|
|
21
|
+
| **False-positive guard** | React/Angular components without unsafe methods are NOT vulnerable (see exclusion rules). |
|
|
22
|
+
|
|
23
|
+
## Server-Side Request Forgery (SSRF)
|
|
24
|
+
|
|
25
|
+
| Aspect | Detail |
|
|
26
|
+
|--------|--------|
|
|
27
|
+
| **Vulnerable** | User-controlled URL passed to server-side HTTP client: `requests.get(user_url)` |
|
|
28
|
+
| **Safe** | URL allowlist validation, internal-network blocking, protocol/host restriction |
|
|
29
|
+
| **Look for** | User input → `fetch`, `requests.get`, `axios.get`, `urllib`, `curl`, `http.get`; host control only (path-only is excluded) |
|
|
30
|
+
|
|
31
|
+
## Command Injection
|
|
32
|
+
|
|
33
|
+
| Aspect | Detail |
|
|
34
|
+
|--------|--------|
|
|
35
|
+
| **Vulnerable** | User input in shell commands: `os.system(f"ping {host}")`, `subprocess.run(f"grep {pattern} file", shell=True)` |
|
|
36
|
+
| **Safe** | `subprocess.run(["ping", host])` with arguments as list; `shlex.quote()` |
|
|
37
|
+
| **Look for** | `shell=True`, `os.system`, `os.popen`, `exec()`, `eval()`, `$()`, backticks |
|
|
38
|
+
| **False-positive guard** | Shell scripts without untrusted user input are generally not exploitable. |
|
|
39
|
+
|
|
40
|
+
## Authentication/Authorization Bypass
|
|
41
|
+
|
|
42
|
+
| Aspect | Detail |
|
|
43
|
+
|--------|--------|
|
|
44
|
+
| **Vulnerable** | Missing auth check on protected endpoint; JWT without signature verification; hardcoded admin tokens |
|
|
45
|
+
| **Safe** | Consistent auth middleware; JWT with `RS256`/`HS256` verification; role-based access control |
|
|
46
|
+
| **Look for** | Routes without auth decorators; `@login_required` / `@require_auth` missing; JWT without `.verify()`; client-side auth checks only |
|
|
47
|
+
|
|
48
|
+
## Unsafe Deserialization
|
|
49
|
+
|
|
50
|
+
| Aspect | Detail |
|
|
51
|
+
|--------|--------|
|
|
52
|
+
| **Vulnerable** | `pickle.load(user_data)`, `yaml.load(user_input)`, `JSON.parse()` on untrusted tokens, `eval(input())` |
|
|
53
|
+
| **Safe** | `yaml.safe_load()`, `json.loads()` (safe for JSON), `torch.load(..., weights_only=True)` (PyTorch), schema validation |
|
|
54
|
+
| **Look for** | `pickle.load`, `yaml.load` (not safe_load), `torch.load(weights_only=False)`, `eval`, `marshal.load`, `node-serialize` |
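
A data-only format plus explicit schema checks is one way to follow the Safe row. The sketch below parses JSON and validates field types by hand; the `load_profile` helper and its two fields are hypothetical.

```python
import json


def load_profile(raw: bytes) -> dict:
    """Parse untrusted bytes as JSON and keep only validated fields."""
    data = json.loads(raw)  # produces plain data types, never arbitrary objects
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name, age = data.get("name"), data.get("age")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not isinstance(name, str) or isinstance(age, bool) or not isinstance(age, int):
        raise ValueError("invalid profile fields")
    return {"name": name, "age": age}
```

Unknown keys are dropped rather than passed through, so later code only sees fields it expects.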
|
|
55
|
+
|
|
56
|
+
## Path Traversal
|
|
57
|
+
|
|
58
|
+
| Aspect | Detail |
|
|
59
|
+
|--------|--------|
|
|
60
|
+
| **Vulnerable** | User input in file paths: `open(f"/data/{filename}")`, `path.join(base, user_path)` |
|
|
61
|
+
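| **Safe** | Path normalization + containment check: `os.path.commonpath([BASE_DIR, os.path.realpath(path)]) == BASE_DIR`, with `BASE_DIR` already resolved (a bare `startswith(BASE_DIR)` also matches sibling directories such as `BASE_DIR + "-old"`); allowlist of valid filenames |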
|
|
62
|
+
| **Look for** | `open()`, `read_file()`, `os.path.join` with user input; `../` traversal without normalization |
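
The normalization-plus-containment check can be sketched as below. `resolve_inside` is a hypothetical helper. It uses `os.path.commonpath` rather than a string prefix test, on the assumption that sibling directories sharing a prefix (for example `/data` and `/data-old`) must also be rejected.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Resolve user_path under base_dir, refusing anything that escapes it."""
    base = os.path.realpath(base_dir)
    # realpath collapses "../" segments and follows symlinks before the check.
    target = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, target]) != base:
        raise PermissionError(f"path escapes base directory: {user_path!r}")
    return target
```

Absolute inputs such as `/etc/passwd` are rejected as well, because `os.path.join` discards the base when the second argument is absolute and the containment check then fails.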
|
|
63
|
+
|
|
64
|
+
## Insecure Direct Object Reference (IDOR)
|
|
65
|
+
|
|
66
|
+
| Aspect | Detail |
|
|
67
|
+
|--------|--------|
|
|
68
|
+
| **Vulnerable** | API endpoint uses user-supplied ID without ownership check: `GET /api/order/{order_id}` — returns any user's order |
|
|
69
|
+
| **Safe** | Ownership verification: verify `order.user_id == current_user.id` before returning data |
|
|
70
|
+
| **Look for** | CRUD endpoints that accept IDs without authorization; horizontal/vertical privilege checks missing |
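
The ownership check in the "Safe" row might look like the sketch below. The `Order` type, the in-memory `ORDERS` store, and `get_order` are all hypothetical stand-ins for a real data layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    user_id: int
    total_cents: int


# Hypothetical in-memory store standing in for a database table.
ORDERS = {1: Order(1, 10, 2500), 2: Order(2, 20, 990)}


def get_order(order_id: int, current_user_id: int) -> Order:
    """Return the order only when it belongs to the requesting user."""
    order = ORDERS.get(order_id)
    # Same error for "missing" and "not yours", so valid IDs are not leaked.
    if order is None or order.user_id != current_user_id:
        raise LookupError("order not found")
    return order
```

Raising the same error in both cases is a design choice: it avoids confirming to an attacker which order IDs exist.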
|
|
71
|
+
|
|
72
|
+
## Weak Cryptography
|
|
73
|
+
|
|
74
|
+
| Aspect | Detail |
|
|
75
|
+
|--------|--------|
|
|
76
|
+
| **Vulnerable** | MD5/SHA1 for passwords; ECB mode; hardcoded keys; `random` module (not `secrets`); short key lengths |
|
|
77
|
+
| **Safe** | `bcrypt`/`argon2` for passwords; AES-GCM; `secrets` module; RSA 2048+; proper IV generation |
|
|
78
|
+
| **Look for** | `md5`, `sha1`, `DES`, `ECB`, `PKCS1_v1_5`, `random` for crypto, hardcoded `key=`, `Crypto.Cipher` without AEAD |
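
As a rough illustration of the "Safe" row using only the standard library: the table recommends `bcrypt` or `argon2`, which are third-party packages, so this sketch swaps in PBKDF2-HMAC-SHA256 from `hashlib` instead. The function names and the iteration count are assumptions made for the example.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # illustrative work factor; tune for your hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Derive a salted hash; PBKDF2 stands in here for bcrypt/argon2."""
    salt = secrets.token_bytes(16)  # CSPRNG, not the `random` module
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)  # constant-time comparison


session_token = secrets.token_urlsafe(32)  # unpredictable token for sessions or resets
```

The per-password salt and the slow key-derivation function are what separate this from the MD5/SHA1 pattern flagged as vulnerable.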
|
|
79
|
+
|
|
80
|
+
## Secrets Exposure
|
|
81
|
+
|
|
82
|
+
| Aspect | Detail |
|
|
83
|
+
|--------|--------|
|
|
84
|
+
| **Vulnerable** | Hardcoded API keys, passwords, tokens in source code; secrets in logs; secrets in client-side code |
|
|
85
|
+
| **Safe** | Environment variables; secret manager (AWS Secrets Manager, HashiCorp Vault); `.env` excluded from VCS |
|
|
86
|
+
| **Look for** | `API_KEY=`, `password=`, `secret=`, `token=` in code; AWS keys, GitHub tokens, Stripe keys, JWTs in source |
|
|
87
|
+
| **False-positive guard** | Secrets stored on disk but otherwise secured ARE excluded from findings. Logging high-value secrets IS a vuln. Logging URLs is not. |
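
A small sketch of the environment-variable approach, with a redaction helper for log lines. The variable name `SERVICE_API_KEY` and both function names are hypothetical.

```python
import os


def load_api_key(name: str = "SERVICE_API_KEY") -> str:
    """Read the secret from the environment instead of hardcoding it."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set")
    return value


def redact(secret: str) -> str:
    """Mask a secret for log output, keeping at most the last four characters."""
    if len(secret) <= 8:
        return "*" * len(secret)
    return "*" * (len(secret) - 4) + secret[-4:]
```

Logging `redact(key)` rather than `key` keeps enough of the value to tell keys apart without writing the secret to disk.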
|
|
88
|
+
|
|
89
|
+
## Template Injection (SSTI)
|
|
90
|
+
|
|
91
|
+
| Aspect | Detail |
|
|
92
|
+
|--------|--------|
|
|
93
|
+
| **Vulnerable** | User input in template rendering: `Template(user_input).render()`, `render_template_string(user_input)` |
|
|
94
|
+
| **Safe** | Static templates; input passed as context variable, not template string |
|
|
95
|
+
| **Look for** | `render_template_string`, `Template()` with user string; `eval` in template context; `${user_input}` in JS template literals on server |
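
The "input passed as context variable" rule can be illustrated with the standard library's `string.Template`. It is used here only as a stand-in for a full engine such as Jinja2, which is an assumption of this sketch, and `render_greeting` is a hypothetical function.

```python
from string import Template

# The template text is a fixed constant; user input never becomes template source.
GREETING = Template("Hello, $name!")


def render_greeting(user_input: str) -> str:
    """Substitute user input as a value; it is not re-parsed as a template."""
    return GREETING.substitute(name=user_input)
```

Template-looking input such as `${admin}` or `{{7*7}}` comes out verbatim because substituted values are not scanned again for placeholders. The vulnerable form would be `Template(user_input)`, which lets the user author the template itself.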
|
|
96
|
+
|
|
97
|
+
## NoSQL Injection
|
|
98
|
+
|
|
99
|
+
| Aspect | Detail |
|
|
100
|
+
|--------|--------|
|
|
101
|
+
| **Vulnerable** | User input in MongoDB queries: `db.users.find({username: user_input})` where input is `{"$gt": ""}` |
|
|
102
|
+
| **Safe** | Schema validation; type checking on query params; ORM sanitization |
|
|
103
|
+
| **Look for** | MongoDB `$where`, `$gt`, `$regex` from user input; raw mongo queries without type coercion |
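
Type checking on query parameters, as listed in the "Safe" row, can be as small as the sketch below. `build_user_query` is a hypothetical helper that returns a filter dict of the kind passed to `find()`. No MongoDB driver is involved.

```python
def build_user_query(username: object) -> dict:
    """Build a filter only from a plain string, rejecting operator objects."""
    # A JSON body can deliver {"$gt": ""} where a string was expected.
    if not isinstance(username, str):
        raise TypeError("username must be a string")
    return {"username": username}
```

With the check in place, the `{"$gt": ""}` payload from the table raises instead of matching every document.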
|
|
@@ -0,0 +1,63 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: security-review
|
|
3
|
+
description: >
|
|
4
|
+
AI-powered security analysis of code changes — traces data flow, detects
|
|
5
|
+
injection, auth bypass, secrets exposure, and unsafe deserialization across
|
|
6
|
+
files. Use when reviewing pending changes, before release-branch, during
|
|
7
|
+
verify-work Phase 5, during build-epic Step 0 threat modeling, or when
|
|
8
|
+
the user says "security review" or "scan for vulns".
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Security Review
|
|
12
|
+
|
|
13
|
+
> **HARD GATE** — Requires git context (branch with merge-base or diff). Never
|
|
14
|
+
> writes files outside `specs/security/`. Findings below confidence 8/10 are
|
|
15
|
+
> suppressed. **→ verify:** `git rev-parse HEAD >/dev/null 2>&1 && echo "ok" || echo "BLOCKED"`
|
|
16
|
+
|
|
17
|
+
## 5-phase scan
|
|
18
|
+
|
|
19
|
+
| # | Phase | What |
|
|
20
|
+
|---|-------|------|
|
|
21
|
+
| 1 | **Scope Resolution** | Detect diff via `git diff --merge-base origin/HEAD`; resolve languages/frameworks from dependency files |
|
|
22
|
+
| 2 | **Context Research** | Identify existing security patterns, sanitization, auth model in the codebase |
|
|
23
|
+
| 3 | **Vulnerability Assessment** | Trace user input → sink; check auth boundaries, crypto, deserialization, path ops |
|
|
24
|
+
| 4 | **False-Positive Filtering** | Cross-check each finding against exclusion rules; reject confidence < 8 |
|
|
25
|
+
| 5 | **Report Generation** | Output structured markdown: file:line, severity, category, exploit scenario, fix |
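
Phase 4's confidence cut-off can be sketched as a simple filter. The `Finding` fields loosely mirror the report format described in this file, but the class, the sample data, and the function name are assumptions made for illustration.

```python
from dataclasses import dataclass

MIN_CONFIDENCE = 8  # findings scored below this are suppressed


@dataclass(frozen=True)
class Finding:
    location: str    # "file:line"
    severity: str    # e.g. "HIGH"
    category: str    # e.g. "command injection"
    confidence: int  # 0-10, per the confidence rubric


def filter_findings(findings: list[Finding]) -> list[Finding]:
    """Keep only findings at or above the confidence threshold."""
    return [f for f in findings if f.confidence >= MIN_CONFIDENCE]


# Hypothetical sample input.
candidates = [
    Finding("app/net.py:42", "HIGH", "command injection", 9),
    Finding("app/util.py:7", "MEDIUM", "weak crypto", 6),
]
reported = filter_findings(candidates)  # only the first finding survives
```

The exclusion rules in the false-positive reference would apply before this step. The sketch covers only the numeric threshold.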
|
|
26
|
+
|
|
27
|
+
## Categories
|
|
28
|
+
|
|
29
|
+
Covered: SQLi, XSS, SSRF, command injection, auth bypass, unsafe deserialization, path traversal, IDOR, crypto flaws, secrets exposure, template injection, NoSQLi
|
|
30
|
+
|
|
31
|
+
## Integration points
|
|
32
|
+
|
|
33
|
+
| Skill | Touchpoint |
|
|
34
|
+
|-------|------------|
|
|
35
|
+
| `build-epic` | Step 0 — threat-model epic scope → `specs/security/epics/<id>/THREAT_MODEL.md` |
|
|
36
|
+
| `plan-work` | `security:` field (none/low/medium/high) on story tasks |
|
|
37
|
+
| `plan-release` | +2 WSJF risk boost for HIGH+ risk epics |
|
|
38
|
+
| `audit-code` | Checklist: "diff scanned — no unaddressed HIGH findings" |
|
|
39
|
+
| `request-review` | Inject threat model categories + false-positive rules into reviewer prompt |
|
|
40
|
+
| `investigate-bug` | Security-impact assessment in RCA (NONE→CRITICAL) |
|
|
41
|
+
| `validate-fix` | Recurrence hardening check for security bugs |
|
|
42
|
+
| `verify-work` | Phase 5 — blocks on HIGH findings ≥ 8 confidence |
|
|
43
|
+
| `release-branch` | Hard gate — blocks merge if unresolved HIGH findings |
|
|
44
|
+
|
|
45
|
+
## Report format
|
|
46
|
+
|
|
47
|
+
Each finding: **`File:Line` — Severity — Category**
|
|
48
|
+
- Description: how the vulnerability manifests
|
|
49
|
+
- Exploit scenario: concrete attack path
|
|
50
|
+
- Recommendation: fix with code example
|
|
51
|
+
|
|
52
|
+
## Reference files
|
|
53
|
+
|
|
54
|
+
- [Vuln categories](REFERENCE-vuln-categories.md) — detection guidance per vuln type
|
|
55
|
+
- [False positives](REFERENCE-false-positives.md) — hard exclusions + precedent
|
|
56
|
+
- [Confidence rubric](REFERENCE-confidence-rubric.md) — scoring methodology (0–10)
|
|
57
|
+
|
|
58
|
+
## Verify
|
|
59
|
+
|
|
60
|
+
```bash
|
|
61
|
+
test -d specs/security && echo "OK: specs/security/ exists" || mkdir -p specs/security
|
|
62
|
+
grep -q "Merge-base\|merge.base\|git diff" SKILL.md && echo "OK: git context verified"
|
|
63
|
+
```
|
package/skills-lock.json
CHANGED
|
@@ -23,7 +23,7 @@
|
|
|
23
23
|
},
|
|
24
24
|
"build-epic": {
|
|
25
25
|
"description": "Eight-step epic build cycle — reads state.yaml, execution-status.yaml, and one epic capsule; updates status via bp-yaml-set or direct edit. Resume mode runs one step per invocation. Use instead of ad-hoc execute-plan for release work.",
|
|
26
|
-
"sha256": "
|
|
26
|
+
"sha256": "565d8396889dd9c9",
|
|
27
27
|
"path": "build-epic/SKILL.md"
|
|
28
28
|
},
|
|
29
29
|
"change-request": {
|
|
@@ -256,6 +256,11 @@
|
|
|
256
256
|
"sha256": "34df830694a6459c",
|
|
257
257
|
"path": "search-skills/SKILL.md"
|
|
258
258
|
},
|
|
259
|
+
"security-review": {
|
|
260
|
+
"description": "> AI-powered security analysis of code changes — traces data flow, detects injection, auth bypass, secrets exposure, and unsafe deserialization across files. Use when reviewing pending changes, before release-branch, during verify-work Phase 5, during build-epic Step 0 threat modeling, or when the user says \"security review\" or \"scan for vulns\".",
|
|
261
|
+
"sha256": "24aeeed072282d0c",
|
|
262
|
+
"path": "security-review/SKILL.md"
|
|
263
|
+
},
|
|
259
264
|
"seed-conventions": {
|
|
260
265
|
"description": "Generate CLAUDE.md and CONVENTIONS.md for a brand-new project through a brief interview, and create the specs/ directory with evolved bigpowers structure (product/, tech-architecture/, verifications/, epics/archive/). Entry point for greenfield projects. Use when starting a new project from scratch, when user asks to set up AI agent conventions, or when there is no CLAUDE.md yet.",
|
|
261
266
|
"sha256": "cd3a7fc52d1b0035",
|