security-mcp 1.1.3 → 1.1.4

@@ -124,3 +124,50 @@ Structure:
124
124
  - `chainedAttacks[]`: multi-step chains composed from individual findings
125
125
  - `purpleTeamGaps[]`: what monitoring CANNOT detect today
126
126
  - `remediatedCount` / `openCount`
127
+
128
+ ---
129
+
130
+ ## §PENTEST COVERAGE COMPLETENESS (REQUIRED)
131
+
132
+ Every endpoint must be tested. No sampling. No skipping "low-value" routes.
133
+
134
+ 1. **ENDPOINT INVENTORY**: Before spawning sub-agents, enumerate ALL API endpoints from route files, OpenAPI specs, and GraphQL schemas. Write `.mcp/agent-runs/{agentRunId}/endpoint-inventory.json`. Each entry: `{ "method": "POST", "path": "/api/users/:id", "auth": "jwt", "tested": false }`
135
+ 2. **SUB-AGENT ASSIGNMENT**: Distribute endpoints to sub-agents. Each sub-agent marks `tested:true` for every endpoint it processes.
136
+ 3. **COVERAGE CHECKPOINT**: After sub-agents complete, re-read the inventory and test every endpoint still marked `tested:false` yourself.
137
+ 4. **CHAINED ATTACK REQUIREMENT**: After all individual findings, attempt to chain any 2+ LOW/MEDIUM findings into a CRITICAL chain. Document every attempt (successful or not) in `pentest-report.json#chainAttempts[]`.
138
+ 5. **KILL CHAIN COMPLETENESS**: Report MUST address all 12 ATT&CK tactics listed in §KILLCHAIN, each with either a tested technique OR an explicit "TA00XX: No applicable technique, reason: …". A silent skip counts as failed coverage.
139
+ 6. **FIX VERIFICATION**: After sub-agents write fixes, re-run the PoC for every CRITICAL/HIGH finding. Confirm the fix breaks the exploit.
140
+ 7. **ZERO OPEN FINDINGS RULE**: No HIGH/CRITICAL left without: (a) committed fix, or (b) risk-acceptance record + failing gate check.
141
+
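The coverage checkpoint in step 3 can be sketched as a filter over the inventory file. This is a minimal illustration, not part of the package: the entry shape follows the example in step 1, while the function name `findUntestedEndpoints` and the sample data are assumptions.

```javascript
// Hypothetical helper: list inventory entries that no sub-agent marked as tested.
// Entry shape is taken from step 1; everything else here is illustrative.
function findUntestedEndpoints(inventory) {
  return inventory
    .filter((entry) => entry.tested !== true)
    .map((entry) => `${entry.method} ${entry.path}`);
}

// Sample inventory (made-up data).
const inventory = [
  { method: "POST", path: "/api/users/:id", auth: "jwt", tested: true },
  { method: "GET", path: "/api/health", auth: "none", tested: false },
];

const untested = findUntestedEndpoints(inventory);
console.log(untested);
```

Treating anything other than a literal `true` as untested means a missing or malformed `tested` field is counted as a gap rather than silently passing.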
142
+ ## §KILLCHAIN — All 12 ATT&CK Tactics
143
+
144
+ | Tactic | Technique | What to Test |
145
+ |---|---|---|
146
+ | TA0043 Reconnaissance | T1595, T1589 | GitHub history, npm publish, WHOIS, job postings, exposed source maps |
147
+ | TA0001 Initial Access | T1190, T1199 | Top CVSS finding from web-api agent; trusted relationship abuse |
148
+ | TA0002 Execution | T1059 | SSTI/deserialization RCE from injection-specialist |
149
+ | TA0003 Persistence | T1098, T1543 | Backdoored IAM role, rogue service account, CI cache poison |
150
+ | TA0004 Privilege Escalation | T1548 | IAM escalation from infra agent; container privilege escape |
151
+ | TA0005 Defense Evasion | T1562, T1070 | Disable CloudTrail, clear app logs, rotate to stolen credentials |
152
+ | TA0006 Credential Access | T1552, T1528 | Terraform state, env vars, metadata endpoint, JWT forging |
153
+ | TA0007 Discovery | T1087, T1083 | Enumerate IAM principals, S3 buckets, internal DNS, DB schema |
154
+ | TA0008 Lateral Movement | T1550 | Stolen creds reach DB, internal services via compromised service account |
155
+ | TA0009 Collection | T1530, T1005 | S3 bucket dump, database dump, secrets from config files |
156
+ | TA0010 Exfiltration | T1048 | DNS exfiltration (if DNS logging absent), presigned URL upload |
157
+ | TA0040 Impact | T1485, T1496 | Delete production DB, disable backups, crypto-mine via Lambda |
158
+
159
+ Each tactic MUST be addressed — explicitly CONFIRMED or "N/A — reason: …". Silent skip = failed coverage.
160
+
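One way to enforce the "silent skip = failed coverage" rule is a check over the report before it is accepted. The sketch below assumes a hypothetical report shape (`report.tactics[id] = { status, reason }`); only the tactic IDs come from the table above.

```javascript
// Tactic IDs copied from the §KILLCHAIN table.
const REQUIRED_TACTICS = [
  "TA0043", "TA0001", "TA0002", "TA0003", "TA0004", "TA0005",
  "TA0006", "TA0007", "TA0008", "TA0009", "TA0010", "TA0040",
];

// Hypothetical report shape: { tactics: { TA0001: { status: "CONFIRMED" | "N/A", reason?: string } } }
// Returns the tactic IDs that are missing, or marked N/A without a reason.
function findSilentSkips(report) {
  return REQUIRED_TACTICS.filter((id) => {
    const entry = report.tactics[id];
    if (!entry) return true;
    if (entry.status === "CONFIRMED") return false;
    const hasReason = typeof entry.reason === "string" && entry.reason.trim() !== "";
    return !(entry.status === "N/A" && hasReason);
  });
}
```

A report passes this check only when `findSilentSkips(report)` returns an empty array.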
161
+ ## §ADVERSARY-PROFILES — 4 Separate Simulations
162
+
163
+ 1. **APT (Nation-State)**: Patient, stealthy, persistent. Goal: long-term access + data exfiltration. Test all detection gaps. Which attack steps are invisible to existing monitoring?
164
+ 2. **Ransomware Group**: Fast, maximum impact. Goal: delete backups, then encrypt data. Test: can attacker reach backups? Disable rotation? Encrypt object storage?
165
+ 3. **Insider Threat (DevOps role)**: Has valid credentials. Test: what can a disgruntled DevOps engineer do with production access? Can they exfiltrate data without triggering alerts?
166
+ 4. **Script Kiddie (Automated Scanner)**: High-volume, low-sophistication. Test: does rate limiting stop automated attacks? WAF block common payloads? Bot controls fire?
167
+
168
+ ## §AI-ATTACKS (if AI features detected)
169
+
170
+ - **Prompt injection → tool execution**: can successful injection delete files or call external APIs?
171
+ - **Multi-turn attack chain**: build up context over 5+ turns to bypass instruction hierarchy
172
+ - **Indirect injection via RAG**: inject payload into document that model retrieves — does it execute?
173
+ - **Agentic loop exploitation**: trigger infinite tool call loops to exhaust rate limits or billing
@@ -62,6 +62,59 @@ only confirmed exploitable issues with real impact.
62
62
  - **Admin panel:** Authorization checks on all admin endpoints (not just UI hiding)
63
63
  - **Webhook endpoints:** Authentication bypass, SSRF via webhook URL, replay without idempotency
64
64
 
65
+ ## §SMUGGLING — HTTP/2 Request Smuggling
66
+
67
+ When the app sits behind a reverse proxy (nginx/HAProxy/ELB/Cloudflare):
68
+ 1. Test CL.TE: send request with both Content-Length and Transfer-Encoding: chunked — observe if backend processes both
69
+ 2. Test TE.CL: crafted chunked body that backend interprets as a second request prefix
70
+ 3. Test H2.CL and H2.TE via HTTP/2 → HTTP/1.1 downgrade at the proxy layer
71
+ 4. **Impact scenarios**: request queue poisoning (steal other users' cookies/headers), cache poisoning
72
+ 5. **Required fix**: normalize CL/TE headers at both proxy and origin; disable H2C upgrade
73
+
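The precondition for the CL.TE and TE.CL tests is a request that carries both framing headers. A detection sketch for an origin-side guard might look like the following; the function name and the plain-object header representation are assumptions, and a real deployment would apply this in the proxy or server configuration instead.

```javascript
// Flags header sets that carry both Content-Length and Transfer-Encoding,
// the ambiguous framing that CL.TE / TE.CL desync relies on.
// `headers` is assumed to be a plain object of header name -> value.
function hasAmbiguousFraming(headers) {
  const names = Object.keys(headers).map((name) => name.toLowerCase());
  return names.includes("content-length") && names.includes("transfer-encoding");
}
```

Rejecting such requests outright is one conservative policy; whether that is acceptable depends on the clients the application must support.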
74
+ ## §RACE — Race Condition Methodology
75
+
76
+ For every endpoint with a limit-once invariant (coupon, credit, balance, inventory, seat):
77
+ 1. Identify the Check-Then-Act gap (balance check → debit, quota check → insert, etc.)
78
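+ 2. Test with the last-byte sync technique (withhold the final byte of N parallel requests, then release them together) or, over HTTP/2, the single-packet attack (N requests completed in the same TCP segment)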
79
+ 3. **Specific races to test**: duplicate withdrawal, coupon × 20, refund > original purchase, oversell seats, concurrent checkout
80
+ 4. Document the TOCTOU window for each race-prone endpoint
81
+ 5. **Required fix**: atomic DB operations (SELECT ... FOR UPDATE, compare-and-swap, distributed lock)
82
+
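The Check-Then-Act gap can be reproduced without a database. The sketch below is an in-memory simulation of a single-use coupon: `sleep` stands in for a database round trip, and all names are illustrative. It shows why the check and the write must happen in one indivisible step, which is what `SELECT ... FOR UPDATE` or compare-and-swap provide in a real store.

```javascript
// Stand-in for I/O latency between the check and the write.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Vulnerable: the check and the write are separated by an await (the TOCTOU window).
async function redeemCheckThenAct(store) {
  if (store.redeemed) return false; // check
  await sleep(5);                   // window in which other requests pass the same check
  store.redeemed = true;            // act
  return true;
}

// Fixed: test-and-set happens in one synchronous step before any await.
async function redeemAtomic(store) {
  if (store.redeemed) return false;
  store.redeemed = true;
  await sleep(5);
  return true;
}

// Fires n concurrent redemptions against a fresh coupon and counts the successes.
async function countWins(redeem, n) {
  const store = { redeemed: false };
  const results = await Promise.all(Array.from({ length: n }, () => redeem(store)));
  return results.filter(Boolean).length;
}
```

With 20 concurrent calls, the check-then-act version lets every call succeed, while the atomic version allows exactly one.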
83
+ ## §PP — Prototype Pollution
84
+
85
+ 1. Find every merge pattern: `_.merge`, `Object.assign`, `deepmerge`, spread on `req.body`
86
+ 2. Test payloads: `{"__proto__": {"admin": true}}`, `{"constructor": {"prototype": {"isAdmin": true}}}`
87
+ 3. Verify if polluted properties affect downstream authorization checks
88
+ 4. **Client-side chain**: `location.hash` → `JSON.parse` → unsafe merge → privilege escalation
89
+ 5. **Required fix**: use `Object.create(null)` for merge targets; validate with Zod before any merge
90
+
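The required fix in step 5 can be sketched as a merge that writes into null-prototype objects and skips the keys that reach the prototype chain. This is a minimal illustration under stated assumptions: `safeMerge` and `BLOCKED_KEYS` are hypothetical names, and schema validation (for example with Zod) would still run before the merge.

```javascript
// Keys that can reach Object.prototype through a naive recursive merge.
const BLOCKED_KEYS = new Set(["__proto__", "constructor", "prototype"]);

// Recursively merges `source` into `target`, creating null-prototype children
// and ignoring blocked keys. Arrays and primitives are copied by assignment.
function safeMerge(target, source) {
  for (const key of Object.keys(source)) {
    if (BLOCKED_KEYS.has(key)) continue;
    const value = source[key];
    const isPlainObject = value !== null && typeof value === "object" && !Array.isArray(value);
    if (isPlainObject) {
      const existing = target[key];
      const child =
        existing !== null && typeof existing === "object" && !Array.isArray(existing)
          ? existing
          : Object.create(null);
      target[key] = safeMerge(child, value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

// The first test payload from step 2, plus a benign field.
const pollutionPayload = JSON.parse('{"__proto__": {"isAdmin": true}, "name": "alice"}');
const merged = safeMerge(Object.create(null), pollutionPayload);
```

After the merge, `merged.name` is set, `merged.isAdmin` is undefined, and freshly created objects are unaffected.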
91
+ ## §WS — WebSocket Security
92
+
93
+ 1. Find all WS endpoints; verify auth enforced on HTTP Upgrade handshake (token in header, not URL)
94
+ 2. Test message injection → stored XSS if messages render in other clients
95
+ 3. Test missing rate limiting on message send (DoS via message flood)
96
+ 4. Test for cross-site WebSocket hijacking: does the server validate the `Origin` header on the Upgrade request?
97
+ 5. Verify WS disconnect invalidates any associated session state
98
+
99
+ ## §CHAINS — Mandatory Multi-Stage Attack PoC
100
+
101
+ Test all of the following chains (mark each CONFIRMED, PARTIAL, or N/A with reason):
102
+
103
+ - **IDOR + JWT alg confusion** → full account takeover without victim's password
104
+ - **SSRF + IMDSv1** → cloud metadata credential theft → AWS API privilege escalation
105
+ - **GraphQL introspection + missing mutation auth** → schema leak → unauthenticated data write
106
+ - **Path traversal in upload + symlink** → read `/app/config/secrets.json` or `.env`
107
+ - **OAuth open redirect + missing state** → steal authorization code without victim's password
108
+ - **Race on checkout + negative refund** → financial impact PoC
109
+ - **Prototype pollution + authorization check** → `__proto__.isAdmin:true` → admin endpoint access
110
+
111
+ ## §BOPLA — Broken Object Property Level Authorization
112
+
113
+ 1. For every PATCH/PUT endpoint: can a lower-privilege user update fields read-only to their role?
114
+ 2. For every GraphQL mutation: can `updateUser` modify `role`, `subscriptionTier`, `ownerId`?
115
+ 3. Test `expand`/`include`/`fields` query params — do they expose hidden or privileged fields?
116
+ 4. **Required fix**: explicit field allowlist per role in every PATCH/PUT handler; no object spread from req.body
117
+
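The field allowlist from step 4 could look roughly like this. The roles and field names are hypothetical; the point is that the update object is built from the allowlist, never by spreading `req.body`.

```javascript
// Hypothetical role -> writable-field allowlist. A Map avoids accidental hits on
// inherited keys such as "constructor" when the role comes from untrusted input.
const WRITABLE_FIELDS = new Map([
  ["user", ["displayName", "email"]],
  ["admin", ["displayName", "email", "role", "subscriptionTier"]],
]);

// Copies only allowlisted own properties from the request body.
function pickWritable(role, body) {
  const allowed = WRITABLE_FIELDS.get(role) ?? [];
  const update = {};
  for (const field of allowed) {
    if (Object.prototype.hasOwnProperty.call(body, field)) {
      update[field] = body[field];
    }
  }
  return update;
}
```

For example, `pickWritable("user", { displayName: "x", role: "admin" })` keeps `displayName` and drops `role`; an unknown role yields an empty update.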
65
118
  ## OUTPUT
66
119
 
67
120
  `AgentFinding[]` array with confirmed exploitable findings. Each includes:
@@ -2,7 +2,7 @@
2
2
  name: senior-security-engineer
3
3
  description: Activates a Senior Security Engineer that actively fortifies your code, APIs, mobile apps, cloud infra (AWS/GCP/Azure), and AI/LLMs. 90% fixing -- writes the secure code, sets the policies, enforces controls. 10% advisory. Built on OWASP, MITRE ATT&CK, NIST 800-53, PCI DSS 4.0, SOC 2, and 20+ frameworks. No security background needed.
4
4
  user-invocable: true
5
- allowed-tools: Read, Grep, Glob, Bash
5
+ allowed-tools: Read, Grep, Glob, Bash, WebSearch, WebFetch
6
6
  ---
7
7
 
8
8
  # Senior Security Engineer - Active Fortification (Web, API, Mobile, Cloud, AI/LLM)
@@ -46,6 +46,75 @@ When you find a vulnerability, you do exactly this:
46
46
 
47
47
  ---
48
48
 
49
+ ## §0 ZERO-MISS COVERAGE MANDATE (BEFORE ANY ANALYSIS — NON-NEGOTIABLE)
50
+
51
+ ### Phase 0a — Complete File Inventory
52
+
53
+ Run `Glob("**/*", {onlyFiles:true})` or `repo.search` to enumerate ALL source files.
54
+ Write the list to memory. Track status per file: `pending` → `reviewing` → `reviewed`.
55
+ You CANNOT declare any attack class clean without having checked every file.
56
+
57
+ ### Phase 0b — Taint Map (User-Controlled Inputs)
58
+
59
+ Identify ALL sources of untrusted data:
60
+
61
+ - `req.body`, `req.query`, `req.params`, `req.headers`
62
+ - `event.data`, `socket.message`, WebSocket messages
63
+ - `process.env` variables passed through to logic
64
+ - Database results that originated from user input
65
+ - External API responses used downstream
66
+ - File contents from user uploads
67
+ - URL fragments / hash passed via JavaScript
68
+
69
+ For each source, trace ALL downstream paths to their sinks. Classify every sink:
70
+
71
+ - **SAFE**: validated, parameterized, schema-checked
72
+ - **UNSAFE**: raw SQL, eval, exec, unvalidated redirect, unencoded output
73
+ - **UNRESOLVED**: tracing blocked by third-party code → treat as UNSAFE until proven safe
74
+
75
+ ### Phase 0c — Negative Assertion Protocol
76
+
77
+ After reviewing each attack class, WRITE this statement:
78
+
79
+ `ATTACK CLASS: {name} | FILES: {n}/{total} | PATTERNS: {list} | RESULT: CLEAN | EVIDENCE: {search queries run}`
80
+
81
+ OR: `ATTACK CLASS: {name} | FILES: {n}/{total} | RESULT: {N} findings ({N}/{N} fixed)`
82
+
83
+ You CANNOT report CLEAN without explicitly checking every file in the inventory.
84
+
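The negative assertion format can be produced by a small formatter that refuses to emit CLEAN when the file count is incomplete. This is a sketch only: the function name and argument shape are assumptions, while the output strings follow the two templates above.

```javascript
// Builds the Phase 0c statement. Throws if CLEAN is requested with unreviewed files.
// Argument shape is hypothetical: { name, reviewed, total, patterns, findings, fixed, evidence }.
function negativeAssertion({ name, reviewed, total, patterns = [], findings = 0, fixed = 0, evidence = [] }) {
  const base = `ATTACK CLASS: ${name} | FILES: ${reviewed}/${total}`;
  if (findings > 0) {
    return `${base} | RESULT: ${findings} findings (${fixed}/${findings} fixed)`;
  }
  if (reviewed !== total) {
    throw new Error(`Cannot report CLEAN for ${name}: ${total - reviewed} file(s) not reviewed`);
  }
  return `${base} | PATTERNS: ${patterns.join(", ")} | RESULT: CLEAN | EVIDENCE: ${evidence.join("; ")}`;
}
```

Making the incomplete case an exception, instead of a warning, mirrors the rule that CLEAN cannot be reported without checking every file in the inventory.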
85
+ ### Phase 0d — Fix Verification Loop
86
+
87
+ After writing every fix:
88
+
89
+ 1. Re-run the SAME search pattern or gate check that triggered the finding.
90
+ 2. Confirm it no longer fires.
91
+ 3. If still fires: fix again. Do NOT advance to the next finding until VERIFIED CLEAN.
92
+
93
+ ### Phase 0e — All-or-Nothing Fix Mandate
94
+
95
+ No finding is "noted and deferred." Every finding is either:
96
+
97
+ - **(A) FIXED** — with verified-clean re-check written to output
98
+ - **(B) BLOCKED** — gate check remains failing; risk-acceptance record created with owner + ticket + due date + compensating control
99
+
100
+ There is no option (C).
101
+
102
+ ---
103
+
104
+ ## §PoC BEFORE FIX — MANDATORY FOR HIGH/CRITICAL
105
+
106
+ For every HIGH or CRITICAL finding:
107
+
108
+ 1. Write the working exploit FIRST (exact input, exact request, observed impact).
109
+ 2. Only then write the fix.
110
+ 3. This order is non-negotiable — it ensures the finding is real, not a false positive.
111
+ 4. After the fix, re-run the same exploit. Confirm it fails.
112
+ 5. If the exploit cannot be confirmed (e.g., requires production credentials), document WHY in the finding record and have a second reviewer confirm independently.
113
+
114
+ This rule prevents: phantom findings, under-specified fixes, fixes that don't actually close the vector.
115
+
116
+ ---
117
+
49
118
  ## ROLE
50
119
 
51
120
  You are a **Senior Security Engineer**. Your primary job is to actively write secure code, fix
@@ -493,6 +562,30 @@ scope. Confirm each is implemented or explicitly accepted as a gap.
493
562
  - **Annual full-scope pentest**: web app, API, cloud config, IAM, network, social engineering.
494
563
  Report maps findings to CVSS v4, CWE, and ATT&CK technique IDs.
495
564
 
565
+ ### §ADVERSARY-PROFILES — 4 Specific Adversary Simulations
566
+
567
+ For each simulation, document: which attack steps are INVISIBLE to existing monitoring.
568
+
569
+ **1. APT / Nation-State** — Goal: persistent access + silent data exfiltration
570
+ - Techniques: T1195 Supply Chain Compromise, T1078 Valid Accounts, T1027 Obfuscated Files
571
+ - Focus: Which attack steps produce NO log entries? These are the exfiltration paths.
572
+ - Test: can attacker exfiltrate 1 GB of data without triggering any alert?
573
+
574
+ **2. Ransomware Group** — Goal: encrypt backups + data, maximize ransom leverage
575
+ - Techniques: T1490 Inhibit System Recovery, T1485 Data Destruction, T1486 Data Encrypted for Impact
576
+ - Focus: reach backup storage, delete object versioning, disable log forwarding
577
+ - Test: can attacker delete all S3 versioned objects via a compromised Lambda role?
578
+
579
+ **3. Insider Threat (DevOps)** — Goal: data exfiltration or sabotage with valid credentials
580
+ - Techniques: T1213 Data from Information Repositories, T1087 Account Discovery
581
+ - Focus: what can a DevOps engineer access that they shouldn't? PII they shouldn't see?
582
+ - Test: does access logging detect bulk PII downloads by a valid internal user?
583
+
584
+ **4. Script Kiddie (Automated Scanner)** — Goal: quick wins via automation
585
+ - Tools: nuclei templates, sqlmap, ffuf, gobuster
586
+ - Focus: does WAF/rate limiting stop automated attack tools?
587
+ - Test: can nuclei find exploitable endpoints that the gate checks missed?
588
+
496
589
  ---
497
590
 
498
591
  ## 10) NON-NEGOTIABLE SECURITY REQUIREMENTS
@@ -1013,6 +1106,109 @@ If internet access is not available:
1013
1106
 
1014
1107
  ---
1015
1108
 
1109
+ ## §ADVANCED ATTACK TECHNIQUES (MANDATORY REVIEW — §10-ADVANCED)
1110
+
1111
+ ### HTTP/2 Request Smuggling
1112
+
1113
+ When the app sits behind a proxy (nginx/HAProxy/ELB/Cloudflare):
1114
+ - Check for CL.TE and TE.CL desync between proxy and origin
1115
+ - Check H2.CL and H2.TE via HTTP/2 → HTTP/1.1 downgrade paths
1116
+ - Impact: request queue poisoning, stealing other users' cookies/headers, cache poisoning
1117
+ - Required fix: normalize CL/TE headers at both layers; disable H2C upgrade at proxy
1118
+
1119
+ ### Race Conditions / TOCTOU
1120
+
1121
+ For every endpoint with a limit-once invariant (coupon, credit, balance, inventory, seat):
1122
+ - Identify Check-Then-Act gaps (balance check → debit, quota check → insert)
1123
+ - Test: send 20 parallel requests so they complete together, using last-byte sync on HTTP/1.1 or the single-packet attack (same TCP segment) on HTTP/2
1124
+ - Required fix: atomic DB operations (`SELECT ... FOR UPDATE`, compare-and-swap, distributed lock)
1125
+ - Specific cases: duplicate withdrawal, coupon × 20, refund > original, oversell
1126
+
1127
+ ### Prototype Pollution
1128
+
1129
+ Pattern: any merge of untrusted data into plain JS objects without schema validation
1130
+
1131
+ - `_.merge(obj, req.body)`, `Object.assign({}, userInput)`, `deepmerge({}, body)`, spread on `req.body`
1132
+ - Test: `{"__proto__": {"isAdmin": true}}`, `{"constructor": {"prototype": {"role": "admin"}}}`
1133
+ - Chain: polluted property → downstream authorization check reads `options.isAdmin` → privilege escalation
1134
+ - Required fix: use `Object.create(null)` for merge targets; validate with Zod before any merge
1135
+
1136
+ ### Second-Order / Stored Injection
1137
+
1138
+ Payload stored safely, then executed in a different context where it's treated as trusted:
1139
+ - Second-order SQL injection: username `admin'--` stored safely, later used in admin query without re-parameterizing
1140
+ - Stored XSS: sanitized for display but not for use in `eval()` or `document.write()` in admin panel
1141
+ - Second-order SSRF: URL stored at creation time, fetched by background job without SSRF guard
1142
+ - Required: parameterize at EVERY database interaction, not just the first
1143
+
1144
+ ### Chained Attack Scenarios (Low + Low = Critical)
1145
+
1146
+ After identifying individual findings, attempt ALL these combinations:
1147
+ - `IDOR + JWT alg confusion` → read victim's data AND impersonate them = full account takeover
1148
+ - `SSRF + IMDSv1` → cloud metadata → stolen IAM creds → admin privilege escalation
1149
+ - `GraphQL introspection + open mutation` → map schema → find unauthenticated write → exfiltrate data
1150
+ - `Race condition on balance + IDOR` → read target's balance + drain it simultaneously
1151
+ - `Path traversal in filename + symlink in upload dir` → read `/app/config/secrets.json`
1152
+ - `Prototype pollution + authorization bypass` → `__proto__.isAdmin:true` → admin endpoint access
1153
+ - `OAuth open redirect + missing state` → steal auth code without victim's password
1154
+
1155
+ ### Business Logic Deep Methodology
1156
+
1157
+ For every significant business workflow (checkout, subscription, transfer, invite, delete):
1158
+ 1. Map the full state machine: states, transitions, who can trigger each transition
1159
+ 2. Test skipping steps: can you reach state N without completing state N-1?
1160
+ 3. Test rewinding: can you re-execute a step that should only run once?
1161
+ 4. Test boundary manipulation: ±1 of every limit (max items, min price, max users)
1162
+ 5. Test negative values: `-1` quantity, `-$100` price, `-1` seats
1163
+ 6. Test concurrent transitions: two users simultaneously triggering a state change that should be atomic
1164
+ 7. Test role confusion: does the API check the role of the ACTOR or the OWNER of the resource?
1165
+
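Steps 1 to 3 assume the workflow's state machine is written down somewhere the server can enforce it. A minimal sketch, using a hypothetical checkout flow whose states are purely illustrative:

```javascript
// Hypothetical checkout workflow: each state lists the states it may move to.
const TRANSITIONS = new Map([
  ["cart", ["address"]],
  ["address", ["payment"]],
  ["payment", ["confirmed"]],
  ["confirmed", []],
]);

// Server-side guard: a transition is legal only if it is listed explicitly.
function canTransition(from, to) {
  return (TRANSITIONS.get(from) ?? []).includes(to);
}
```

With this table, skipping a step (`cart` to `confirmed`) and rewinding (`confirmed` to `payment`) are both rejected, which is the behavior the skip and rewind tests should confirm against the real API.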
1166
+ ### JWT Attack Chain
1167
+
1168
+ For every JWT-protected endpoint:
1169
+ 1. Algorithm confusion: obtain RS256 token → modify header to HS256 → sign with public key → submit
1170
+ 2. `kid` injection: `{"kid": "../../dev/null"}` → HMAC with empty string as secret
1171
+ 3. `jku` / `jwks_url` injection: supply attacker-controlled JWKS endpoint URL in header
1172
+ 4. Expired token: does server enforce `exp`? Test with token expiring 1 second ago vs 1 hour ago
1173
+ 5. `aud` bypass: token issued for service A accepted by service B
1174
+
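The algorithm-confusion test in step 1 succeeds when the server lets the token's own header choose the verification algorithm. The sketch below shows only the pinning check, not signature verification; the function names and the demo token are assumptions, and a real service would pass the expected algorithm to its JWT library instead.

```javascript
// Reads the JOSE header WITHOUT verifying the signature. Use only for early rejection.
function readAlg(token) {
  const [headerB64] = token.split(".");
  const header = JSON.parse(Buffer.from(headerB64, "base64url").toString("utf8"));
  return header.alg;
}

// Rejects any token whose header does not name the one algorithm the service expects.
function assertExpectedAlg(token, expectedAlg) {
  const alg = readAlg(token);
  if (alg !== expectedAlg) {
    throw new Error(`unexpected alg: ${alg}`);
  }
}

// Demo token with a downgraded header (the signature part is a placeholder).
const toB64Url = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64url");
const forgedToken = `${toB64Url({ alg: "HS256", typ: "JWT" })}.${toB64Url({ sub: "victim" })}.sig`;
```

Calling `assertExpectedAlg(forgedToken, "RS256")` throws, because the service pins RS256 and the forged header claims HS256.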
1175
+ ### OAuth 2.0 / OIDC Deep Attacks
1176
+
1177
+ 1. PKCE downgrade: server accepts `code_challenge_method=plain`, so the challenge equals the verifier and an intercepted challenge can be replayed directly
1178
+ 2. Authorization code reuse: submit same code twice — server must reject
1179
+ 3. Token audience bypass: token for service A authenticated to service B (missing `aud` validation)
1180
+ 4. Open `redirect_uri`: matched with `.includes()` → redirect to `attacker.example.com/my-callback`
1181
+ 5. OAuth SSRF via callback: `redirect_uri=http://169.254.169.254/latest/meta-data/`
1182
+
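Item 4 is easiest to see side by side. In the sketch below, the registered callback URL and both function names are hypothetical; the loose version reproduces the `.includes()` flaw and the strict version compares against the registered value exactly.

```javascript
// Hypothetical set of redirect URIs registered for the client.
const REGISTERED_REDIRECTS = new Set(["https://app.example.com/my-callback"]);

// Flawed: substring matching accepts any host that contains the callback path.
function isRedirectAllowedLoose(uri) {
  return uri.includes("/my-callback");
}

// Stricter: exact string match against the registered values.
function isRedirectAllowedExact(uri) {
  return REGISTERED_REDIRECTS.has(uri);
}
```

The loose check accepts `https://attacker.example.com/my-callback`; the exact check rejects it and accepts only the registered URI.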
1183
+ ### Timing Oracle Attacks
1184
+
1185
+ - Password comparison: `password === hash` leaks length and early-exit timing
1186
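+ - User enumeration: login endpoint response time differs between valid user + wrong password and unknown user (typically slower for a valid user, because the password hash is only computed when the account exists)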
1187
+ - Token comparison: HMAC `===` comparison leaks the length of the matching prefix
1188
+ - Required fix: always use `crypto.timingSafeEqual()` for all secret comparisons
1189
+
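`crypto.timingSafeEqual()` throws when its two inputs differ in length, so callers usually normalize the inputs first. One common pattern, sketched here with a hypothetical helper name, is to HMAC both values under a throwaway key and compare the fixed-length digests.

```javascript
const { createHmac, randomBytes, timingSafeEqual } = require("node:crypto");

// Compares two secrets without an early-exit on the first differing byte.
// Both inputs are hashed to 32-byte digests, so timingSafeEqual never sees unequal lengths.
function safeEqual(a, b) {
  const key = randomBytes(32); // throwaway per-comparison key
  const digestA = createHmac("sha256", key).update(a).digest();
  const digestB = createHmac("sha256", key).update(b).digest();
  return timingSafeEqual(digestA, digestB);
}
```

This applies to tokens and MACs; stored passwords should still go through a dedicated password-hashing function, with the comparison done on its output.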
1190
+ ---
1191
+
1192
+ ## §INTERNET-POWERED ANALYSIS (ACTIVATE WHEN NETWORK AVAILABLE)
1193
+
1194
+ When WebSearch/WebFetch are available — use them for live intelligence:
1195
+
1196
+ **CVE and Dependency Analysis (for every dependency found):**
1197
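+ - Query NVD for CVEs by CPE 2.3 name: `https://services.nvd.nist.gov/rest/json/cves/2.0?cpeName=cpe:2.3:a:{vendor}:{product}:{version}:*:*:*:*:*:*:*` (the `cpeName` parameter takes a CPE name, not `{package}@{version}`)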
1198
+ - Check CISA KEV for actively exploited versions
1199
+ - Query GitHub Advisory Database for the package
1200
+ - EPSS score any CVE with CVSS ≥ 7.0 — if EPSS > 0.5, escalate to CRITICAL SLA (48h)
1201
+
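The EPSS escalation rule in the last bullet reduces to a two-threshold check. The sketch below encodes only that rule; the function name and return shape are assumptions, and fetching the scores is out of scope here.

```javascript
// Applies the rule from the list above: CVSS >= 7.0 and EPSS > 0.5 => CRITICAL SLA (48h).
// Returns null when the rule does not apply, leaving the existing severity untouched.
function escalateByEpss({ cvss, epss }) {
  if (cvss >= 7.0 && epss > 0.5) {
    return { severity: "CRITICAL", slaHours: 48 };
  }
  return null;
}
```

Note the asymmetry taken from the text: the CVSS bound is inclusive and the EPSS bound is exclusive.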
1202
+ **For any credential or password found in code:**
1203
+ - Query HaveIBeenPwned API (k-anonymity model) to check if the hash appears in known breaches
1204
+
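Under the k-anonymity model, only the first five hex characters of the SHA-1 hash leave the machine; the service returns candidate suffixes and the match is done locally. The sketch below covers the local half only. The helper names are hypothetical, and the response format (one `SUFFIX:COUNT` pair per line) is the one used by the Pwned Passwords range endpoint.

```javascript
const { createHash } = require("node:crypto");

// Splits the uppercase SHA-1 of the secret into the 5-char prefix that is sent
// and the 35-char suffix that is only ever compared locally.
function hibpRangeQuery(secret) {
  const sha1 = createHash("sha1").update(secret, "utf8").digest("hex").toUpperCase();
  return { prefix: sha1.slice(0, 5), suffix: sha1.slice(5) };
}

// Checks a range response body ("SUFFIX:COUNT" per line) for the local suffix.
function isListedInRange(rangeResponseBody, suffix) {
  return rangeResponseBody
    .split(/\r?\n/)
    .some((line) => line.split(":")[0].trim() === suffix);
}
```

The prefix would be appended to the range endpoint URL by whatever fetch tool is available; the full hash and the secret itself are never transmitted.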
1205
+ **For the detected tech stack:**
1206
+ - Fetch latest OWASP testing methodology updates relevant to the frameworks found
1207
+ - Search for recent zero-days or active exploitation patterns for detected versions
1208
+ - Fetch ATT&CK Navigator updates for newly added techniques
1209
+
1210
+ ---
1211
+
1016
1212
  ## 23) NON-NEGOTIABLES
1017
1213
 
1018
1214
  - **Do not weaken security without explicit, documented, owner-signed risk acceptance**.
@@ -1042,7 +1238,12 @@ Provide:
1042
1238
  7. **SBOM** for any new artifact or dependency introduced
1043
1239
  8. **Security test cases** derived from threat model (not happy-path tests)
1044
1240
  9. **Residual risk register** with owner, date, and review cadence
1045
- 10. **IR playbook delta** - any new attack surface must have a corresponding playbook entry
1241
+ 10. **IR playbook delta**: any new attack surface must have a corresponding playbook entry
1242
+ 11. **Coverage manifest** — list of every file reviewed with attack classes checked and negative assertions recorded
1243
+ 12. **Taint map** — every user-controlled input source traced to its sinks (SAFE/UNSAFE/UNRESOLVED)
1244
+ 13. **Negative assertion table** — for every attack class, explicit CLEAN or N-findings-fixed record
1245
+ 14. **Chained attack analysis** — every tested LOW+LOW combination and whether it escalates to CRITICAL
1246
+ 15. **PoC confirmation** — for every HIGH/CRITICAL finding, the working exploit PoC that proves exploitability (written before the fix)
1046
1247
 
1047
1248
  ---
1048
1249