npm - security-mcp - Versions diffs - 1.1.3 → 1.3.1 - Mend

security-mcp 1.1.3 → 1.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (133)

package/README.md +164 -185
package/defaults/checklists/ai.json +20 -1
package/defaults/checklists/api.json +35 -1
package/defaults/checklists/infra.json +34 -1
package/defaults/checklists/mobile.json +23 -1
package/defaults/checklists/payments.json +15 -1
package/defaults/checklists/web.json +11 -1
package/defaults/control-catalog.json +200 -0
package/defaults/security-policy.json +2 -2
package/dist/cli/index.js +82 -5
package/dist/cli/install.js +36 -6
package/dist/cli/onboarding.js +6 -0
package/dist/gate/baseline.js +82 -7
package/dist/gate/catalog.js +10 -2
package/dist/gate/checks/ai.js +757 -39
package/dist/gate/checks/auth-deep.js +935 -0
package/dist/gate/checks/business-logic.js +751 -0
package/dist/gate/checks/ci-pipeline.js +399 -4
package/dist/gate/checks/crypto.js +423 -2
package/dist/gate/checks/dependencies.js +571 -15
package/dist/gate/checks/graphql.js +201 -19
package/dist/gate/checks/infra.js +246 -1
package/dist/gate/checks/injection-deep.js +848 -0
package/dist/gate/checks/k8s.js +114 -1
package/dist/gate/checks/mobile-android.js +917 -3
package/dist/gate/checks/mobile-ios.js +797 -5
package/dist/gate/checks/required-artifacts.js +194 -0
package/dist/gate/checks/runtime.js +178 -0
package/dist/gate/checks/secrets.js +244 -13
package/dist/gate/checks/supply-chain-deep.js +787 -0
package/dist/gate/checks/web-nextjs.js +572 -48
package/dist/gate/diff.js +17 -5
package/dist/gate/evidence.js +8 -1
package/dist/gate/exceptions.js +131 -9
package/dist/gate/policy.js +282 -129
package/dist/mcp/audit-chain.js +122 -28
package/dist/mcp/auth.js +169 -0
package/dist/mcp/learning.js +129 -4
package/dist/mcp/model-router.js +158 -21
package/dist/mcp/orchestration.js +186 -51
package/dist/mcp/server.js +608 -94
package/dist/repo/fs.js +24 -1
package/dist/repo/search.js +31 -6
package/dist/review/store.js +52 -1
package/package.json +7 -7
package/prompts/SECURITY_PROMPT.md +73 -0
package/skills/_TEMPLATE/SKILL.md +99 -0
package/skills/advanced-dos-tester/SKILL.md +109 -0
package/skills/agentic-loop-exploiter/SKILL.md +368 -0
package/skills/ai-llm-redteam/SKILL.md +104 -0
package/skills/ai-model-supply-chain-agent/SKILL.md +103 -0
package/skills/algorithm-implementation-reviewer/SKILL.md +98 -0
package/skills/android-penetration-tester/SKILL.md +455 -46
package/skills/anti-replay-tester/SKILL.md +106 -0
package/skills/appsec-code-auditor/SKILL.md +120 -0
package/skills/artifact-integrity-analyst/SKILL.md +441 -0
package/skills/attack-navigator/SKILL.md +467 -8
package/skills/auth-session-hacker/SKILL.md +128 -0
package/skills/aws-penetration-tester/SKILL.md +456 -0
package/skills/azure-penetration-tester/SKILL.md +490 -3
package/skills/binary-auth-validator/SKILL.md +111 -0
package/skills/bot-detection-specialist/SKILL.md +109 -0
package/skills/business-logic-attacker/SKILL.md +231 -0
package/skills/capec-code-mapper/SKILL.md +84 -0
package/skills/cert-pin-rotation-specialist/SKILL.md +112 -0
package/skills/cicd-pipeline-hijacker/SKILL.md +405 -0
package/skills/ciso-orchestrator/SKILL.md +454 -43
package/skills/cloud-infra-specialist/SKILL.md +118 -0
package/skills/compliance-gap-analyst/SKILL.md +422 -0
package/skills/compliance-grc/SKILL.md +85 -0
package/skills/compliance-lifecycle-tracker/SKILL.md +84 -0
package/skills/credential-stuffing-specialist/SKILL.md +102 -0
package/skills/crypto-pki-specialist/SKILL.md +87 -0
package/skills/csa-ccm-mapper/SKILL.md +84 -0
package/skills/csf2-governance-mapper/SKILL.md +84 -0
package/skills/deep-link-fuzzer/SKILL.md +109 -0
package/skills/dependency-confusion-attacker/SKILL.md +415 -0
package/skills/device-integrity-aggregator/SKILL.md +108 -0
package/skills/dos-resilience-tester/SKILL.md +97 -0
package/skills/dread-scorer/SKILL.md +84 -0
package/skills/egress-policy-enforcer/SKILL.md +99 -0
package/skills/evidence-collector/SKILL.md +98 -0
package/skills/file-upload-attacker/SKILL.md +109 -0
package/skills/gcp-penetration-tester/SKILL.md +459 -2
package/skills/git-history-secret-scanner/SKILL.md +106 -0
package/skills/iam-privesc-graph-builder/SKILL.md +152 -0
package/skills/incident-responder/SKILL.md +111 -0
package/skills/injection-specialist/SKILL.md +131 -0
package/skills/ios-security-auditor/SKILL.md +282 -0
package/skills/json-ambiguity-tester/SKILL.md +0 -0
package/skills/k8s-container-escaper/SKILL.md +384 -0
package/skills/key-management-lifecycle-analyst/SKILL.md +98 -0
package/skills/kill-switch-engineer/SKILL.md +102 -0
package/skills/linddun-privacy-analyst/SKILL.md +102 -0
package/skills/logic-race-fuzzer/SKILL.md +443 -0
package/skills/mobile-api-network-attacker/SKILL.md +421 -0
package/skills/mobile-binary-hardener/SKILL.md +102 -0
package/skills/mobile-security-specialist/SKILL.md +85 -0
package/skills/mobile-webview-auditor/SKILL.md +96 -0
package/skills/model-extraction-attacker/SKILL.md +219 -0
package/skills/multipart-abuse-tester/SKILL.md +84 -0
package/skills/oauth-pkce-specialist/SKILL.md +104 -0
package/skills/parser-exhaustion-tester/SKILL.md +142 -0
package/skills/pentest-infra/SKILL.md +141 -0
package/skills/pentest-social/SKILL.md +201 -0
package/skills/pentest-team/SKILL.md +134 -0
package/skills/pentest-web-api/SKILL.md +151 -0
package/skills/privacy-flow-analyst/SKILL.md +234 -0
package/skills/prompt-injection-specialist/SKILL.md +394 -0
package/skills/quantum-migration-planner/SKILL.md +96 -0
package/skills/rag-poisoning-specialist/SKILL.md +358 -0
package/skills/registry-mirror-enforcer/SKILL.md +84 -0
package/skills/rotation-validation-agent/SKILL.md +112 -0
package/skills/samm-assessor/SKILL.md +85 -0
package/skills/secrets-mask-bypass-tester/SKILL.md +100 -0
package/skills/senior-security-engineer/SKILL.md +370 -2
package/skills/serialization-memory-attacker/SKILL.md +332 -0
package/skills/session-timeout-tester/SKILL.md +161 -0
package/skills/slsa-level3-enforcer/SKILL.md +112 -0
package/skills/slsa-provenance-enforcer/SKILL.md +102 -0
package/skills/ssrf-detection-validator/SKILL.md +108 -0
package/skills/step-up-auth-enforcer/SKILL.md +84 -0
package/skills/stride-pasta-analyst/SKILL.md +420 -0
package/skills/supply-chain-devsecops/SKILL.md +98 -0
package/skills/threat-infrastructure-analyst/SKILL.md +84 -0
package/skills/threat-modeler/SKILL.md +85 -0
package/skills/tls-certificate-auditor/SKILL.md +573 -18
package/skills/token-reuse-detector/SKILL.md +95 -0
package/skills/trike-risk-modeler/SKILL.md +84 -0
package/skills/unicode-homograph-tester/SKILL.md +84 -0
package/skills/waf-rule-lifecycle-agent/SKILL.md +97 -0
package/skills/webhook-security-tester/SKILL.md +102 -0
package/skills/zero-trust-architect/SKILL.md +109 -0

package/skills/bot-detection-specialist/SKILL.md CHANGED

@@ -219,3 +219,112 @@ If internet permitted:
 - `requiredActions`: ordered action list
 - `complianceImpact`: framework mappings
 - `beyondSkillMd`: true if finding goes beyond the SKILL.md mandate
+- `intelligenceForOtherAgents`: cross-agent intelligence package (see schema below)
+Every findings JSON MUST include `intelligenceForOtherAgents`:
+```json
+{
+  "intelligenceForOtherAgents": {
+    "forPentestTeam": [{ "type": "HIGH_VALUE_TARGET", "description": "Unprotected login endpoint with no bot mitigation — ideal credential-stuffing target", "exploitHint": "Use Hydra or Sentry MBA with residential proxies; no CAPTCHA barrier" }],
+    "forCryptoSpecialist": [{ "type": "CRYPTO_WEAKNESS_REFERENCE", "algorithm": "HMAC-SHA1 used in legacy CAPTCHA token validation", "location": "src/middleware/captcha.ts" }],
+    "forCloudSpecialist": [{ "type": "SSRF_TO_CLOUD_CHAIN", "ssrfLocation": "IP reputation check calls external provider with user-supplied URL", "escalationPath": "Redirect to 169.254.169.254 to leak cloud metadata" }],
+    "forComplianceGrc": [{ "type": "COMPLIANCE_BLOCKER", "frameworks": ["PCI DSS Req 8.3.4", "SOC 2 CC6.6"], "releaseBlock": true }]
+  }
+}
+```
+---
+## BEYOND SKILL.MD — MANDATORY EXPANSIONS
+- **AI-Powered CAPTCHA Solving via Multimodal LLMs (ATT&CK T1110.001):** GPT-4o and Gemini 1.5 Pro achieve >95% solve rates on reCAPTCHA v2 image challenges and >85% on hCaptcha grids as documented in the 2024 UC San Diego paper "An LLM-Powered Autonomous Agent for CAPTCHA Solving." Test by: submit 100 reCAPTCHA v2 image tokens solved via the OpenAI vision API to your login endpoint's CAPTCHA validation route; measure acceptance rate. Finding threshold: >10% acceptance rate with LLM-solved tokens = CAPTCHA layer is effectively defeated; migrate to behavior-only challenges (Turnstile invisible, PoW).
+- **Puppeteer-Extra Stealth Plugin Evasion of `navigator.webdriver` Detection (ATT&CK T1036.005):** The `puppeteer-extra-plugin-stealth` library (npm, 500K+ weekly downloads) patches 11 browser automation signals: `navigator.webdriver`, `window.chrome`, Canvas fingerprint randomization, WebGL vendor spoofing, and `Permissions` API behavior. Standard UA-based and `webdriver` flag checks are completely blind to it. Test by: run `puppeteer-extra` with stealth plugin against your `/api/login` endpoint and confirm bot detection fires on behavioral signals (inter-keystroke timing entropy <0.3, mouse movement linearity >0.95) rather than any header or DOM property. Finding threshold: if bot detection relies solely on `navigator.webdriver` or UA string matching = HIGH finding; requires JS challenge upgrade.
+- **JA3/JA4 TLS Fingerprint Mismatch for Headless Client Detection (Research: Salesforce JA3 2017, FoxIO JA4 2023):** Automated HTTP clients (`curl`, `python-requests`, Go `net/http`, Node `undici`) produce TLS ClientHello JA3 hashes distinct from real browser JA3 hashes, even when User-Agent is spoofed to match Chrome 120. JA4 (John Althouse, FoxIO, 2023) adds ALPN and SNI indicators and sorts cipher suites and extensions before hashing, so the fingerprint stays stable when a client randomizes extension order. Test by: capture TLS ClientHello packets via `tcpdump` or Cloudflare JA3 logs during simulated bot traffic; compare hashes against a JA3 browser baseline database (`https://ja3er.com`). Finding threshold: if your WAF/edge does not propagate `cf-ja3-fingerprint` (Cloudflare) or equivalent header into the application for bot scoring = MEDIUM gap; implement Cloudflare WAF custom rule to block known bot JA3 hashes and inject fingerprint header.
+- **Credential Stuffing via Residential Proxy Pool with Per-Account Velocity Evasion (ATT&CK T1110.004 / Okta credential-stuffing advisory, April 2024):** In April 2024 Okta warned of a surge in credential stuffing routed through residential proxy networks. Rotating source IPs so that each IP makes <3 requests bypasses per-IP rate limits, and a high per-account lockout threshold (for example, 10 attempts) leaves the remaining control ineffective. Test by: using `mitmproxy` + a list of 500 distinct IP headers (`X-Forwarded-For`), submit authentication requests against 50 test accounts at a rate of 2 attempts per IP per account; confirm that cross-account velocity detection (same ASN cluster, same device fingerprint, distributed failed auth) triggers an alert within 15 minutes. Finding threshold: no cross-account velocity alert within 30 minutes of the simulated pattern = CRITICAL; implement sliding-window cross-account anomaly detection keyed on `(ASN, device_fingerprint, failed_auth_count)`.
+- **CAPTCHA Farm Token Replay and Timing-Based Detection (ATT&CK T1111 / 2captcha, CapMonster supply chain risk):** CAPTCHA solving farms (2captcha, CapMonster, Anti-Captcha) return human-solved tokens with a characteristic latency band of 15–45 seconds. Tokens from farms are valid per the CAPTCHA provider's API but are often shared/replayed if the application does not enforce single-use binding to `(session_id, action, timestamp)`. Supply chain risk: CapMonster distributes a browser extension used by end users — if compromised, it could silently exfiltrate valid CAPTCHA tokens. Test by: (1) solve a Turnstile token once, then replay it in 10 subsequent requests within 60 seconds — confirm each replay is rejected; (2) submit tokens with a `solved_in` timestamp of exactly 18 seconds (farm median) across 20 accounts — confirm timing anomaly detection fires. Finding threshold: token accepted more than once = CRITICAL; no timing anomaly detection for farm-latency-band solves = MEDIUM.
+- **EU AI Act Article 50 Transparency Obligation for Bot Scoring Systems (Regulatory, applicable from August 2026):** Behavioral bot-scoring systems that make consequential automated decisions (account suspension, access denial, payment blocking) may qualify as AI systems under the EU AI Act Article 3(1) definition and require transparency disclosures under Article 50 (numbered Article 52 in the original proposal) if they process EU resident data. The Act's high-risk AI provisions apply from August 2026. Test by: classify your bot-scoring pipeline against AI Act Annex III criteria; if it gates access to essential services (financial, employment, education) it is presumptively high-risk, so audit whether affected users receive an Article 50 disclosure and a human-review override path. Finding threshold: bot scoring gates consequential access without a documented human-review override and no Article 50 disclosure = MEDIUM compliance gap requiring legal review before the August 2026 application date.
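Editor's sketch (not part of the package diff): the CAPTCHA farm token replay item above asks for single-use tokens bound to `(session_id, action)`. A minimal Python illustration of that check follows. It assumes the provider's verification call has already succeeded and reported which session and action the token was issued for; the class name `CaptchaTokenLedger`, its parameters, and the in-memory store are hypothetical, and a real deployment would likely use a shared store such as Redis.

```python
import hashlib
import time
from typing import Dict, Optional


class CaptchaTokenLedger:
    """Accept each provider-verified CAPTCHA token once, for one (session, action)."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        # Assumption: ttl_seconds is at least the provider's own token lifetime,
        # so a token cannot be replayed after its ledger entry is purged.
        self.ttl_seconds = ttl_seconds
        self._used: Dict[str, float] = {}  # token digest -> ledger expiry time

    def _purge(self, now: float) -> None:
        for digest in [d for d, exp in self._used.items() if exp <= now]:
            del self._used[digest]

    def accept(self, token: str, session_id: str, action: str,
               token_session_id: str, token_action: str,
               now: Optional[float] = None) -> bool:
        """token_session_id / token_action are the values reported by the
        provider's verification response (hypothetical field names)."""
        now = time.time() if now is None else now
        self._purge(now)
        # Binding check: the token must have been issued for this request.
        if (token_session_id, token_action) != (session_id, action):
            return False
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        # Single-use check: any second presentation is a replay.
        if digest in self._used:
            return False
        self._used[digest] = now + self.ttl_seconds
        return True
```

With this shape, the replay test from the item above reduces to presenting the same token twice and expecting the second call to return `False`.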
+---
+## §EDGE-CASE-MATRIX
+The 5 bot-detection attack cases that automated scanners and naive manual review universally miss. MANDATORY checks — do not skip.
+| # | Edge Case | Why Scanners Miss It | Concrete Test |
+|---|-----------|----------------------|---------------|
+| 1 | Puppeteer-stealth / undetected-chromedriver patching | Standard headless UA checks pass because stealth mode patches `navigator.webdriver`, overrides `HeadlessChrome` UA, and fakes canvas/WebGL fingerprints | Launch `puppeteer-extra` with `stealth` plugin against the target endpoint; confirm bot detection still fires on behavioral signals (mouse entropy, timing) not UA alone |
+| 2 | Residential proxy pool rotation below per-IP rate limits | Each IP makes only 1–3 requests total — never triggers IP-based thresholds; scanner tests against a single source IP | Simulate 500 requests from 500 distinct IPs (use `mitmproxy` + IP rotation); confirm per-account and behavioral rate limits are independent of source IP |
+| 3 | CAPTCHA farm bypass — human-solved tokens replayed | CAPTCHA token is valid and issued by the provider; no ML bypass needed; scanner only checks "is CAPTCHA present" | Solve a Turnstile/reCAPTCHA token once; replay it in 50 rapid requests; confirm token one-time-use enforcement and binding to session/IP |
+| 4 | Timing attack on honeypot field detection | Application adds latency or changes response shape when honeypot is filled, leaking to attacker which field is the honeypot | Measure response times for filled vs. unfilled honeypot — delta must be zero; response body must be identical (use `simulateLoginDelay` before any branch exit) |
+| 5 | TLS fingerprint mismatch (JA3/JA4 spoofing) | User-Agent matches a real browser but TLS ClientHello JA3 hash matches `curl`/`python-requests` defaults; scanner never checks TLS layer | Capture JA3 hash via Wireshark or Cloudflare logs; compare against browser JA3 baseline database — mismatch with claimed UA = bot |
+---
+## §TEMPORAL-THREATS
+Threats materialising in the 2025–2030 window that bot-detection defences designed today must account for.
+| Threat | Est. Timeline | Relevance to Bot Detection | Prepare Now By |
+|--------|--------------|---------------------------|----------------|
+| LLM-powered CAPTCHA solvers (multimodal) | 2025–2026 (active) | GPT-4o-level vision models solve image CAPTCHAs at >95% accuracy; audio CAPTCHAs solved via Whisper | Move to behaviour-only CAPTCHA alternatives (Turnstile invisible, PoW challenges); treat all image CAPTCHAs as weak |
+| AI-generated synthetic mouse/keyboard behaviour | 2026–2027 | ML models trained on real human interaction datasets produce behavioural biometric fingerprints indistinguishable from humans to current detectors | Require multi-session behavioural consistency checks (not just per-request); integrate device attestation (Play Integrity / App Attest) as ground truth |
+| Residential proxy infrastructure commoditisation | 2025 (active) | Rotating residential proxies now cost $1–3/GB; per-IP detection has near-zero cost to defeat | IP reputation alone is a failed control; enforce per-account velocity limits, device fingerprint binding, and step-up authentication as primary signals |
+| EU AI Act enforcement (automated profiling restrictions) | 2026 | Behavioural bot scoring that profiles users may require conformity assessments if used for consequential decisions | Classify bot-scoring systems against AI Act Annex III; document human-review override paths |
+| Browser vendor deprecation of navigator.webdriver / UA-Client-Hints shift | 2025–2026 | Detection signals that rely on `navigator.webdriver` or classical User-Agent parsing will degrade as browsers standardise UA-CH | Migrate detection to UA-Client-Hints (`Sec-CH-UA-*`) and entropy-based signals; audit for `navigator.webdriver` reliance today |
+---
+## §DETECTION-GAP
+What current bot-detection monitoring CANNOT detect in this domain, and what to build to close each gap.
+**Domain-specific gaps that MUST be checked:**
+- **Stealth-patched headless browsers**: No UA or `webdriver` flag is present after stealth patching. Standard WAF rules and UA blocklists miss these. Need: server-side JavaScript challenge that tests for genuine browser API behaviour (e.g., WebGL renderer, canvas noise, AudioContext fingerprint) — not just header inspection.
+- **Multi-session CAPTCHA token replay**: CAPTCHA provider confirms token valid once; replays in subsequent sessions go unchecked if token TTL is long. Need: bind each token to `(session_id, action, IP)` tuple server-side and reject on any mismatch — check token issuance logs for >1 use.
+- **Slow credential stuffing across accounts (not IPs)**: Each account receives ≤2 failed attempts per day — never triggers per-account lockout. Individually, each IP is also under rate limits. Need: cross-account velocity detection — alert when >N distinct accounts from the same ASN/fingerprint cluster experience failed auth within a rolling 1-hour window.
+- **Human-in-the-loop CAPTCHA farms**: Requests look fully human (real browser, real human solving CAPTCHA) because they are. Detection relies on timing: farms solve in 15–45 seconds (API latency). Need: reject submissions that arrive implausibly fast after CAPTCHA load (< 8 seconds = reject), which catches scripted solvers, and monitor for solve times clustered in the farm API latency band, which catches farms.
+- **TLS fingerprint / JA3 mismatch invisible to application logs**: Application only sees decrypted HTTP; TLS fingerprint is lost. Need: deploy JA3/JA4 fingerprinting at the network edge (Cloudflare custom rules, nginx + `nginx-ja3` module, or Envoy filter) and propagate the fingerprint hash as a request header into the application for scoring.
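Editor's sketch (not part of the package diff): the slow credential stuffing gap above calls for an alert when more than N distinct accounts from one ASN/fingerprint cluster fail authentication inside a rolling window. The following Python sketch shows one way such a sliding window could work. The class name, the threshold values, and the assumption that events arrive in timestamp order are all illustrative choices, not taken from the package.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class CrossAccountVelocityDetector:
    """Flag (ASN, fingerprint) clusters that fail auth on many distinct accounts."""

    def __init__(self, window_seconds: float = 3600.0,
                 max_distinct_accounts: int = 20) -> None:
        self.window_seconds = window_seconds
        self.max_distinct_accounts = max_distinct_accounts
        # (asn, fingerprint) -> deque of (timestamp, account) inside the window
        self._events: Dict[Tuple[str, str], Deque[Tuple[float, str]]] = defaultdict(deque)

    def record_failed_auth(self, asn: str, fingerprint: str,
                           account: str, timestamp: float) -> bool:
        """Record one failed login; return True when the cluster should alert.

        Assumes timestamps are non-decreasing per cluster.
        """
        events = self._events[(asn, fingerprint)]
        events.append((timestamp, account))
        cutoff = timestamp - self.window_seconds
        # Drop events that have aged out of the rolling window.
        while events and events[0][0] <= cutoff:
            events.popleft()
        distinct_accounts = {acct for _, acct in events}
        return len(distinct_accounts) > self.max_distinct_accounts
```

Because the key ignores source IP entirely, rotating through a residential proxy pool does not reset the count as long as the ASN cluster and device fingerprint stay the same.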
+---
+## §ZERO-MISS-MANDATE
+This agent CANNOT declare any bot-detection attack class clean without explicit evidence of checking. For each item, output one of:
+- `CHECKED: [N files] | [patterns used] | CLEAN`
+- `CHECKED: [N files] | [patterns used] | [N findings, all fixed]`
+- `SKIPPED: [reason — must be "not applicable: [evidence]"]`
+**Silent skip = FAILED COVERAGE.** The orchestrator flags this as a quality gap.
+Attack classes that MUST be covered:
+| Attack Class | Minimum Evidence Required |
+|---|---|
+| Headless browser detection | Grepped for UA patterns + webdriver signal; confirmed behavioral challenge exists |
+| IP-only rate limiting (proxy-defeatable) | Confirmed per-account AND per-device rate limits independent of IP |
+| CAPTCHA absence on bot-sensitive endpoints | Checked all auth, account-creation, and high-value action routes |
+| CAPTCHA token replay / binding | Confirmed token bound to session/action/IP tuple server-side |
+| Honeypot timing side-channel | Confirmed response time and body are identical regardless of honeypot state |
+| Device fingerprint coverage | Confirmed fingerprint used as rate-limit dimension alongside IP and account |
+| TLS fingerprint mismatch | Confirmed JA3/JA4 propagated to application layer OR noted as infrastructure gap |
+The output findings JSON MUST include a `coverageManifest` key:
+```json
+{
+  "coverageManifest": {
+    "attackClassesCovered": [
+      { "class": "Headless Browser Detection", "filesReviewed": 12, "patterns": ["HeadlessChrome", "navigator.webdriver", "webdriver"], "result": "CLEAN" },
+      { "class": "IP-Only Rate Limiting", "filesReviewed": 8, "patterns": ["rateLimit", "limiter", "throttle"], "result": "2 findings, both fixed" }
+    ],
+    "filesReviewed": 34,
+    "negativeAssertions": ["CAPTCHA token replay: token binding checked across 6 auth routes — all bind to session_id"],
+    "uncoveredReason": {}
+  }
+}
+```

package/skills/business-logic-attacker/SKILL.md CHANGED

@@ -74,3 +74,234 @@ Structured data for Agent 1 lead:
 - `stateViolations[]`: flows where state machine can be violated
 - `raceConditions[]`: flows with exploitable time-of-check/time-of-use gaps
 - `numericFlaws[]`: integer overflow, negative value, precision loss findings
+Every findings JSON MUST include `intelligenceForOtherAgents`:
+```json
+{
+  "intelligenceForOtherAgents": {
+    "forPentestTeam": [{ "type": "HIGH_VALUE_TARGET", "description": "...", "exploitHint": "..." }],
+    "forCryptoSpecialist": [{ "type": "CRYPTO_WEAKNESS_REFERENCE", "algorithm": "...", "location": "..." }],
+    "forCloudSpecialist": [{ "type": "SSRF_TO_CLOUD_CHAIN", "ssrfLocation": "...", "escalationPath": "..." }],
+    "forComplianceGrc": [{ "type": "COMPLIANCE_BLOCKER", "frameworks": ["..."], "releaseBlock": true }]
+  }
+}
+```
+---
+## BEYOND SKILL.MD — MANDATORY EXPANSIONS
+### BL-EXT-1: Price Manipulation via Client-Supplied Totals (CWE-472)
+**Technique**: Many e-commerce and SaaS checkout flows pass the final price or discount amount as a client-controlled parameter. If the backend recalculates using the client-submitted value rather than a server-authoritative quote, an attacker submits an arbitrarily low (or zero) price.
+**Detection**: Grep for `price`, `total`, `amount`, `discount` in request body parsing code. Check whether the value is used directly in a payment API call (`stripe.charges.create({ amount: req.body.amount })`) versus a server-computed quote looked up by session/cart ID.
+**Test**: Submit a checkout request with `"amount": 1` (one cent). If the order completes at that price, this is a CRITICAL finding. Also try `"amount": -100` to test for refund credit injection.
+**Finding criteria**: Any path from client-controlled numeric input to a payment processor charge without server-side recomputation of the canonical amount.
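Editor's sketch (not part of the package diff): the fix implied by BL-EXT-1 is to derive the charge from server-side state and ignore any client-submitted amount. A minimal Python illustration, assuming a hypothetical in-memory `CATALOG` and a cart already looked up by session or cart ID:

```python
from decimal import Decimal
from typing import Dict, Mapping

# Hypothetical server-side price list; a real system would read this from its own store.
CATALOG: Dict[str, Decimal] = {
    "sku-basic": Decimal("19.99"),
    "sku-pro": Decimal("49.00"),
}


def compute_charge_cents(cart: Mapping[str, int], client_payload: Mapping[str, object]) -> int:
    """Return the amount to charge, in cents, from the server-side cart only.

    client_payload is accepted to show that fields such as "amount" or "total"
    in the request body are deliberately never read.
    """
    total = Decimal("0")
    for sku, quantity in cart.items():
        # Reject non-integer, zero, and negative quantities before any arithmetic.
        if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity <= 0:
            raise ValueError(f"invalid quantity for {sku!r}: {quantity!r}")
        total += CATALOG[sku] * quantity  # unknown SKUs raise KeyError
    return int(total * 100)
```

The value returned here, not `req.body.amount`, is what would be handed to the payment processor.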
+### BL-EXT-2: Workflow Step Bypass via Direct Endpoint Calls (OWASP WSTG-BUSL-06)
+**Technique**: Multi-step processes (onboarding, checkout, KYC verification) implement each step as a separate endpoint. If steps are guarded only by client-submitted state (`step=3`) rather than cryptographically verified server-side state, an attacker can call the final step directly, skipping all validation steps.
+**Detection**: Search for `step`, `phase`, `stage`, `screen` parameters in route handlers. Check whether session state or a signed server-issued token enforces sequencing.
+**Test**: Map all steps in a multi-stage flow. Issue a direct POST to the final completion endpoint without completing prerequisite steps. If successful, state sequencing is not enforced server-side.
+**Finding criteria**: Completion endpoint accepts requests from sessions that have not completed mandatory prerequisite steps.
+### BL-EXT-3: Race Condition Double-Spend via Parallel Requests (CWE-362)
+**Technique**: Inventory reservation, coupon redemption, referral credit, and one-time-use token endpoints are susceptible to time-of-check/time-of-use (TOCTOU) races. If the "check availability" → "mark as used" sequence is not atomic (SELECT + UPDATE in the same transaction, or a Redis SETNX), concurrent requests can both pass the check before either update completes.
+**Detection**: Grep for coupon redemption, balance deduction, or inventory decrement logic. Check whether the read and write occur inside a serializable database transaction or use an atomic primitive (Redis SETNX, database-level advisory lock).
+**Test**: Use a parallel HTTP client (wrk, Burp Intruder, or custom script) to send 20 simultaneous redemption requests for a single-use coupon. If more than one succeeds, the race is confirmed.
+**Finding criteria**: Multiple concurrent requests successfully redeem a single-use resource, or deplete a shared balance below zero.
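Editor's sketch (not part of the package diff): one way to close the check-then-mark race in BL-EXT-3 is to fold the check into the write, so the database decides the winner. The Python sketch below uses the standard-library `sqlite3` module with a hypothetical `coupons` table; it swaps the Redis SETNX or advisory-lock options named above for a conditional `UPDATE`, which is a different but equivalent atomic primitive.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one unredeemed coupon (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE coupons (code TEXT PRIMARY KEY, redeemed_by TEXT)")
    conn.execute("INSERT INTO coupons (code, redeemed_by) VALUES ('WELCOME10', NULL)")
    conn.commit()
    return conn


def redeem_coupon(conn: sqlite3.Connection, code: str, user_id: str) -> bool:
    """Redeem a single-use coupon; return True only for the first caller.

    The WHERE clause performs the availability check inside the same statement
    that marks the coupon used, so there is no gap between check and update.
    """
    with conn:  # commits on success, rolls back on exception
        cursor = conn.execute(
            "UPDATE coupons SET redeemed_by = ? WHERE code = ? AND redeemed_by IS NULL",
            (user_id, code),
        )
        return cursor.rowcount == 1
```

A vulnerable handler would instead run a `SELECT` to check availability and a separate `UPDATE` afterwards, which is the pattern the Detection step greps for.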
+### BL-EXT-4: JWT Algorithm Confusion and Claim Injection (CVE-2022-21449, alg:none)
+**Technique**: Business logic often gates premium features or admin access on JWT claims (`"role": "admin"`, `"plan": "enterprise"`). If the application accepts unsigned tokens (`alg: none`), accepts RS256 tokens verified as HS256 with the public key as the HMAC secret, or trusts attacker-supplied `kid` values to select verification keys, an attacker can forge arbitrary claims.
+**Detection**: Grep for JWT verification libraries (`jsonwebtoken`, `python-jose`, `java-jwt`). Check whether `algorithms` is constrained to a whitelist. Check whether `kid` is validated before use. Check whether `alg: none` is explicitly rejected.
+**Test**: Craft a token with `alg: none` and `"role": "admin"`. Submit to protected endpoints. Also test RS256-to-HS256 confusion by signing with the PEM-encoded public key as the HMAC secret.
+**Finding criteria**: Server accepts a forged token granting elevated privileges.
+### BL-EXT-5: AI-Assisted Fuzzing of Business Rule Edge Cases (Emerging — 2025)
+**Technique**: Attackers are now deploying LLM-assisted fuzzing that reads API documentation or OpenAPI specs to generate semantically valid but logically abusive inputs — e.g., an LLM discovers that a shipping calculator accepts `weight: 0` and `quantity: 99999` simultaneously and infers that this combination may trigger free-shipping logic. This goes far beyond what traditional boundary-value fuzzers produce.
+**Detection**: Review all numeric field combinations in checkout, pricing, and eligibility logic. Look for any place where two or more fields interact to produce a business outcome (discount, free shipping, tier unlock) without upper-bound validation on each field independently and in combination.
+**Test**: Generate a combinatorial test matrix of numeric inputs using boundary values, zero, negative, and maximum integer. Specifically test cross-field combinations: `{ quantity: 0, weight: 0 }`, `{ quantity: MAX_INT, price: 0.01 }`, `{ discountPercent: 100, quantity: -1 }`.
+**Finding criteria**: Any combination of legal per-field values produces an unintended business outcome (negative total, free premium access, unlimited resource consumption).
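Editor's sketch (not part of the package diff): the combinatorial test matrix described in the BL-EXT-5 Test step can be generated with a Cartesian product over per-field boundary values. The helper names and the specific boundary sets below are illustrative assumptions.

```python
from itertools import product
from typing import Dict, List, Sequence

MAX_INT32 = 2**31 - 1


def boundary_values(maximum: int) -> List[int]:
    """Negative, zero, smallest positive, and maximum value for one numeric field."""
    return [-1, 0, 1, maximum]


def build_test_matrix(fields: Dict[str, Sequence[float]]) -> List[Dict[str, float]]:
    """Return every cross-field combination of the supplied per-field values."""
    names = list(fields)
    return [dict(zip(names, combo)) for combo in product(*(fields[name] for name in names))]


matrix = build_test_matrix({
    "quantity": boundary_values(MAX_INT32),
    "weight": boundary_values(1000),
    "discountPercent": [0, 100],
})
```

Each dictionary in `matrix` is one request body to submit; the matrix grows multiplicatively, so field lists should stay short.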
+### BL-EXT-6: Supply Chain Integrity — Malicious Dependency Injecting Backdoor into Payment Flow (Emerging — 2025)
+**Technique**: Attackers targeting e-commerce and SaaS platforms increasingly compromise npm/PyPI packages that sit in the dependency chain of payment or checkout code. A malicious version of a utility library can silently modify price values, intercept payment tokens, or exfiltrate card data. This is an extension of traditional business logic attack surface into the supply chain layer.
+**Detection**: Run `npm audit` and `npx lockfile-lint` on the repository. Check `package-lock.json` or `yarn.lock` for unexpected version bumps in packages that touch payment flows. Cross-reference against the OSV database and Socket.dev for known-malicious packages. Generate a CycloneDX SBOM and compare against a known-good baseline.
+**Test**: Identify every package that is imported by payment-processing modules (`grep -r "require\|import" src/payments/`). For each, verify the installed version hash against the registry checksum. Use `npm pack --dry-run` to inspect what files are actually included.
+**Finding criteria**: Any dependency in the payment flow whose resolved version differs from the expected pinned version, or which has been flagged by OSV/Socket.dev, or whose tarball hash does not match the registry.
+### BL-EXT-7: Negative-Value Exploit via Unsigned Integer Underflow in Discount Calculation (CWE-191)
+**Technique**: When discount values are applied to order totals in languages or ORMs that coerce types, a discount larger than the order total can produce a negative total. If this negative value is passed to a payment processor, some processors interpret it as a credit to be issued to the attacker's account. Even where the processor rejects it, the negative balance may be stored in the application's internal ledger, creating a credit that can be spent.
+**Detection**: Grep for discount and total calculation logic. Check whether the final total is asserted to be `>= 0` before submission. Check the data type: a signed integer or float can hold a negative total, while an unsigned integer wraps around to a very large value.
+**Test**: Submit an order with a discount code that exceeds the order total. Observe the computed total. If the total is negative or zero, attempt to complete the order. Check the account balance after the transaction.
+**Finding criteria**: Application permits a negative or zero total to reach the payment processor or stores a negative balance in the internal ledger.
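Editor's sketch (not part of the package diff): the `>= 0` assertion that the BL-EXT-7 Detection step looks for can be as small as the guard below. The function name is hypothetical, and how a zero total should be handled (for example, a separate no-charge path) is left as an application decision.

```python
from decimal import Decimal


def apply_discount(order_total: Decimal, discount: Decimal) -> Decimal:
    """Subtract a discount, refusing any result below zero."""
    if order_total < 0 or discount < 0:
        raise ValueError("order total and discount must be non-negative")
    total = order_total - discount
    if total < 0:
        # Never let a negative amount reach the processor or the internal ledger.
        raise ValueError("discount exceeds order total")
    return total
```

Using `Decimal` here also avoids the float precision loss that the `numericFlaws[]` output field is meant to capture.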
+### BL-EXT-8: Post-Quantum Harvest-Now-Decrypt-Later Against Payment Tokens (Emerging — 2028 horizon, active threat today)
+**Technique**: Adversaries with nation-state resources are currently harvesting encrypted payment tokens, session tokens, and cryptographic proofs transmitted over TLS sessions protected by classical key exchange (ECDHE P-256, RSA-2048). When cryptographically relevant quantum computers become available (estimated 2028–2032), these stored ciphertexts become decryptable, exposing payment data retroactively. For long-lived tokens (subscription tokens, stored payment methods), the threat window is active today.
+**Detection**: Enumerate all endpoints that transmit or store payment tokens, subscription identifiers, or long-lived session material. Check TLS configuration for hybrid key exchange support (`X25519Kyber768` or its standardized successor `X25519MLKEM768` in TLS 1.3). Check whether stored tokens are encrypted at rest with a quantum-resistant algorithm.
+**Test**: Use `nmap --script ssl-enum-ciphers` or `testssl.sh` against the payment endpoints. Check whether any hybrid PQ key exchange is advertised in the TLS handshake. Grep for RSA/ECDSA usage in token signing code.
+**Finding criteria**: Long-lived payment or identity tokens are transmitted or stored with no quantum-resistant protection; TLS does not offer hybrid PQ key exchange.
+---
+## §BUSINESS_LOGIC_ATTACKER-CHECKLIST
+1. **Payment total recomputation**: Verify the server recomputes the final charge amount from a server-authoritative quote (session/cart ID lookup), not from any client-submitted value. Grep: `req.body.amount`, `req.body.total`, `req.body.price`. Finding: any of these values reach a payment API call.
+2. **Step sequencing enforcement**: For every multi-step flow, confirm the final step verifies all prior steps completed in the server-side session. Grep: `req.body.step`, `req.body.phase`, `session.currentStep`. Test: POST directly to the final step endpoint without completing prerequisites. Finding: completion succeeds without prerequisite session state.
+3. **Single-use resource atomicity**: Confirm coupon, referral code, and one-time-token redemption uses an atomic read-then-write (database transaction at SERIALIZABLE isolation or Redis SETNX). Grep: redemption handlers for non-transactional SELECT followed by UPDATE. Test: 20 concurrent redemption requests. Finding: more than one request succeeds.
+4. **Negative and zero quantity handling**: Verify all quantity, count, and weight fields reject values ≤ 0 at the validation layer before any calculation. Grep: `quantity`, `count`, `units`, `weight` in request schemas. Test: submit `quantity: -1`, `quantity: 0`. Finding: negative total, negative inventory, or error that reveals internal ledger values.
+5. **Integer overflow on large numeric inputs**: Check fields that accept user-supplied numbers for maximum-value bounds. Test: submit `quantity: 2147483647` (the maximum signed 32-bit integer) or `9007199254740993` (`Number.MAX_SAFE_INTEGER + 2` in JavaScript, which cannot be represented exactly as a double). Finding: unexpected total (wrap-around to negative, zero, or a silently rounded value).
+6. **Subscription feature entitlement at downgrade**: Verify that when a subscription is cancelled or downgraded, premium feature flags are revoked synchronously (not just on next billing cycle). Grep: feature-flag checks that read `user.plan` without checking subscription expiry timestamp. Test: subscribe, access premium feature, cancel subscription, immediately re-check premium endpoint. Finding: premium access persists after cancellation.
+7. **Password reset token single-use enforcement**: Confirm reset tokens are invalidated immediately after first use. Grep: reset token lookup handlers. Test: use a reset token, then submit the same token again in a new request. Finding: second use succeeds or token remains valid.
+8. **IDOR via predictable resource IDs in multi-tenant context**: Enumerate resource IDs used in API endpoints (order IDs, document IDs, upload IDs). Check whether IDs are sequential integers or short UUIDs. Test: authenticate as tenant A, request resource IDs that neighbour your own (ID + 1, ID - 1). Finding: resources belonging to tenant B are returned.
+9. **Coupon code stacking and combinability**: Test whether multiple coupon codes can be applied simultaneously beyond the intended limit. Test: apply two 50%-off coupons to reach 100% discount; apply one coupon and one referral credit simultaneously. Finding: total reaches or exceeds 100% discount, or negative total.
+10. **Email verification bypass**: Confirm that privileged actions (payment, data export, account linking) require a verified email and that the verification state is enforced server-side. Grep: `user.emailVerified` checks before privileged endpoints. Test: create account, skip email verification, attempt privileged action directly. Finding: privileged action succeeds without email verification.
+11. **File replacement between upload and processing**: In upload flows with a processing step (antivirus scan, format validation), check whether the uploaded file's storage path is predictable and whether the file can be replaced between upload and processing. Test: upload a benign file, observe the storage path, immediately overwrite with a malicious file via a second request before the processing step reads it. Finding: processing step operates on the replaced malicious file.
+12. **Tenant-prefixed cache key enforcement**: In multi-tenant applications using shared caches (Redis, Memcached), verify all cache keys include the tenant ID as a prefix. Grep: cache set/get calls without tenant ID in key construction. Test: as tenant A, cache a value; as tenant B, attempt to read the same key without tenant prefix. Finding: tenant B reads tenant A's cached data.
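
Checklist items 1, 4, and 5 above can be sketched together in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not code from security-mcp: `PRICE_CATALOG_CENTS`, `MAX_QUANTITY`, `parseQuantity`, and `computeTotalCents` are hypothetical names, and the in-memory map stands in for a server-authoritative price lookup.

```typescript
// Hypothetical server-side price table (stand-in for a database lookup).
// A Map is used so that keys like "constructor" never resolve to inherited properties.
const PRICE_CATALOG_CENTS = new Map<string, number>([
  ["sku-basic", 1999],
  ["sku-pro", 29900],
]);

// Assumed business limit; choose a bound that fits the product.
const MAX_QUANTITY = 1000;

interface CartLine {
  sku: string;
  quantity: unknown; // untrusted client input
}

// Reject anything that is not an integer in [1, MAX_QUANTITY].
function parseQuantity(value: unknown): number {
  if (typeof value !== "number" || !Number.isSafeInteger(value)) {
    throw new RangeError("quantity must be a safe integer");
  }
  if (value < 1 || value > MAX_QUANTITY) {
    throw new RangeError(`quantity must be between 1 and ${MAX_QUANTITY}`);
  }
  return value;
}

// Recompute the charge from server-side prices only; no client-supplied
// amount, total, or price field is ever read.
function computeTotalCents(lines: CartLine[]): number {
  let total = 0;
  for (const line of lines) {
    const unitPrice = PRICE_CATALOG_CENTS.get(line.sku);
    if (unitPrice === undefined) {
      throw new RangeError(`unknown sku: ${line.sku}`);
    }
    total += unitPrice * parseQuantity(line.quantity);
  }
  return total;
}
```

Working in integer cents with a small upper bound keeps every intermediate product far below `Number.MAX_SAFE_INTEGER`, so the overflow case in item 5 cannot arise once validation passes.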
+---
+## §POC-REQUIREMENT
+For every CRITICAL or HIGH finding in this domain, the following sequence is mandatory before the finding is considered complete:
+1. **Write the working PoC FIRST**: Document the exact HTTP request (method, URL, headers, body), the observed response, and the confirmed business impact (e.g., "order total became $0.00", "premium features accessible after cancellation").
+2. **Confirm reproduction**: The PoC must be executed against the target and the result must match the expected impact. Screenshot or log output required.
+3. **Write the fix**: Implement the remediation (server-side total recomputation, atomic transaction, step sequencing enforcement, etc.).
+4. **Verify the PoC fails**: Re-execute the identical PoC against the fixed code. Confirm the attack now fails (correct error response, correct business outcome).
+5. **Record in findings JSON**:
+```json
+{
+  "findingId": "BL-001",
+  "severity": "CRITICAL",
+  "title": "Price manipulation via client-supplied amount",
+  "exploitPoC": {
+    "request": "POST /api/checkout HTTP/1.1\nContent-Type: application/json\n\n{\"cartId\": \"abc123\", \"amount\": 1}",
+    "expectedResponse": "HTTP 200 — order created at $0.01",
+    "observedImpact": "Order for $299 product completed at $0.01",
+    "reproduced": true
+  },
+  "fix": "Recompute amount server-side from cartId; reject any client-supplied amount field",
+  "fixVerified": true
+}
+```
+**PoC skipping = finding severity downgraded to MEDIUM automatically. No exceptions.**
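
The downgrade rule above could be enforced mechanically. The sketch below assumes the field names shown in the findings JSON; the `Severity` values other than `CRITICAL`, `HIGH`, and `MEDIUM`, and the function name `enforcePocRequirement`, are assumptions for illustration.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface ExploitPoC {
  request: string;
  expectedResponse: string;
  observedImpact: string;
  reproduced: boolean;
}

interface Finding {
  findingId: string;
  severity: Severity;
  title: string;
  exploitPoC?: ExploitPoC;
  fix?: string;
  fixVerified?: boolean;
}

// Returns the finding unchanged when the PoC requirement is met, or a copy
// downgraded to MEDIUM when a CRITICAL/HIGH finding lacks a reproduced PoC.
function enforcePocRequirement(finding: Finding): Finding {
  const needsPoc = finding.severity === "CRITICAL" || finding.severity === "HIGH";
  const poc = finding.exploitPoC;
  const hasPoc = poc !== undefined && poc.reproduced && poc.request.trim() !== "";
  if (needsPoc && !hasPoc) {
    return { ...finding, severity: "MEDIUM" };
  }
  return finding;
}
```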
+---
+## §PROJECT-ESCALATION
+If any of the following conditions is met, immediately call `orchestration.update_agent_status` with `"status": "CRITICAL_ESCALATION"` and halt the current run so the orchestrator is alerted before the run completes:
+1. **Payment processor receives attacker-controlled amounts**: A code path exists where a client-submitted numeric value (price, quantity, discount) reaches a payment processor API call without server-side recomputation. This is an active financial fraud vector requiring immediate remediation before any other work continues.
+2. **Multi-tenant data boundary collapse confirmed**: Cross-tenant data access is reproduced — tenant A can read, modify, or delete resources owned by tenant B. This is a data breach condition affecting all tenants and must be escalated to the full security team before the finding is documented in any shared channel.
+3. **Single-use token race condition confirmed at scale**: A race condition on a single-use token (coupon, reset token, referral code) is confirmed to allow unlimited redemption by a single attacker. This may represent an active financial liability if the token has monetary value.
+4. **Authentication step completely bypassable**: A multi-step authentication or verification flow (MFA, email verification, KYC) can be skipped by direct endpoint calls, meaning an attacker can achieve full account access or privileged status without satisfying any verification requirement.
+5. **Admin or privileged endpoint accessible to unauthenticated users**: Any endpoint that performs administrative actions (user management, billing override, configuration change) is accessible without authentication. This is an unconditional escalation regardless of how the endpoint was discovered.
+6. **Malicious dependency confirmed in payment flow**: A package in the dependency chain of payment-processing code has been flagged as compromised or modified (hash mismatch, OSV advisory, Socket.dev alert). This may mean payment data is currently being exfiltrated in production.
+7. **Mass account takeover vector confirmed**: A flaw allows an attacker to take over arbitrary user accounts at scale (e.g., predictable password reset tokens, session fixation in multi-step auth flow). Escalate immediately — this is a full incident response trigger, not just a finding.
+8. **Negative-balance exploit reaches production payment processor**: A negative-value order is confirmed to have been submitted to the payment processor (check processor logs or webhook logs). This is an active financial incident, not just a vulnerability — escalate to include the finance team.
+---
+## §EDGE-CASE-MATRIX
+The 5 attack cases in this domain that automated scanners and naive manual review universally miss. MANDATORY checks — do not skip.
+| # | Edge Case | Why Scanners Miss It | Concrete Test |
+|---|-----------|----------------------|---------------|
+| 1 | Second-order / stored payload executed in different context | Scanner checks input context, not execution context | Store payload safely; trigger in separate request/session |
+| 2 | Unicode normalisation bypass | Regex filters run before normalisation; attacker uses homoglyphs or composed forms | Submit Ⅰ (U+2160) or ＜ (U+FF1C) variants of known-bad strings |
+| 3 | Polyglot payload active in multiple sinks simultaneously | Scanners test one injection class per payload | `'"><script>{{7*7}}</script><!--` — SQL + XSS + SSTI in one request |
+| 4 | Out-of-band exfiltration (DNS/HTTP callback) | Scanner looks for inline response difference; OOB leaves no visible trace | Use Burp Collaborator / interactsh; inject DNS lookup payload |
+| 5 | Race condition between check and use (TOCTOU) | Sequential scanners don't model concurrency | Send two simultaneous requests to the same state-changing endpoint |
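
Row 2 of the matrix above can be reproduced with a short TypeScript sketch. It assumes a hypothetical blocklist regex (`BLOCKED_PATTERN`) and shows why the filter should run after NFKC normalisation rather than before it; the function names are illustrative only.

```typescript
// Hypothetical blocklist pattern, used only for illustration.
const BLOCKED_PATTERN = /<\s*script/i;

// Filter applied to the raw input: fullwidth characters slip through.
function isBlockedNaive(input: string): boolean {
  return BLOCKED_PATTERN.test(input);
}

// Filter applied after NFKC normalisation, which folds compatibility
// characters such as U+FF1C (fullwidth "<") into their ASCII equivalents.
function isBlockedNormalised(input: string): boolean {
  return BLOCKED_PATTERN.test(input.normalize("NFKC"));
}

// "<script>" written with fullwidth angle brackets (U+FF1C, U+FF1E).
const fullwidthPayload = "\uFF1Cscript\uFF1E";

console.log(isBlockedNaive(fullwidthPayload));      // false
console.log(isBlockedNormalised(fullwidthPayload)); // true
```

The same applies to U+2160: `"\u2160".normalize("NFKC")` yields the ASCII letter `I`, so any downstream component that normalises will see a string the filter never inspected.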
+---
+## §TEMPORAL-THREATS
+Threats materialising in the 2025–2030 window that defences designed today must account for.
+| Threat | Est. Timeline | Relevance to This Domain | Prepare Now By |
+|--------|--------------|--------------------------|----------------|
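+| Cryptographically Relevant Quantum Computer (CRQC) | 2028–2032 | Harvest-now-decrypt-later attacks active today; traffic protected by RSA/ECDH key exchange today becomes decryptable, and RSA/ECDSA signatures become forgeable | Inventory all RSA/ECDSA usage; migrate key exchange for long-lived data to ML-KEM (FIPS 203) and signatures to ML-DSA (FIPS 204) |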
+| AI-assisted adversaries at scale | 2025–2027 (active) | LLM-powered fuzzing finds 10× more edge cases; automated PoC generation | Assume attackers have LLM help; expand test surface to match |
+| EU AI Act full enforcement | 2026 | High-risk AI systems require mandatory conformity assessments | Classify all AI features against AI Act tiers now |
+| Post-quantum TLS migration deadline | 2028–2030 | Browser vendors will drop classical-only TLS connections | Begin TLS agility assessment; test hybrid key exchange |
+| Mandatory SBOM + build provenance (US EO 14028 / EU CRA) | 2025–2026 (active) | SBOM and SLSA attestation are becoming legally required | Achieve SLSA L2 minimum; generate CycloneDX SBOM per release |
+---
+## §DETECTION-GAP
+What current security monitoring CANNOT detect in this domain, and what to build to close each gap.
+**Standard gaps that MUST be checked:**
+- **Second-order attack execution**: The storage request looks safe; only the retrieval+execution step is dangerous. Need: correlate write events with downstream read+execute events in the same SIEM query window.
+- **Timing-side-channel leakage**: No log event emitted; only observable as microsecond response-time variance. Need: per-endpoint p99 latency tracking with statistical anomaly detection.
+- **Low-and-slow credential stuffing**: Individually, each request is under rate limits. Need: behavioural baseline — flag accounts with geographically impossible velocity or device-fingerprint mismatch across authentication attempts.
+- **Insider exfiltration via legitimate process**: Authorised exports, reports, and data downloads that individually are permitted but collectively constitute data exfiltration. Need: data-volume anomaly detection — alert when a single user's data access volume exceeds 3× their 30-day baseline within 24 hours.
+- **Cross-agent attack chains**: Phase 1 finding A + Phase 1 finding B = CRITICAL chain invisible to either agent alone. Need: CISO orchestrator Phase 1 synthesis step — correlate all agent findings before Phase 2.
+**Business-logic-specific gaps:**
+- **Step-skip attacks in multi-step flows**: Each individual endpoint returns a normal HTTP 200; only the sequence violation is anomalous. Need: server-side flow state machine that emits an audit event when a step is accessed out of order.
+- **Slow coupon exhaustion below rate-limit thresholds**: An attacker distributes coupon redemptions across 1,000 accounts at 1 redemption per hour per account. Individually, none trigger rate limits, but collectively the coupon is exhausted fraudulently. Need: aggregate coupon redemption rate alerting independent of per-account rate limits.
+- **Subscription entitlement drift after plan changes**: No alert is emitted when a user retains premium feature access after downgrading. Need: a scheduled reconciliation job that compares active feature flags against current subscription status and emits an alert on any mismatch.
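
The first business-logic gap above (step-skip attacks) could be closed with a small server-side state machine. The sketch below is an in-memory illustration under assumptions: the step names, the `FlowTracker` class, and the `STEP_OUT_OF_ORDER` event type are hypothetical, and a real deployment would persist state in the session store and forward events to the audit pipeline.

```typescript
// Hypothetical checkout flow; the order of this tuple defines the allowed sequence.
const CHECKOUT_STEPS = ["cart", "shipping", "payment", "confirm"] as const;
type Step = (typeof CHECKOUT_STEPS)[number];

interface AuditEvent {
  type: "STEP_OUT_OF_ORDER";
  sessionId: string;
  requested: Step;
  expected: Step;
}

class FlowTracker {
  // sessionId -> number of steps completed so far
  private completed = new Map<string, number>();
  readonly auditLog: AuditEvent[] = [];

  // Advances the session and returns true when `step` is the next expected
  // step; otherwise records an audit event and returns false.
  advance(sessionId: string, step: Step): boolean {
    const done = this.completed.get(sessionId) ?? 0;
    const index = Math.min(done, CHECKOUT_STEPS.length - 1);
    const expected: Step = CHECKOUT_STEPS[index] ?? CHECKOUT_STEPS[0];
    // A finished flow (done === length) also lands here: replays are logged.
    if (done >= CHECKOUT_STEPS.length || step !== expected) {
      this.auditLog.push({ type: "STEP_OUT_OF_ORDER", sessionId, requested: step, expected });
      return false;
    }
    this.completed.set(sessionId, done + 1);
    return true;
  }
}
```

Because the endpoint itself still returns a normal response shape, the audit event is the only signal that distinguishes a skipped step from ordinary traffic.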
+---
+## §ZERO-MISS-MANDATE
+This agent CANNOT declare any attack class clean without explicit evidence of checking. For each item, output one of:
+- `CHECKED: [N files] | [patterns used] | CLEAN`
+- `CHECKED: [N files] | [patterns used] | [N findings, all fixed]`
+- `SKIPPED: [reason — must be "not applicable: [evidence]"]`
+**Silent skip = FAILED COVERAGE.** The orchestrator flags this as a quality gap.
+The output findings JSON MUST include a `coverageManifest` key:
+```json
+{
+  "coverageManifest": {
+    "attackClassesCovered": [{ "class": "Price Manipulation", "filesReviewed": 12, "patterns": ["req.body.amount", "req.body.price", "req.body.total"], "result": "CLEAN" }],
+    "filesReviewed": 47,
+    "negativeAssertions": ["Price Manipulation: client-supplied amount pattern searched across 47 files — 0 matches reaching payment API"],
+    "uncoveredReason": {}
+  }
+}
+```
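
A silent skip can be detected by comparing the manifest against the list of attack classes the agent was required to cover. The following sketch assumes the manifest shape shown above; the `uncoveredReason` map is assumed to be keyed by attack-class name, and `findSilentSkips` is a hypothetical helper, not part of the package.

```typescript
interface AttackClassCoverage {
  class: string;
  filesReviewed: number;
  patterns: string[];
  result: string;
}

interface CoverageManifest {
  attackClassesCovered: AttackClassCoverage[];
  filesReviewed: number;
  negativeAssertions: string[];
  // Assumed shape: attack-class name -> reason beginning with "not applicable:".
  uncoveredReason: Record<string, string>;
}

// Returns every required attack class that is neither covered with evidence
// (at least one file and one pattern) nor skipped with a valid reason.
function findSilentSkips(required: string[], manifest: CoverageManifest): string[] {
  const covered = new Set(
    manifest.attackClassesCovered
      .filter((entry) => entry.filesReviewed > 0 && entry.patterns.length > 0)
      .map((entry) => entry.class),
  );
  return required.filter((name) => {
    if (covered.has(name)) return false;
    const reason: unknown = manifest.uncoveredReason[name];
    return !(typeof reason === "string" && reason.startsWith("not applicable:"));
  });
}
```

An empty return value means every required class has either evidence or an explicit justification; anything else is the FAILED COVERAGE condition described above.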
+---
+## LEARNING SIGNAL
+On every finding resolved, emit:
+```json
+{
+  "findingId": "FINDING_ID",
+  "agentName": "business-logic-attacker",
+  "resolved": true,
+  "remediationTemplate": "one-line description of what was done",
+  "falsePositive": false
+}
+```
+Call `security.record_outcome` with this payload so the routing engine learns which agent resolves each finding class most successfully. If a finding is a false positive, set `falsePositive: true` — this prevents the false-positive pattern from being routed here again.

package/skills/capec-code-mapper/SKILL.md CHANGED

@@ -161,3 +161,87 @@ If internet permitted:
 - `requiredActions`: ordered action list
 - `complianceImpact`: framework mappings
 - `beyondSkillMd`: true if finding goes beyond the SKILL.md mandate
+Every findings JSON MUST include `intelligenceForOtherAgents`:
+```json
+{
+  "intelligenceForOtherAgents": {
+    "forPentestTeam": [{ "type": "HIGH_VALUE_TARGET", "description": "...", "exploitHint": "..." }],
+    "forCryptoSpecialist": [{ "type": "CRYPTO_WEAKNESS_REFERENCE", "algorithm": "...", "location": "..." }],
+    "forCloudSpecialist": [{ "type": "SSRF_TO_CLOUD_CHAIN", "ssrfLocation": "...", "escalationPath": "..." }],
+    "forComplianceGrc": [{ "type": "COMPLIANCE_BLOCKER", "frameworks": ["..."], "releaseBlock": true }]
+  }
+}
+```
+## BEYOND SKILL.MD
+Domain-specific intelligence that extends beyond the base CAPEC mapping mandate:
+- **CAPEC-194 (mass assignment in auto-binding frameworks such as Spring)**: Auto-binding frameworks silently map attacker-controlled HTTP parameters to model fields. Grep for `@ModelAttribute`, and for `@RequestBody` without `@JsonIgnoreProperties`: a single unguarded field can be the difference between a safe binding and full account takeover.
+- **CAPEC-34 (HTTP Response Splitting) via CRLF injection**: Still present in raw header-write code even in 2025. Search for `res.setHeader` calls that receive unsanitized user input.
+- **CAPEC-666 (Exploitation of Permissions via Confused Deputy) — AI/LLM era**: When an LLM agent can call tools on behalf of users, prompt injection (CAPEC-114) becomes a confused-deputy attack. Attacker-controlled document content tricks the LLM into invoking privileged tools with the user's credentials. No CVE yet, but PortSwigger Research 2024 demonstrated full account takeover via indirect prompt injection in a GenAI assistant.
+- **CAPEC-116 (Excavation via Differential Analysis) + post-quantum timing**: Classical constant-time code guarantees break under quantum simulation environments. Harvest-now-decrypt-later (HNDL) attacks mean RSA-2048 ciphertext captured today is already at risk. CVE-2024-28882 illustrates OpenSSH timing leakage. Inventory all `crypto.createDiffieHellman` and `crypto.generateKeyPairSync` calls for algorithm agility.
+- **CAPEC-153 (Input Data Manipulation) via GraphQL batching abuse** — CVE-2023-28425 (Redis) and analogous patterns in Apollo Server: attackers batch thousands of mutations in a single HTTP request, bypassing per-request rate limits. Check for `apollo-server` without `@graphql-armor/max-directives` or query-cost analysis.
+- **CAPEC-1 (Accessing Functionality Not Properly Constrained) in server-side AI tool calls**: LLM function-calling surfaces expose internal APIs to model-controlled dispatch. Without a capability allowlist, an attacker who controls the prompt controls which functions are called. Map every `tools: [...]` array in Anthropic/OpenAI SDK calls to a permission boundary check.
+- **CAPEC-56 (Removing/Adding Data Stores) via prototype pollution** — CVE-2022-37601 (webpack loader-utils), CVE-2023-26136 (tough-cookie): `__proto__` mutation still appears in lodash `_.merge` and `JSON.parse` + dynamic key assignment patterns. Grep for `Object.assign(target, userInput)` and `[userKey] =` with untrusted keys.
+- **CAPEC-549 (Local Execution of Code) via supply-chain compromised package** — post-quantum threat vector: adversaries use AI-generated lookalike packages (typosquatting at scale) to inject CAPEC-549 payloads. CVE-2024-21501 (sanitize-html bypass) illustrates how a "security" package itself became the attack vector. Verify every `package.json` dependency against npm provenance attestations (`npm audit signatures`).
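
The prototype-pollution item above can be illustrated with a merge helper that refuses the dangerous keys. This is a minimal sketch, not a drop-in replacement for any library: `safeMerge`, `FORBIDDEN_KEYS`, and `PlainObject` are hypothetical names.

```typescript
// Keys that can reach or replace an object's prototype chain.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

type PlainObject = { [key: string]: unknown };

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `source` into `target`, skipping forbidden keys at
// every depth. Only own enumerable keys of `source` are copied.
function safeMerge(target: PlainObject, source: PlainObject): PlainObject {
  for (const key of Object.keys(source)) {
    if (FORBIDDEN_KEYS.has(key)) continue;
    const incoming = source[key];
    const existing = Object.prototype.hasOwnProperty.call(target, key)
      ? target[key]
      : undefined;
    if (isPlainObject(incoming)) {
      target[key] = safeMerge(isPlainObject(existing) ? existing : {}, incoming);
    } else {
      target[key] = incoming;
    }
  }
  return target;
}

// JSON.parse creates "__proto__" as an own property, so a naive deep merge
// would walk into Object.prototype; safeMerge drops the key instead.
const untrusted = JSON.parse('{"__proto__": {"isAdmin": true}, "theme": "dark"}') as PlainObject;
const merged = safeMerge({}, untrusted);
console.log(Object.keys(merged)); // [ 'theme' ]
```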
+---
+## §EDGE-CASE-MATRIX
+The 5 attack cases in this domain that automated scanners and naive manual review universally miss. MANDATORY checks — do not skip.
+| # | Edge Case | Why Scanners Miss It | Concrete Test |
+|---|-----------|----------------------|---------------|
+| 1 | Second-order / stored payload executed in different context | Scanner checks input context, not execution context | Store payload safely; trigger in separate request/session |
+| 2 | Unicode normalisation bypass | Regex filters run before normalisation; attacker uses homoglyphs or composed forms | Submit Ⅰ (U+2160) or ＜ (U+FF1C) variants of known-bad strings |
+| 3 | Polyglot payload active in multiple sinks simultaneously | Scanners test one injection class per payload | `'"><script>{{7*7}}</script><!--` — SQL + XSS + SSTI in one request |
+| 4 | Out-of-band exfiltration (DNS/HTTP callback) | Scanner looks for inline response difference; OOB leaves no visible trace | Use Burp Collaborator / interactsh; inject DNS lookup payload |
+| 5 | Race condition between check and use (TOCTOU) | Sequential scanners don't model concurrency | Send two simultaneous requests to the same state-changing endpoint |
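
Row 5 of the matrix above (TOCTOU) can be demonstrated without a network. The sketch below simulates a single-use coupon in memory; `CouponStore` and `countSuccesses` are hypothetical names, and `await Promise.resolve()` stands in for a database round trip. A real test would fire concurrent HTTP requests at the redemption endpoint instead.

```typescript
class CouponStore {
  private redeemed = false;

  // Vulnerable: the check and the write are separated by an await, so every
  // concurrent caller passes the check before any of them writes.
  async redeemUnsafe(): Promise<boolean> {
    if (this.redeemed) return false;
    await Promise.resolve(); // stand-in for a database round trip
    this.redeemed = true;
    return true;
  }

  // Safer in this single-threaded sketch: check and write happen in the same
  // synchronous step, mirroring an atomic compare-and-set in a real store.
  async redeemAtomic(): Promise<boolean> {
    if (this.redeemed) return false;
    this.redeemed = true;
    await Promise.resolve();
    return true;
  }
}

// Fires `attempts` redemptions concurrently and counts how many succeed.
async function countSuccesses(
  redeem: () => Promise<boolean>,
  attempts: number,
): Promise<number> {
  const results = await Promise.all(
    Array.from({ length: attempts }, () => redeem()),
  );
  return results.filter((ok) => ok).length;
}
```

With 20 concurrent attempts, `redeemUnsafe` lets all 20 succeed in this simulation, while `redeemAtomic` allows exactly one. In a real system the atomic step has to live in the data store (a transaction or compare-and-set), since multiple server processes do not share memory.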
+## §TEMPORAL-THREATS
+Threats materialising in the 2025–2030 window that defences designed today must account for.
+| Threat | Est. Timeline | Relevance to This Domain | Prepare Now By |
+|--------|--------------|--------------------------|----------------|
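+| Cryptographically Relevant Quantum Computer (CRQC) | 2028–2032 | Harvest-now-decrypt-later attacks active today; traffic protected by RSA/ECDH key exchange today becomes decryptable, and RSA/ECDSA signatures become forgeable | Inventory all RSA/ECDSA usage; migrate key exchange for long-lived data to ML-KEM (FIPS 203) and signatures to ML-DSA (FIPS 204) |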
+| AI-assisted adversaries at scale | 2025–2027 (active) | LLM-powered fuzzing finds 10× more edge cases; automated PoC generation | Assume attackers have LLM help; expand test surface to match |
+| EU AI Act full enforcement | 2026 | High-risk AI systems require mandatory conformity assessments | Classify all AI features against AI Act tiers now |
+| Post-quantum TLS migration deadline | 2028–2030 | Browser vendors will drop classical-only TLS connections | Begin TLS agility assessment; test hybrid key exchange |
+| Mandatory SBOM + build provenance (US EO 14028 / EU CRA) | 2025–2026 (active) | SBOM and SLSA attestation are becoming legally required | Achieve SLSA L2 minimum; generate CycloneDX SBOM per release |
+## §DETECTION-GAP
+What current security monitoring CANNOT detect in this domain, and what to build to close each gap.
+**Standard gaps that MUST be checked:**
+- **Second-order attack execution**: The storage request looks safe; only the retrieval+execution step is dangerous. Need: correlate write events with downstream read+execute events in the same SIEM query window.
+- **Timing-side-channel leakage**: No log event emitted; only observable as microsecond response-time variance. Need: per-endpoint p99 latency tracking with statistical anomaly detection.
+- **Low-and-slow credential stuffing**: Individually, each request is under rate limits. Need: behavioural baseline — flag accounts with geographically impossible velocity or device-fingerprint mismatch across authentication attempts.
+- **Insider exfiltration via legitimate process**: Authorised exports, reports, and data downloads that individually are permitted but collectively constitute data exfiltration. Need: data-volume anomaly detection — alert when a single user's data access volume exceeds 3× their 30-day baseline within 24 hours.
+- **Cross-agent attack chains**: Phase 1 finding A + Phase 1 finding B = CRITICAL chain invisible to either agent alone. Need: CISO orchestrator Phase 1 synthesis step — correlate all agent findings before Phase 2.
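
The data-volume rule in the insider-exfiltration item above can be written as a small check. This sketch assumes "baseline" means the mean daily volume over the most recent 30 days, which the text does not specify; `exceedsBaseline` and its parameters are hypothetical.

```typescript
// dailyBytes: one entry per day, oldest first. Only the most recent 30 days
// are used as the baseline window.
function exceedsBaseline(
  dailyBytes: number[],
  last24hBytes: number,
  multiplier = 3,
): boolean {
  if (dailyBytes.length === 0) {
    return false; // no baseline yet; handle new users with a separate rule
  }
  const window = dailyBytes.slice(-30);
  const mean = window.reduce((sum, bytes) => sum + bytes, 0) / window.length;
  return last24hBytes > multiplier * mean;
}

// Example: a user who normally moves about 100 MB per day.
const history = Array.from({ length: 30 }, () => 100_000_000);
console.log(exceedsBaseline(history, 250_000_000)); // false
console.log(exceedsBaseline(history, 350_000_000)); // true
```

A mean-based baseline is easy to reason about but sensitive to outliers; a median or percentile baseline may suit bursty workloads better.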
+## §ZERO-MISS-MANDATE
+This agent CANNOT declare any attack class clean without explicit evidence of checking. For each item, output one of:
+- `CHECKED: [N files] | [patterns used] | CLEAN`
+- `CHECKED: [N files] | [patterns used] | [N findings, all fixed]`
+- `SKIPPED: [reason — must be "not applicable: [evidence]"]`
+**Silent skip = FAILED COVERAGE.** The orchestrator flags this as a quality gap.
+The output findings JSON MUST include a `coverageManifest` key:
+```json
+{
+  "coverageManifest": {
+    "attackClassesCovered": [{ "class": "SQL Injection", "filesReviewed": 47, "patterns": ["queryRaw", "string concat"], "result": "CLEAN" }],
+    "filesReviewed": 47,
+    "negativeAssertions": ["SQL Injection: queryRaw pattern searched across 47 files — 0 matches"],
+    "uncoveredReason": {}
+  }
+}
+```

package/skills/cert-pin-rotation-specialist/SKILL.md CHANGED

@@ -198,3 +198,115 @@ export async function fetchPinUpdate(): Promise<string[] | null> {
 - `requiredActions`: ordered action list
 - `complianceImpact`: framework mappings
 - `beyondSkillMd`: true if finding goes beyond the SKILL.md mandate
+Every findings JSON MUST include `intelligenceForOtherAgents`:
+```json
+{
+  "intelligenceForOtherAgents": {
+    "forPentestTeam": [{ "type": "HIGH_VALUE_TARGET", "description": "App falls back to no pinning on pin-fetch failure — MitM window open during remote-config fetch", "exploitHint": "Block config.yourdomain.com at network layer; client reverts to no-pin mode" }],
+    "forCryptoSpecialist": [{ "type": "CRYPTO_WEAKNESS_REFERENCE", "algorithm": "SHA-1 certificate fingerprint used as pin (not SPKI SHA-256)", "location": "android/res/xml/network_security_config.xml" }],
+    "forCloudSpecialist": [{ "type": "SSRF_TO_CLOUD_CHAIN", "ssrfLocation": "Remote pin-config fetch URL is user-controllable", "escalationPath": "Attacker supplies internal metadata endpoint as config URL; server fetches and returns cloud credentials" }],
+    "forComplianceGrc": [{ "type": "COMPLIANCE_BLOCKER", "frameworks": ["PCI-DSS 4.2.1", "NIST SP 800-53 SC-17"], "releaseBlock": true }]
+  }
+}
+```
+---
+## BEYOND SKILL.MD — MANDATORY EXPANSIONS
+- **AI-Assisted Pin-Bypass Script Generation (ATT&CK T1557: Adversary-in-the-Middle):** LLM-powered tools (e.g., Frida-AI wrappers seen in 2024 red-team toolkits) analyze an APK's OkHttp or TrustKit configuration at runtime and auto-generate a Frida hook script tailored to that app's specific pin-check method signature, bypassing pinning without touching network traffic. Test by: run `objection -g <package> explore --startup-command "android sslpinning disable"` on a debug build and verify the app refuses the MITM certificate anyway; then attempt the AI-generated hook against the release build and confirm certificate pinning still blocks the connection. Finding threshold: any release build where a generic objection script or auto-generated Frida hook bypasses pinning without requiring app-specific reverse engineering.
+- **Shortened Certificate Lifetimes (CA/B Forum SC-081v3, phased from 2026):** CA/Browser Forum ballot SC-081v3 reduces the maximum TLS certificate lifetime in stages (200 days from March 2026, 100 days from March 2027, 47 days from March 2029), breaking rotation runbooks designed around 1–2-year certs. Apps that pin leaf SPKI hashes and rely on a 60-day pre-release update cycle will break whenever a renewal changes the key. Test by: simulate a short-lifetime rotation in staging: revoke the pinned cert, issue a new one, and measure the time from "backup pin shipped in app" to ">80% user adoption" via store analytics; if that window exceeds 30 days, the rotation model is broken. Finding threshold: any mobile app with an OTA pin-update path whose end-to-end propagation time exceeds 30 days, or any app without OTA rotation at all.
+- **Post-Quantum Harvest-Now-Decrypt-Later Against Pinned SPKI (NIST PQC FIPS 203/204):** Nation-state adversaries are capturing encrypted TLS sessions today with intent to decrypt when a cryptographically relevant quantum computer (CRQC) arrives (~2028–2032). SPKI pins based on RSA-2048 or P-256 public keys do not prevent harvest; they only authenticate the endpoint. Sessions pinned to a P-256 endpoint can still be captured and queued for CRQC decryption. Test by: inventory every pinned domain's current key algorithm via `openssl s_client -connect <host>:443 </dev/null 2>/dev/null | openssl x509 -noout -text | grep "Public Key Algorithm"`, flag all RSA and ECDSA (P-256/P-384) endpoints, and check whether an ML-KEM (FIPS 203) hybrid key exchange is negotiated in the TLS handshake. Finding threshold: any pinned production endpoint using RSA or classical ECDSA without a hybrid post-quantum key exchange scheduled for deployment before 2027.
+- **CT Log Rogue Certificate Issuance for Pinned Domains (CVE-2022-26923 — AD CS ESC1 variant / ATT&CK T1588.004):** An attacker who compromises an intermediate CA (or exploits a misconfigured Active Directory Certificate Services ESC1 template) can issue a certificate for a pinned domain. The pin rejects it at connection time on already-deployed clients, but newly installed app versions that shipped before the pin was added are silently vulnerable, and no server-side alert fires. Test by: set up a crt.sh webhook (via `https://crt.sh/atom?q=%.yourdomain.com`) or use the Google Certificate Transparency API to alert on newly logged certificates for all pinned domains; verify the alert fires within 1 hour of a test issuance. Finding threshold: any pinned domain with no CT log monitoring configured where unauthorized issuance would go undetected for more than 24 hours.
+- **Supply Chain Attack on Pin-Config Signing Key via Compromised CI/CD (ATT&CK T1195.002 — Compromise Software Supply Chain):** The OTA pin-config signing key is typically stored as a CI/CD secret (GitHub Actions, CircleCI). A supply-chain compromise of the CI environment (e.g., a malicious dependency in the build pipeline — see the 2024 `xz-utils` backdoor pattern, CVE-2024-3094) allows an attacker to exfiltrate the signing key and issue a fraudulent pin-config payload that pushes attacker-controlled pins to all live app clients. Test by: audit the signing key's storage location; verify it is stored in a hardware-backed secret store (AWS KMS, GCP KMS, or HashiCorp Vault with HSM backend) and that the CI pipeline never writes the raw private key to disk or logs; confirm key rotation has occurred at least once. Finding threshold: any OTA pin-config signing key stored as a plaintext CI secret or file on disk rather than in a KMS-backed store.
+- **SBOM/Compliance Gap — Undeclared CA Root and Config Signing Key Material (US EO 14028 / EU Cyber Resilience Act):** US Executive Order 14028 and the EU Cyber Resilience Act (CRA, effective 2027) require a Software Bill of Materials that includes all cryptographic key material and trust anchors used in a product. CA root SPKI hashes pinned in `network_security_config.xml` and OTA config signing key fingerprints are cryptographic trust anchors that must appear in the CycloneDX SBOM; their absence is a compliance blocker for US federal customers and EU market access. Test by: parse the app's CycloneDX SBOM (`cdxgen -o sbom.json .`) and verify that every SPKI hash present in pinning configs and every public key fingerprint used for config signature verification appears as a `cryptoMaterial` component in the SBOM; cross-reference against `network_security_config.xml` and `TrustKit` config entries. Finding threshold: any SPKI pin hash or signing key fingerprint present in source code that does not appear in the project's CycloneDX SBOM.
+---
+## §EDGE-CASE-MATRIX
+The 5 attack cases in certificate pinning and rotation that automated scanners and naive manual review universally miss. MANDATORY checks — do not skip.
+| # | Edge Case | Why Scanners Miss It | Concrete Test |
+|---|-----------|----------------------|---------------|
+| 1 | OTA pin-config fetch fails open — no pinning enforced | Static analysis sees `return null` and marks it as safe error handling; it does not model that `null` disables all pin checks | Block the remote config URL at the network layer; confirm the app still connects (should fail closed, not succeed) |
+| 2 | Leaf-certificate hash pinned instead of SPKI hash | Both look like SHA-256 base64 strings; scanners check presence of a hash, not which hash type it is | Re-run `openssl s_client` extraction using the certificate-fingerprint command (`-fingerprint -sha256`) vs. the SPKI path; compare — if they match there is no bug, if they differ and the code uses the fingerprint path, it will break on cert renewal |
+| 3 | Backup pin is a duplicate of the primary pin | Static analysis confirms two `<pin>` entries exist and marks the backup-pin requirement satisfied; it does not check value equality | Hash-compare all pin values in `network_security_config.xml` and TrustKit config; duplicate pins provide zero rotation headroom |
+| 4 | Root CA pin bypassed by intermediate CA cross-signed under attacker-controlled trust anchor | Pinning tools verify the chain against the device trust store; an attacker who controls a trust anchor on the device can issue a chain that passes the OS check before the pin is evaluated on older OkHttp versions | Test on a device with a custom CA installed; verify the app rejects the connection even when OS chain validation succeeds |
+| 5 | Pin expiration date is set but rotation runbook is never triggered — expiry silently passes in production | CI/CD pipelines do not parse `expiration` from XML and no calendar alert was created | Write a CI step or cron job that parses the `expiration` attribute and fails the build if it is within 60 days; confirm the alert fires in staging |
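
Rows 3 and 5 of the matrix above lend themselves to a CI lint step. The sketch below is an assumption-laden illustration: it uses regular expressions on a simple, well-formed `network_security_config.xml` rather than a real XML parser, and `extractPins`, `findDuplicatePins`, and `daysUntilExpiry` are hypothetical helper names.

```typescript
// Collects the values of all <pin digest="SHA-256">...</pin> elements.
function extractPins(xml: string): string[] {
  const pinPattern = /<pin\s+digest="SHA-256"\s*>([^<]+)<\/pin>/g;
  const pins: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pinPattern.exec(xml)) !== null) {
    pins.push((match[1] ?? "").trim());
  }
  return pins;
}

// Returns each pin value that appears more than once.
function findDuplicatePins(pins: string[]): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const pin of pins) {
    if (seen.has(pin)) duplicates.add(pin);
    seen.add(pin);
  }
  return Array.from(duplicates);
}

// Whole days from `now` until the pin-set expiration date (yyyy-MM-dd),
// or null when no expiration attribute is present.
function daysUntilExpiry(xml: string, now: Date): number | null {
  const match = /<pin-set[^>]*\sexpiration="(\d{4}-\d{2}-\d{2})"/.exec(xml);
  if (match === null) return null;
  const expiry = Date.parse(`${match[1]}T00:00:00Z`);
  return Math.floor((expiry - now.getTime()) / 86400000);
}

// Placeholder pin values; the backup pin duplicates the primary on purpose.
const sampleConfig = `
<network-security-config>
  <domain-config>
    <domain includeSubdomains="true">example.com</domain>
    <pin-set expiration="2026-03-01">
      <pin digest="SHA-256">AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=</pin>
      <pin digest="SHA-256">AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=</pin>
    </pin-set>
  </domain-config>
</network-security-config>`;
```

A CI step could fail the build when `findDuplicatePins(extractPins(xml))` is non-empty or when `daysUntilExpiry` returns a value of 60 or less, matching the thresholds in the table.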
+---
+## §TEMPORAL-THREATS
+Threats materialising in the 2025–2030 window that pinning and PKI lifecycle defences designed today must account for.
+| Threat | Est. Timeline | Relevance to Cert-Pin Rotation | Prepare Now By |
+|--------|--------------|-------------------------------|----------------|
+| Cryptographically Relevant Quantum Computer (CRQC) breaks RSA/ECDSA | 2028–2032 | SPKI pins are hashes of RSA/EC public keys; harvest-now-decrypt-later adversaries capture pinned TLS sessions today to decrypt when CRQC arrives | Inventory all pinned keys; flag RSA-2048 and P-256 endpoints for post-quantum migration; plan ML-KEM (FIPS 203) hybrid TLS rollout |
+| Shortened TLS certificate lifetimes (CA/B Forum SC-081v3: 200 days from 2026, 100 days from 2027, 47 days from 2029) | 2026–2029 (phased) | Planned rotation cycles designed around 1–2-year certs break; OTA rotation becomes mandatory, not optional | Shorten rotation runbook to a 60-day cycle or less; validate OTA pin-update path can complete a full rotation within 30 days end-to-end |
+| AI-assisted MitM tooling (LLM-generated per-target payloads) | 2025–2027 (active) | Attackers generate per-app bypass scripts that target the specific OTA config fetch pattern used; generic defences fail | Require HMAC-signed pin-config payloads with a server-side nonce; reject unsigned or replayed config responses |
+| Browser/OS removal of remaining SHA-1 certificate trust | 2026 | Apps still pinning whole-certificate fingerprints (not SPKI hashes) will start failing as intermediates are re-issued with stronger signature algorithms, because any re-issuance changes the certificate fingerprint even when the key is unchanged | Audit every pinned hash for type and algorithm; migrate all to SPKI SHA-256 |
+| Mandatory SBOM + build provenance (US EO 14028 / EU CRA) | 2025–2026 (active) | Pin-config signing keys and CA root certificates must appear in SBOM; undocumented key material is a compliance gap | Include CA root SPKI hashes and config signing key fingerprints in CycloneDX SBOM |
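The "HMAC-signed pin-config payloads with a nonce" preparation step from the table above might look roughly like this on the client side. It is a sketch under stated assumptions: the payload is assumed to be JSON with hypothetical `nonce` and `pins` fields, the names `sign_config` and `accept_pin_config` are invented for illustration, and replay detection is modelled as a set of nonces the client has already accepted.

```python
import hashlib
import hmac
import json


def sign_config(key: bytes, payload: bytes) -> str:
    """Hex HMAC-SHA256 over the exact payload bytes (what the server would send)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def accept_pin_config(key: bytes, payload: bytes, signature: str, seen_nonces: set) -> list:
    """Return the pin list only if the payload is signed and its nonce is new.

    Any failure raises ValueError so the caller keeps its current pins (fail closed).
    """
    if not signature:
        raise ValueError("unsigned pin config")
    expected = sign_config(key, payload)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        raise ValueError("bad signature")
    config = json.loads(payload)  # parse only after the signature checks out
    nonce = config["nonce"]  # hypothetical field name
    if nonce in seen_nonces:
        raise ValueError("replayed pin config")
    seen_nonces.add(nonce)
    return config["pins"]  # hypothetical field name
```

One design caveat: an HMAC key shipped inside a mobile app can be extracted, so a deployment may prefer an asymmetric signature with only the public key on the device. The verification flow (verify, then parse, then check replay) stays the same.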
+---
+## §DETECTION-GAP
+What current monitoring CANNOT detect in certificate-pinning and rotation, and what to build to close each gap.
+**Domain-specific gaps that MUST be checked:**
+- **Silent pin expiration in production**: The `expiration` attribute on `<pin-set>` in Android `network_security_config.xml` is parsed only at app startup on the device, and once the date passes Android stops enforcing the pins rather than failing the connection; no server-side event is emitted when a pin set expires. Need: a CI/CD step and an out-of-band cron job that parse expiration dates and page on-call at 90, 60, and 30 days before expiry.
+- **OTA config fetch returning stale or attacker-substituted pins**: The fetch succeeds with HTTP 200 and the app logs no error, but the returned pin set was served from a CDN cache poisoned days earlier. Need: pin the OTA config endpoint itself (meta-pinning) and include an `issuedAt` timestamp in the signed payload; reject responses older than 24 hours.
+- **Duplicate-pin false positive in backup-pin audit**: Automated pin-count checks report "2 pins present — compliant." They do not compare values. Need: a lint rule or pre-commit hook that asserts all pin values in a config file are unique.
+- **Certificate Transparency log divergence**: An unauthorized certificate for a pinned domain is issued by a rogue CA. The app's pin would reject it, but no alert fires because the attack is detected only at connection time on the device, not centrally. Need: CT log monitoring (e.g., periodic crt.sh queries or a dedicated CT monitoring service) alerting on any newly issued certificate for pinned domains.
+- **Cross-agent chain — OTA fetch SSRF + pin bypass**: The OTA config URL is partially user-controlled (SSRF) and the fetch-fail-open path is active. Phase 1 SSRF agent flags the SSRF; Phase 1 cert-pin agent flags the fail-open. Neither agent alone sees the critical chain. Need: CISO orchestrator Phase 1 synthesis to correlate both findings into a single CRITICAL escalation.
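The duplicate-pin gap in the list above is straightforward to lint for. The following is a minimal sketch of such a pre-commit check for the Android config format; `pin_set_problems` and the sample XML are hypothetical names and data introduced here, and an iOS or OTA JSON config would need its own parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical config that a naive "2 pins present" count would pass.
DUPLICATE_CONFIG = """<network-security-config>
  <domain-config>
    <domain>api.example.com</domain>
    <pin-set>
      <pin digest="SHA-256">AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=</pin>
      <pin digest="SHA-256">AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=</pin>
    </pin-set>
  </domain-config>
</network-security-config>
"""


def pin_set_problems(xml_text):
    """Return human-readable problems: duplicate pins or no distinct backup pin."""
    root = ET.fromstring(xml_text)
    problems = []
    for index, pin_set in enumerate(root.iter("pin-set")):
        pins = [(p.text or "").strip() for p in pin_set.findall("pin")]
        distinct = set(pins)
        if len(distinct) < len(pins):
            problems.append(f"pin-set {index}: duplicate pin values")
        if len(distinct) < 2:
            problems.append(f"pin-set {index}: fewer than 2 distinct pins (no backup)")
    return problems
```

Comparing values rather than counting entries is the whole point: the sample above contains two `<pin>` elements but only one distinct key, so it is reported twice, once as a duplicate and once as lacking a backup.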
+---
+## §ZERO-MISS-MANDATE
+This agent CANNOT declare any attack class clean without explicit evidence of checking. For each item, output one of:
+- `CHECKED: [N files] | [patterns used] | CLEAN`
+- `CHECKED: [N files] | [patterns used] | [N findings, all fixed]`
+- `SKIPPED: [reason — must be "not applicable: [evidence]"]`
+**Silent skip = FAILED COVERAGE.** The orchestrator flags this as a quality gap.
+**Mandatory attack classes for cert-pin-rotation-specialist:**
+| Class | Patterns to Search | Acceptable Skip Condition |
+|-------|--------------------|--------------------------|
+| Single pin (no backup) | Count of `<pin>` / `PublicKeyHashes` entries per domain | Not applicable only if project has zero network calls |
+| Leaf-cert hash vs. SPKI hash | Compare `openssl x509 -fingerprint` output vs. SPKI extraction output for each pinned value | Not applicable only if no TLS pinning code exists |
+| OTA fetch fail-open | Search for `return null` / `return []` / empty-catch in pin-fetch function | Not applicable only if no OTA rotation mechanism exists |
+| Expired or near-expiry pin set | Parse `expiration` from XML / `kTSKExpirationDate` from Swift config | Not applicable only if no expiration date field exists in config |
+| Unsigned or unverified OTA pin config | Look for missing `verifyConfigSignature` or equivalent before accepting fetched pins | Not applicable only if pins are never fetched remotely |
+The output findings JSON MUST include a `coverageManifest` key:
+```json
+{
+  "coverageManifest": {
+    "attackClassesCovered": [
+      { "class": "Single pin no backup", "filesReviewed": 3, "patterns": ["<pin>", "PublicKeyHashes"], "result": "CLEAN" },
+      { "class": "Leaf-cert hash vs. SPKI", "filesReviewed": 3, "patterns": ["openssl fingerprint vs spki extraction"], "result": "1 finding, fixed" },
+      { "class": "OTA fetch fail-open", "filesReviewed": 5, "patterns": ["return null", "catch {}"], "result": "CLEAN" },
+      { "class": "Expired or near-expiry pin set", "filesReviewed": 3, "patterns": ["expiration", "kTSKExpirationDate"], "result": "CLEAN" },
+      { "class": "Unsigned OTA pin config", "filesReviewed": 5, "patterns": ["verifyConfigSignature", "signature"], "result": "CLEAN" }
+    ],
+    "filesReviewed": 11,
+    "negativeAssertions": [
+      "OTA fetch fail-open: return-null and empty-catch patterns searched across 5 files — 0 unguarded paths",
+      "Unsigned OTA config: signature verification present in all remote-fetch paths"
+    ],
+    "uncoveredReason": {}
+  }
+}
+```
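An orchestrator could detect silent skips mechanically from this manifest. The sketch below is one possible check, not the package's own logic: it assumes class names match the `class` strings used in the JSON example above, and it assumes `uncoveredReason` maps a class name to its "not applicable" justification (the example only shows it empty). `MANDATORY_CLASSES` and `silent_skips` are hypothetical names.

```python
# Assumed to mirror the "class" strings in the example manifest.
MANDATORY_CLASSES = [
    "Single pin no backup",
    "Leaf-cert hash vs. SPKI",
    "OTA fetch fail-open",
    "Expired or near-expiry pin set",
    "Unsigned OTA pin config",
]


def silent_skips(findings, mandatory=MANDATORY_CLASSES):
    """Return mandatory classes that were neither checked nor explicitly skipped."""
    manifest = findings.get("coverageManifest", {})
    covered = {entry["class"] for entry in manifest.get("attackClassesCovered", [])}
    # Assumption: uncoveredReason is a {class name: reason} mapping.
    skipped = set(manifest.get("uncoveredReason", {}))
    return [name for name in mandatory if name not in covered and name not in skipped]
```

An empty return value means every mandatory class has either a `CHECKED` entry or a documented skip; anything else is the "FAILED COVERAGE" case. A findings document with no `coverageManifest` key at all returns the full mandatory list.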