security-mcp 1.1.4 → 1.3.3
- package/README.md +341 -1018
- package/defaults/checklists/ai.json +20 -1
- package/defaults/checklists/api.json +35 -1
- package/defaults/checklists/infra.json +34 -1
- package/defaults/checklists/mobile.json +23 -1
- package/defaults/checklists/payments.json +15 -1
- package/defaults/checklists/web.json +11 -1
- package/defaults/cloud-controls/aws.json +10712 -0
- package/defaults/cloud-controls/azure.json +7201 -0
- package/defaults/cloud-controls/gcp.json +4061 -0
- package/defaults/control-catalog.json +24 -0
- package/defaults/security-policy.json +2 -2
- package/dist/ci/pr-gate.js +22 -5
- package/dist/cli/index.js +73 -2
- package/dist/cli/install.js +4 -55
- package/dist/cli/onboarding.js +18 -10
- package/dist/gate/baseline.js +82 -7
- package/dist/gate/catalog.js +10 -2
- package/dist/gate/checks/agentic-instructions.js +515 -0
- package/dist/gate/checks/ai-governance.js +132 -0
- package/dist/gate/checks/ai.js +757 -39
- package/dist/gate/checks/auth-deep.js +920 -216
- package/dist/gate/checks/business-logic.js +751 -0
- package/dist/gate/checks/ci-pipeline.js +399 -4
- package/dist/gate/checks/cloud-controls.js +69 -0
- package/dist/gate/checks/crypto.js +423 -2
- package/dist/gate/checks/data-platform.js +954 -0
- package/dist/gate/checks/dependencies.js +582 -15
- package/dist/gate/checks/docker-deep.js +1236 -0
- package/dist/gate/checks/gitops.js +724 -0
- package/dist/gate/checks/graphql.js +201 -19
- package/dist/gate/checks/iac.js +1230 -0
- package/dist/gate/checks/infra.js +246 -1
- package/dist/gate/checks/injection-deep.js +827 -184
- package/dist/gate/checks/k8s.js +955 -2
- package/dist/gate/checks/mobile-android.js +917 -3
- package/dist/gate/checks/mobile-ios.js +797 -5
- package/dist/gate/checks/required-artifacts.js +194 -0
- package/dist/gate/checks/runtime.js +178 -0
- package/dist/gate/checks/secrets.js +256 -13
- package/dist/gate/checks/supply-chain-deep.js +787 -0
- package/dist/gate/checks/web-nextjs.js +572 -48
- package/dist/gate/cloud-controls/apply.js +115 -0
- package/dist/gate/cloud-controls/bicep.js +36 -0
- package/dist/gate/cloud-controls/cfn.js +125 -0
- package/dist/gate/cloud-controls/detect.js +104 -0
- package/dist/gate/cloud-controls/hcl.js +140 -0
- package/dist/gate/cloud-controls/types.js +87 -0
- package/dist/gate/diff.js +17 -5
- package/dist/gate/evidence.js +8 -1
- package/dist/gate/exceptions.js +202 -9
- package/dist/gate/findings.js +15 -2
- package/dist/gate/policy.js +316 -130
- package/dist/gate/threat-intel.js +6 -0
- package/dist/mcp/audit-chain.js +131 -28
- package/dist/mcp/auth.js +169 -0
- package/dist/mcp/learning.js +129 -4
- package/dist/mcp/model-router.js +161 -24
- package/dist/mcp/orchestration.js +377 -89
- package/dist/mcp/server.js +460 -69
- package/dist/mcp/tool-audit.js +193 -0
- package/dist/repo/fs.js +37 -1
- package/dist/repo/search.js +31 -6
- package/dist/review/store.js +56 -3
- package/dist/tests/run.js +124 -1
- package/package.json +9 -9
- package/skills/_TEMPLATE/SKILL.md +99 -0
- package/skills/advanced-dos-tester/SKILL.md +118 -0
- package/skills/agentic-instruction-auditor/SKILL.md +111 -0
- package/skills/agentic-loop-exploiter/SKILL.md +377 -0
- package/skills/ai-llm-redteam/SKILL.md +113 -0
- package/skills/ai-model-supply-chain-agent/SKILL.md +112 -0
- package/skills/algorithm-implementation-reviewer/SKILL.md +107 -0
- package/skills/android-penetration-tester/SKILL.md +464 -46
- package/skills/anti-replay-tester/SKILL.md +115 -0
- package/skills/appsec-code-auditor/SKILL.md +94 -0
- package/skills/artifact-integrity-analyst/SKILL.md +450 -0
- package/skills/attack-navigator/SKILL.md +476 -8
- package/skills/auth-session-hacker/SKILL.md +111 -0
- package/skills/aws-penetration-tester/SKILL.md +510 -0
- package/skills/azure-penetration-tester/SKILL.md +542 -3
- package/skills/binary-auth-validator/SKILL.md +120 -0
- package/skills/bot-detection-specialist/SKILL.md +118 -0
- package/skills/business-logic-attacker/SKILL.md +240 -0
- package/skills/capec-code-mapper/SKILL.md +93 -0
- package/skills/cert-pin-rotation-specialist/SKILL.md +121 -0
- package/skills/cicd-pipeline-hijacker/SKILL.md +414 -0
- package/skills/ciso-orchestrator/SKILL.md +465 -43
- package/skills/cloud-infra-specialist/SKILL.md +127 -0
- package/skills/compliance-gap-analyst/SKILL.md +431 -0
- package/skills/compliance-grc/SKILL.md +94 -0
- package/skills/compliance-lifecycle-tracker/SKILL.md +93 -0
- package/skills/container-hardening-auditor/SKILL.md +125 -0
- package/skills/credential-stuffing-specialist/SKILL.md +111 -0
- package/skills/crypto-pki-specialist/SKILL.md +96 -0
- package/skills/csa-ccm-mapper/SKILL.md +93 -0
- package/skills/csf2-governance-mapper/SKILL.md +93 -0
- package/skills/data-platform-auditor/SKILL.md +125 -0
- package/skills/deep-link-fuzzer/SKILL.md +118 -0
- package/skills/dependency-confusion-attacker/SKILL.md +424 -0
- package/skills/device-integrity-aggregator/SKILL.md +117 -0
- package/skills/dos-resilience-tester/SKILL.md +106 -0
- package/skills/dread-scorer/SKILL.md +93 -0
- package/skills/egress-policy-enforcer/SKILL.md +108 -0
- package/skills/evidence-collector/SKILL.md +107 -0
- package/skills/file-upload-attacker/SKILL.md +118 -0
- package/skills/gcp-penetration-tester/SKILL.md +510 -2
- package/skills/git-history-secret-scanner/SKILL.md +115 -0
- package/skills/gitops-delivery-auditor/SKILL.md +120 -0
- package/skills/iac-security-auditor/SKILL.md +125 -0
- package/skills/iam-privesc-graph-builder/SKILL.md +161 -0
- package/skills/incident-responder/SKILL.md +120 -0
- package/skills/injection-specialist/SKILL.md +111 -0
- package/skills/ios-security-auditor/SKILL.md +291 -0
- package/skills/json-ambiguity-tester/SKILL.md +145 -0
- package/skills/k8s-container-escaper/SKILL.md +406 -0
- package/skills/key-management-lifecycle-analyst/SKILL.md +107 -0
- package/skills/kill-switch-engineer/SKILL.md +111 -0
- package/skills/linddun-privacy-analyst/SKILL.md +111 -0
- package/skills/logic-race-fuzzer/SKILL.md +452 -0
- package/skills/mobile-api-network-attacker/SKILL.md +430 -0
- package/skills/mobile-binary-hardener/SKILL.md +111 -0
- package/skills/mobile-security-specialist/SKILL.md +94 -0
- package/skills/mobile-webview-auditor/SKILL.md +105 -0
- package/skills/model-extraction-attacker/SKILL.md +228 -0
- package/skills/multipart-abuse-tester/SKILL.md +93 -0
- package/skills/oauth-pkce-specialist/SKILL.md +113 -0
- package/skills/parser-exhaustion-tester/SKILL.md +151 -0
- package/skills/pentest-infra/SKILL.md +107 -0
- package/skills/pentest-social/SKILL.md +210 -0
- package/skills/pentest-team/SKILL.md +96 -0
- package/skills/pentest-web-api/SKILL.md +107 -0
- package/skills/privacy-flow-analyst/SKILL.md +243 -0
- package/skills/prompt-injection-specialist/SKILL.md +403 -0
- package/skills/quantum-migration-planner/SKILL.md +105 -0
- package/skills/rag-poisoning-specialist/SKILL.md +367 -0
- package/skills/registry-mirror-enforcer/SKILL.md +93 -0
- package/skills/rotation-validation-agent/SKILL.md +121 -0
- package/skills/samm-assessor/SKILL.md +94 -0
- package/skills/secrets-mask-bypass-tester/SKILL.md +109 -0
- package/skills/senior-security-engineer/SKILL.md +178 -0
- package/skills/serialization-memory-attacker/SKILL.md +341 -0
- package/skills/session-timeout-tester/SKILL.md +170 -0
- package/skills/slsa-level3-enforcer/SKILL.md +121 -0
- package/skills/slsa-provenance-enforcer/SKILL.md +111 -0
- package/skills/ssrf-detection-validator/SKILL.md +117 -0
- package/skills/step-up-auth-enforcer/SKILL.md +93 -0
- package/skills/stride-pasta-analyst/SKILL.md +429 -0
- package/skills/supply-chain-devsecops/SKILL.md +107 -0
- package/skills/threat-infrastructure-analyst/SKILL.md +93 -0
- package/skills/threat-modeler/SKILL.md +94 -0
- package/skills/tls-certificate-auditor/SKILL.md +582 -18
- package/skills/token-reuse-detector/SKILL.md +104 -0
- package/skills/trike-risk-modeler/SKILL.md +93 -0
- package/skills/unicode-homograph-tester/SKILL.md +93 -0
- package/skills/waf-rule-lifecycle-agent/SKILL.md +106 -0
- package/skills/webhook-security-tester/SKILL.md +111 -0
- package/skills/zero-trust-architect/SKILL.md +118 -0
|
@@ -23,6 +23,15 @@ Model realistic social engineering threats and insider risk scenarios based on t
|
|
|
23
23
|
team, secrets, and access patterns found in this project. Write mitigations that reduce
|
|
24
24
|
the blast radius of human compromise.
|
|
25
25
|
|
|
26
|
+
## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
|
|
27
|
+
|
|
28
|
+
The full suite of detection modules in `src/gate/checks/` (especially `secrets`, `ci-pipeline`, and `auth-deep.ts`) is your access map, not your ceiling — read it to learn what a compromised human actually controls. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
|
|
29
|
+
|
|
30
|
+
- **Cross-file / multi-step reasoning the regex can't do:** turn a `secrets`/`ci-pipeline` finding into a human-factor kill chain — an engineer whose token unlocks the CI secret store can, via one spear-phish, reach prod deploy creds and the customer datastore; map the blast radius of each named role across every surface the suite touches.
|
|
31
|
+
- **Semantic / effective-state analysis:** decide whether MFA, least-privilege, secret scoping, and offboarding are *effectively* enforced for real humans, not just configured — a break-glass account with a shared password or a PAT that outlives the contractor is the actual exploit.
|
|
32
|
+
- **External corroboration:** WebSearch/WebFetch for OSINT on the project and team (public repos, leaked creds, social profiles) and current phishing/insider-threat TTPs (MITRE ATT&CK Initial Access).
|
|
33
|
+
- **Apply & prove:** write the mitigation inline (tighter CI secret scope, MFA enforcement, allowlist logging), re-run the relevant `src/gate/checks/` modules as a regression floor, then re-audit the blast radius. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default.
|
|
34
|
+
|
|
26
35
|
## EXECUTION
|
|
27
36
|
|
|
28
37
|
1. **OSINT on the project (authorized pre-engagement reconnaissance):**
|
|
@@ -70,3 +79,204 @@ If internet permitted:
|
|
|
70
79
|
- Blast radius of successful compromise
|
|
71
80
|
- Detection gap (what monitoring would NOT catch this)
|
|
72
81
|
- Mitigation control implemented or recommended
|
|
82
|
+
|
|
83
|
+
Every findings JSON MUST include `intelligenceForOtherAgents`:
|
|
84
|
+
```json
{
  "intelligenceForOtherAgents": {
    "forPentestTeam": [{ "type": "HIGH_VALUE_TARGET", "description": "...", "exploitHint": "..." }],
    "forCryptoSpecialist": [{ "type": "CRYPTO_WEAKNESS_REFERENCE", "algorithm": "...", "location": "..." }],
    "forCloudSpecialist": [{ "type": "SSRF_TO_CLOUD_CHAIN", "ssrfLocation": "...", "escalationPath": "..." }],
    "forComplianceGrc": [{ "type": "COMPLIANCE_BLOCKER", "frameworks": ["..."], "releaseBlock": true }]
  }
}
```
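A minimal sketch of how an orchestrator might check that a findings document carries this block. It assumes all four audience keys shown above are required and that each entry needs a string `type`; neither rule is stated in the skill, and the function name is hypothetical.

```python
import json

# Keys copied from the example above. Treating all four as mandatory is an
# assumption made for illustration.
REQUIRED_AUDIENCES = (
    "forPentestTeam",
    "forCryptoSpecialist",
    "forCloudSpecialist",
    "forComplianceGrc",
)


def missing_intelligence_fields(findings_json: str) -> list[str]:
    """Return a list of problems; an empty list means the block looks complete."""
    doc = json.loads(findings_json)
    intel = doc.get("intelligenceForOtherAgents")
    if not isinstance(intel, dict):
        return ["intelligenceForOtherAgents is missing or not an object"]
    problems = []
    for key in REQUIRED_AUDIENCES:
        entries = intel.get(key)
        if not isinstance(entries, list):
            problems.append(f"{key} is missing or not a list")
            continue
        for i, entry in enumerate(entries):
            # Every entry in the example carries a string "type" discriminator.
            if not isinstance(entry, dict) or not isinstance(entry.get("type"), str):
                problems.append(f"{key}[{i}] has no string 'type'")
    return problems
```

An empty audience list passes this check; whether that should be allowed is a policy decision the skill leaves open.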
|
|
94
|
+
|
|
95
|
+
---
|
|
96
|
+
|
|
97
|
+
## BEYOND SKILL.MD — MANDATORY EXPANSIONS
|
|
98
|
+
|
|
99
|
+
### 1. Vishing & Smishing Against Developer Personas (Post-2024 AI-Assisted)
|
|
100
|
+
**Technique**: AI-cloned voice calls impersonating the IT helpdesk or CISO, requesting OTP read-back or a VPN credential reset. Real-world precedent: the MGM Resorts breach (2023) reportedly used a 10-minute social engineering call to reset Okta credentials.
|
|
101
|
+
**Test**: Enumerate on-call rotation from PagerDuty webhook configs or GitHub action secrets. Check if voice phishing playbooks exist in `docs/security/` or runbooks. Verify MFA policy enforces FIDO2 (phishing-resistant) rather than TOTP or SMS.
|
|
102
|
+
**Finding**: Any production access path protected only by SMS OTP or TOTP is exploitable via real-time phishing proxy (Evilginx2, Modlishka).
|
|
103
|
+
|
|
104
|
+
### 2. Adversarial ML Prompt Injection via Phishing Lure (2025 Threat — AI-Assisted Attacks)
|
|
105
|
+
**Technique**: Attacker crafts a document or email containing hidden prompt-injection payloads targeting AI coding assistants (GitHub Copilot, Cursor, Claude Code) used by the development team. The injected instruction appears in a README, PR description, or support ticket and coerces the AI to suggest malicious code changes. See research: "Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection" (Greshake et al., 2023), now operationalized by threat actors in 2025.
|
|
106
|
+
**Test**: Search for AI assistant configuration files (`.github/copilot-instructions.md`, `.cursorrules`, `.claude/CLAUDE.md`). Verify no external content (issue bodies, PR descriptions) is fed unsanitized into AI assistant system prompts. Test whether the AI assistant can be induced to commit unauthorized code by embedding instructions in a crafted source file.
|
|
107
|
+
**Finding**: If AI assistant instructions are loaded from repo-writable paths without integrity checks, an attacker with PR access can manipulate AI-assisted code review for all engineers on the team.
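One way to approximate the integrity check is to confirm that each AI instruction file is covered by a CODEOWNERS rule with at least one owner. The sketch below is illustrative only: the matcher is a deliberate simplification of CODEOWNERS semantics (it ignores last-match-wins ordering and lets `*` cross directory boundaries), and the function names are hypothetical.

```python
from fnmatch import fnmatch

# Paths named in the Test step above.
AI_INSTRUCTION_FILES = (
    ".cursorrules",
    ".github/copilot-instructions.md",
    ".claude/CLAUDE.md",
)


def codeowners_patterns(text: str) -> list[str]:
    """Extract patterns that have at least one owner assigned."""
    patterns = []
    for line in text.splitlines():
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 2:  # pattern plus at least one owner
            patterns.append(parts[0])
    return patterns


def is_covered(path: str, pattern: str) -> bool:
    """Simplified match; NOT the full gitignore-style CODEOWNERS semantics."""
    pat = pattern.lstrip("/")
    if pat == "*":
        return True
    if pat.endswith("/"):  # directory pattern covers everything beneath it
        return path.startswith(pat)
    return path == pat or fnmatch(path, pat)


def unprotected_ai_files(codeowners_text: str) -> list[str]:
    """Return the instruction files that no owned pattern covers."""
    patterns = codeowners_patterns(codeowners_text)
    return [
        path for path in AI_INSTRUCTION_FILES
        if not any(is_covered(path, pat) for pat in patterns)
    ]
```

CODEOWNERS coverage only helps if branch protection also requires code owner review, so a clean result here is necessary but not sufficient.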
|
|
108
|
+
|
|
109
|
+
### 3. CI/CD Pipeline Poisoning via Dependency Confusion (Supply Chain Social Engineering)
|
|
110
|
+
**Technique**: Register a public npm/PyPI/RubyGems package with the same name as an internal private package, triggering automatic installation by developers who run `npm install` on a cloned repo (weakness class: CWE-427, Uncontrolled Search Path Element). Typosquatting variant: `lodahs` for `lodash`.
|
|
111
|
+
**Test**: Extract all package names from `package.json`, `requirements.txt`, `Gemfile`. Query npm registry API for each: `GET https://registry.npmjs.org/<package-name>`. Flag any internal package name that resolves to a public package not owned by the organization. Run: `grep -r "registry" .npmrc .yarnrc.yml` to verify private registry is pinned.
|
|
112
|
+
**Finding**: Any package name resolvable on the public registry that is intended as internal = HIGH. Exploitation requires only registering the package with a higher version number.
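The registry lookup described in the Test step could be automated roughly as follows. This is a sketch under stated assumptions: `internal_names` is a hypothetical list the operator supplies (the skill does not say how internal packages are identified), only npm is covered, and ownership of a public package is not verified.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def public_npm_exists(name: str) -> bool:
    """True if the public npm registry serves metadata for this name."""
    # Scoped names such as @org/pkg must have the slash percent-encoded.
    url = "https://registry.npmjs.org/" + urllib.parse.quote(name, safe="@")
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise


def confusable_packages(package_json_text, internal_names, exists=public_npm_exists):
    """Return unscoped internal dependency names that also resolve publicly."""
    manifest = json.loads(package_json_text)
    declared = set()
    for section in ("dependencies", "devDependencies"):
        declared.update(manifest.get(section, {}))
    return sorted(
        name for name in declared
        if name in internal_names and not name.startswith("@") and exists(name)
    )
```

The `exists` parameter is injectable so the logic can be exercised offline; in a real run the default performs one HTTPS request per internal name.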
|
|
113
|
+
|
|
114
|
+
### 4. GitHub Token Exfiltration via Malicious GitHub Action (OIDC Abuse)
|
|
115
|
+
**Technique**: A contributor submits a pull request that modifies a workflow file to exfiltrate `GITHUB_TOKEN` or OIDC tokens to an external endpoint. The PR appears to add logging or testing improvements.
|
|
116
|
+
**Test**: Audit all `.github/workflows/*.yml` for `pull_request_target` triggers (runs with write token on PR from fork). Check `permissions:` blocks — any `id-token: write` combined with unvalidated external action references (`uses: some-unverified-action@main`) enables OIDC token theft. Run: `grep -r "pull_request_target" .github/workflows/`.
|
|
117
|
+
**Finding**: A workflow with `pull_request_target` that checks out or runs PR head code and has no `if: github.event.pull_request.head.repo.full_name == github.repository` guard allows a forked PR to execute with the repo's `GITHUB_TOKEN` and secrets. Blast radius: write access to branches, packages, and deployments, bounded only by the workflow's `permissions:` block.
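The grep-based audit can be extended to also test for the head-repo guard. The following is a text-level sketch, not a YAML-aware parser: it would accept a guard that appears only in a comment and it does not check whether the guard applies to every job. Function names are hypothetical.

```python
from pathlib import Path

GUARD = "github.event.pull_request.head.repo.full_name == github.repository"


def uses_pull_request_target(text: str) -> bool:
    """True if the trigger name appears outside a YAML comment."""
    return any(
        "pull_request_target" in line.split("#", 1)[0]
        for line in text.splitlines()
    )


def has_guard(text: str) -> bool:
    """Whitespace-insensitive search for the head-repo guard expression."""
    return "".join(GUARD.split()) in "".join(text.split())


def unguarded_workflows(workflows: dict[str, str]) -> list[str]:
    """Names of workflows that use the trigger but never mention the guard."""
    return sorted(
        name for name, text in workflows.items()
        if uses_pull_request_target(text) and not has_guard(text)
    )


def load_workflows(root: str = ".") -> dict[str, str]:
    """Read every .yml/.yaml file under .github/workflows, if it exists."""
    directory = Path(root) / ".github" / "workflows"
    if not directory.is_dir():
        return {}
    return {
        str(path): path.read_text(encoding="utf-8")
        for path in sorted(directory.iterdir())
        if path.suffix in (".yml", ".yaml")
    }
```

Typical use would be `unguarded_workflows(load_workflows("."))`, with each hit then reviewed by hand to see whether PR head code is actually checked out.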
|
|
118
|
+
|
|
119
|
+
### 5. Watering Hole Attack via Developer Tool Ecosystem
|
|
120
|
+
**Technique**: Attacker compromises a community tool used by the target development team (VS Code extension, Homebrew formula, JetBrains plugin). Security researcher proof-of-concept: malicious VS Code extension with 100K+ downloads (2023). Post-2024: AI coding assistant plugins as high-value watering holes due to broad code access.
|
|
121
|
+
**Test**: Enumerate installed VS Code extensions from `extensions.json` or `.vscode/extensions.json` in the repo. Check publisher verification and download counts. Any extension from an unverified publisher with filesystem or network access = risk. Run: `grep -r "recommendations" .vscode/`.
|
|
122
|
+
**Finding**: VS Code has no per-extension permission model, so every installed extension runs with the user's full privileges. An unverified extension can read the filesystem, execute commands, and exfiltrate entire local repositories, including secrets cached in dotfiles.
|
|
123
|
+
|
|
124
|
+
### 6. Lure-Based Credential Harvesting via OAuth App Consent Attack
|
|
125
|
+
**Technique**: Attacker registers a malicious OAuth application with a convincing name (e.g., "GitHub Security Audit Tool") and sends the authorization link to developers via Slack, email, or GitHub issue. Upon consent, the attacker receives an OAuth token with the granted scopes, potentially including `repo` (read/write access to repositories) or `read:org`.
|
|
126
|
+
**Test**: Review the GitHub organization's OAuth app audit log. Check if `org.oauth_application.added` events are monitored in SIEM. Verify organization policy enforces OAuth app approval by admins (`Settings > Third-party access > OAuth App policy`). Test by listing app installations visible to the current token: `gh api /user/installations` (this covers GitHub Apps; authorized OAuth apps must be reviewed in the organization's third-party access settings).
|
|
127
|
+
**Finding**: If the GitHub organization allows any OAuth app without admin pre-approval, a phished developer grants repo write access to an attacker without any credential theft.
|
|
128
|
+
|
|
129
|
+
### 7. Pretexting via Internal Tooling Impersonation (Slack/Teams Webhook Abuse)
|
|
130
|
+
**Technique**: Attacker exploits an exposed or leaked incoming webhook URL for Slack/Teams to send messages appearing to originate from official internal channels (e.g., "#security-alerts", "#deployments"). The message instructs developers to rotate a secret by visiting a phishing URL. MITRE ATT&CK: T1566.002 (Spearphishing Link) and T1566.003 (Spearphishing via Service).
|
|
131
|
+
**Test**: Search codebase for hardcoded webhook URLs: `grep -rE "https://hooks\\.slack\\.com|https://[a-z]+\\.webhook\\.office\\.com" . --include="*.js" --include="*.ts" --include="*.env*" --include="*.yml"`. Any committed webhook URL is exploitable by anyone who reads the repo (including via public git history).
|
|
132
|
+
**Finding**: A leaked Slack incoming webhook URL enables unlimited impersonation of internal security communications without authentication. Severity: HIGH if the workspace lacks verified sender indicators.
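A scanner along these lines can report the file line on which a webhook URL or Slack token appears. The regular expressions are assumptions modelled on the grep patterns used in this skill, slightly widened to allow digits and hyphens in the Teams tenant name; they are not an exhaustive secret-detection rule set.

```python
import re

# Patterns modelled on the grep commands in this skill (assumed, not exhaustive).
PATTERNS = {
    "webhook": re.compile(
        r"https://hooks\.slack\.com/services/[A-Za-z0-9/]+"
        r"|https://[a-z0-9-]+\.webhook\.office\.com/\S+"
    ),
    "token": re.compile(r"\bxox[bps]-[A-Za-z0-9-]+"),
}


def find_webhook_leaks(text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, kind, matched_text) for each hit, in file order."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                hits.append((line_no, kind, match.group(0)))
    return hits
```

This only inspects text handed to it; covering deleted secrets still requires a history-aware tool such as the `trufflehog git file://.` run named in the checklist.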
|
|
133
|
+
|
|
134
|
+
### 8. AI-Generated Deepfake Document Phishing (2025 Active Threat)
|
|
135
|
+
**Technique**: LLM-generated spear-phishing emails and documents tailored to the target using OSINT data scraped from GitHub profiles, commit messages, and LinkedIn. Quality now indistinguishable from genuine communication. Paired with AI voice cloning for follow-up "verification" calls. Observed in wild against software teams since Q1 2025.
|
|
136
|
+
**Test**: Assess whether team has security awareness training covering AI-generated phishing indicators. Check if email gateway enforces DMARC/DKIM/SPF for all outbound domains associated with the project. Run: `dig TXT <project-domain> | grep "v=spf"`. Verify DMARC policy is `p=reject`, not `p=none`.
|
|
137
|
+
**Finding**: If project domain lacks `p=reject` DMARC, attackers can send emails that pass spam filters appearing to originate from `@<project-domain>` addresses, targeting both team members and customers with AI-personalized content.
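Once the TXT record has been fetched (for example with `dig TXT _dmarc.<domain>`), the policy tag can be extracted mechanically. This is a small parsing sketch with a hypothetical function name; it does not perform the DNS lookup and ignores the `sp=` subdomain policy tag.

```python
from typing import Optional


def dmarc_policy(txt_record: str) -> Optional[str]:
    """Return the p= tag of a DMARC TXT record, or None if it is not DMARC."""
    tags = {}
    # dig prints TXT data in double quotes, so strip them before splitting.
    for part in txt_record.strip().strip('"').split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip().lower()
    if tags.get("v") != "dmarc1":
        return None
    return tags.get("p")


def enforces_reject(txt_record: Optional[str]) -> bool:
    """True only when a DMARC record is present and its policy is reject."""
    return bool(txt_record) and dmarc_policy(txt_record) == "reject"
```

A missing record and `p=none` both return `False` from `enforces_reject`, matching the finding above; how to rate `p=quarantine` is left to the operator.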
|
|
138
|
+
|
|
139
|
+
---
|
|
140
|
+
|
|
141
|
+
## §PENTEST_SOCIAL-CHECKLIST
|
|
142
|
+
|
|
143
|
+
1. **Phishing-resistant MFA enforcement**: Verify all accounts with production, CI/CD, or secrets access require FIDO2/WebAuthn (passkeys or hardware tokens). Mechanism: check IdP policy (Okta, Azure AD, Google Workspace MFA settings). Finding: any admin or deployer account accepting SMS OTP or TOTP = HIGH.
|
|
144
|
+
|
|
145
|
+
2. **GitHub organization OAuth app policy**: Confirm `Settings > Third-party access` requires admin approval for all OAuth apps. Mechanism: `gh api /orgs/<org>` and review `two_factor_requirement_enabled`, `members_can_create_public_repositories`; the OAuth app policy itself must be confirmed in the organization settings UI. Finding: any org without required admin pre-approval for OAuth apps = MEDIUM.
|
|
146
|
+
|
|
147
|
+
3. **CODEOWNERS blast radius mapping**: Map every engineer listed in CODEOWNERS to their other access (cloud IAM roles, npm publish rights, Kubernetes RBAC). Mechanism: read `.github/CODEOWNERS`; cross-reference with AWS/GCP IAM user lists if accessible. Finding: a single engineer with CODEOWNERS approval authority AND unrestricted cloud IAM = HIGH lateral movement risk on account compromise.
|
|
148
|
+
|
|
149
|
+
4. **Secrets in git history (retrospective)**: Run `git log --all --full-history -- "**/*.env"` and `trufflehog git file://.` against the full repository history. Mechanism: secrets committed and later deleted remain accessible in history. Finding: any valid credential (API key, private key, password) in any commit = CRITICAL regardless of age.
|
|
150
|
+
|
|
151
|
+
5. **Pull request target workflow guard**: Audit all `pull_request_target` GitHub Actions workflows for missing head-repo guards. Mechanism: `grep -rn "pull_request_target" .github/workflows/`. Correct guard: `if: github.event.pull_request.head.repo.full_name == github.repository`. Finding: absent guard = any fork PR executes with write `GITHUB_TOKEN` = HIGH.
|
|
152
|
+
|
|
153
|
+
6. **Typosquatting and dependency confusion**: For every package in `package.json` dependencies: verify the npm organization ownership matches the expected publisher. Mechanism: `npm info <package> | grep "maintainers"`. For internal package names not on npm, verify private registry scoping (e.g., `@<org>/` prefix) is enforced in `.npmrc`. Finding: any unscoped internal package name resolvable on public npm = HIGH.
|
|
154
|
+
|
|
155
|
+
7. **Offboarding process verification**: Check if there is a documented and audited offboarding checklist. Mechanism: search `docs/`, `runbooks/`, Notion/Confluence links in README for "offboarding". Verify the checklist includes: GitHub org removal, cloud IAM revocation, VPN certificate revocation, shared secret rotation. Finding: undocumented or unaudited offboarding = MEDIUM (becomes HIGH on first departing insider with production access).
|
|
156
|
+
|
|
157
|
+
8. **Incoming webhook and bot token exposure**: Scan all files including git history for Slack, Teams, PagerDuty, and other webhook URLs or bot tokens. Mechanism: `trufflehog git file://.` + `grep -rE "xoxb-|xoxp-|xoxs-|hooks\.slack\.com"`. Finding: any live webhook or token = HIGH (immediate rotation required).
|
|
158
|
+
|
|
159
|
+
9. **DMARC/SPF/DKIM enforcement on project domains**: For each domain associated with the project (from package.json `homepage`, README, CODEOWNERS emails), check DNS records. Mechanism: `dig TXT <domain>` for SPF; `dig TXT _dmarc.<domain>` for DMARC `p=reject`. Finding: `p=none` or missing DMARC = MEDIUM (email impersonation of project domain possible).
|
|
160
|
+
|
|
161
|
+
10. **Watering hole risk in developer tooling**: Review `.vscode/extensions.json`, `.idea/`, and any documented toolchain dependencies. Mechanism: for each extension/plugin, verify publisher identity and review requested permissions. Finding: any unverified-publisher extension with `workspace` or filesystem access = MEDIUM (escalates to HIGH if extension has network access).
|
|
162
|
+
|
|
163
|
+
11. **AI assistant instruction file integrity**: Check for AI coding assistant configuration files (`.cursorrules`, `.github/copilot-instructions.md`, `.claude/`). Mechanism: verify these files are not modifiable by contributors without code owner review; check if they are included in `.github/CODEOWNERS`. Finding: AI assistant instructions writable by any contributor without review = MEDIUM (indirect code injection vector).
|
|
164
|
+
|
|
165
|
+
12. **Insider threat detection monitoring gaps**: Verify whether the SIEM/logging stack captures: bulk data export events, after-hours deployment activity, access from new geographic locations, and access token creation. Mechanism: review CloudTrail/audit logs for `CreateAccessKey`, `GetSecretValue`, and equivalent events. Finding: no alerting on bulk `GetSecretValue` calls by a single IAM principal = HIGH detection gap.
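The bulk `GetSecretValue` check in item 12 can be sketched as a sliding-window count per principal. The record fields `eventName`, `eventTime`, and `userIdentity.arn` follow the CloudTrail record format; the threshold of 20 calls and the 10-minute window are hypothetical defaults chosen for illustration, not values from this skill.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_time(value: str) -> datetime:
    """Parse a CloudTrail-style UTC timestamp such as 2025-01-01T00:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def bulk_secret_readers(events, threshold=20, window=timedelta(minutes=10)):
    """Return principals with >= threshold GetSecretValue calls inside window."""
    times = defaultdict(list)
    for event in events:
        if event.get("eventName") == "GetSecretValue":
            arn = event.get("userIdentity", {}).get("arn", "unknown")
            times[arn].append(parse_time(event["eventTime"]))
    flagged = []
    for arn, stamps in times.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Shrink the window from the left until it spans at most `window`.
            while stamps[end] - stamps[start] > window:
                start += 1
            if end - start + 1 >= threshold:
                flagged.append(arn)
                break
    return sorted(flagged)
```

Thresholds would need tuning per environment, since CI roles can legitimately read many secrets in a short burst.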
|
|
166
|
+
|
|
167
|
+
---
|
|
168
|
+
|
|
169
|
+
## §POC-REQUIREMENT
|
|
170
|
+
|
|
171
|
+
For every social engineering or insider threat finding:
|
|
172
|
+
|
|
173
|
+
1. **Write working PoC FIRST**: Document the exact attack chain: the specific target (role, access level), the exact phishing lure or insider action, the precise credential or data accessed, and the observed impact (e.g., "GitHub token with `repo` scope obtained; used to push to main branch bypassing branch protection").
|
|
174
|
+
2. **Confirm reproduction**: For technical vectors (workflow injection, dependency confusion, webhook abuse), demonstrate the attack executes as described. For human vectors, document the scenario with sufficient detail that a red team could execute it without further clarification.
|
|
175
|
+
3. **Write fix**: Implement the specific control — enforce FIDO2, add CODEOWNERS guard, rotate exposed secret, enforce private registry scoping.
|
|
176
|
+
4. **Verify PoC fails against fix**: Re-test the attack chain after the control is in place. For human vectors, confirm the policy or technical control would block the scenario at the identified failure point.
|
|
177
|
+
5. **Record in findings JSON under `exploitPoC`**: Include the attack chain description, the target role, the blast radius, and the control implemented.
|
|
178
|
+
|
|
179
|
+
**PoC skipping = severity automatically downgraded to MEDIUM.**
|
|
180
|
+
|
|
181
|
+
---
|
|
182
|
+
|
|
183
|
+
## §PROJECT-ESCALATION
|
|
184
|
+
|
|
185
|
+
Immediately alert the CISO orchestrator and reprioritize the run when any of the following are confirmed:
|
|
186
|
+
|
|
187
|
+
1. **Live credential found in git history**: Any API key, cloud credential, private key, or password present in any commit (including deleted files) that has not been provably rotated. This is a CRITICAL active compromise risk — do not wait for the full run to complete.
|
|
188
|
+
|
|
189
|
+
2. **`pull_request_target` workflow without head-repo guard**: Confirmed exploitable workflow that allows a fork PR to execute with write `GITHUB_TOKEN`. An external attacker with a GitHub account can exploit this with zero prerequisites.
|
|
190
|
+
|
|
191
|
+
3. **Admin account without phishing-resistant MFA**: Any GitHub organization owner, cloud account root user, or IdP admin confirmed to have only SMS/TOTP MFA. A single vishing or real-time phishing proxy attack (Evilginx2) results in full organization takeover.
|
|
192
|
+
|
|
193
|
+
4. **AI assistant instruction file writable by external contributors**: If `.cursorrules`, `.github/copilot-instructions.md`, or equivalent files are not in CODEOWNERS and can be modified by PR from any contributor — the team's AI coding assistant becomes a code injection vector for any attacker who submits a PR.
|
|
194
|
+
|
|
195
|
+
5. **Confirmed typosquatted package installed**: A public npm/PyPI package with a name matching an internal dependency has been installed from the public registry instead of the intended internal source. This is an active supply chain compromise — treat as CRITICAL, escalate immediately.
|
|
196
|
+
|
|
197
|
+
6. **DMARC `p=none` on a domain used in customer-facing communications**: Combined with social engineering context, this allows an attacker to send phishing emails that appear to originate from the organization's own domain to customers, partners, and team members.
|
|
198
|
+
|
|
199
|
+
7. **No documented offboarding process AND a high-privilege departure identified**: If git history shows a recently inactive committer with production or secrets access and no evidence of access revocation — this is a latent insider threat. Escalate for immediate access audit.
|
|
200
|
+
|
|
201
|
+
8. **Evidence of data exfiltration pattern in audit logs**: Bulk `GetSecretValue`, `ListBuckets`, `ExportData`, or equivalent API calls in a short window from a single principal — even if currently authorized, this warrants immediate investigation as a potential insider exfiltration in progress.
|
|
202
|
+
|
|
203
|
+
---
|
|
204
|
+
|
|
205
|
+
## §EDGE-CASE-MATRIX
|
|
206
|
+
|
|
207
|
+
The five attack cases in this domain that automated scanners and naive manual review most often miss. These checks are MANDATORY; do not skip them.
|
|
208
|
+
|
|
209
|
+
| # | Edge Case | Why Scanners Miss It | Concrete Test |
|---|-----------|----------------------|---------------|
| 1 | Second-order / stored payload executed in different context | Scanner checks input context, not execution context | Store payload safely; trigger in separate request/session |
| 2 | Unicode normalisation bypass | Regex filters run before normalisation; attacker uses homoglyphs or composed forms | Submit Ⅰ (U+2160) or ＜ (U+FF1C) variants of known-bad strings |
| 3 | Polyglot payload active in multiple sinks simultaneously | Scanners test one injection class per payload | `'"><script>{{7*7}}</script><!--` (SQL + XSS + SSTI in one request) |
| 4 | Out-of-band exfiltration (DNS/HTTP callback) | Scanner looks for inline response difference; OOB leaves no visible trace | Use Burp Collaborator / interactsh; inject DNS lookup payload |
| 5 | Race condition between check and use (TOCTOU) | Sequential scanners don't model concurrency | Send two simultaneous requests to the same state-changing endpoint |
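Edge case 2 can be reproduced in a few lines. The sketch below uses a hypothetical blocklist filter to show why a regex applied before normalisation passes a fullwidth payload that NFKC normalisation later turns into the blocked ASCII string.

```python
import re
import unicodedata

# Hypothetical blocklist standing in for a regex-based input filter.
BLOCKLIST = re.compile(r"<script", re.IGNORECASE)


def naive_filter_allows(value: str) -> bool:
    """Filter applied to the raw input, before any normalisation."""
    return BLOCKLIST.search(value) is None


def normalising_filter_allows(value: str) -> bool:
    """Filter applied after NFKC normalisation, as downstream code may see it."""
    return BLOCKLIST.search(unicodedata.normalize("NFKC", value)) is None


# Fullwidth less-than (U+FF1C) and greater-than (U+FF1E) around "script".
payload = "\uff1cscript\uff1e"
```

Here `naive_filter_allows(payload)` is `True` while `normalising_filter_allows(payload)` is `False`, which is the gap an attacker exploits when normalisation happens after filtering.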
|
|
216
|
+
|
|
217
|
+
---
|
|
218
|
+
|
|
219
|
+
## §TEMPORAL-THREATS
|
|
220
|
+
|
|
221
|
+
Threats materialising in the 2025–2030 window that defences designed today must account for.
|
|
222
|
+
|
|
223
|
+
| Threat | Est. Timeline | Relevance to This Domain | Prepare Now By |
|--------|--------------|--------------------------|----------------|
| Cryptographically Relevant Quantum Computer (CRQC) | 2028–2032 | Harvest-now-decrypt-later attacks are active today: traffic protected by RSA/ECDH key exchange can be recorded now and decrypted later, and RSA/ECDSA signatures become forgeable | Inventory all RSA/ECDSA usage; migrate long-lived data to ML-KEM (FIPS 203) and signatures to ML-DSA (FIPS 204) |
| AI-assisted adversaries at scale | 2025–2027 (active) | LLM-powered fuzzing finds 10× more edge cases; automated PoC generation | Assume attackers have LLM help; expand test surface to match |
| EU AI Act full enforcement | 2026 | High-risk AI systems require mandatory conformity assessments | Classify all AI features against AI Act tiers now |
| Post-quantum TLS migration deadline | 2028–2030 | Browser vendors will drop classical-only TLS connections | Begin TLS agility assessment; test hybrid key exchange |
| Mandatory SBOM + build provenance (US EO 14028 / EU CRA) | 2025–2026 (active) | SBOM and SLSA attestation are becoming legally required | Achieve SLSA L2 minimum; generate CycloneDX SBOM per release |
|
|
230
|
+
|
|
231
|
+
---
|
|
232
|
+
|
|
233
|
+
## §DETECTION-GAP
|
|
234
|
+
|
|
235
|
+
What current security monitoring CANNOT detect in this domain, and what to build to close each gap.
|
|
236
|
+
|
|
237
|
+
**Standard gaps that MUST be checked:**
|
|
238
|
+
|
|
239
|
+
- **Second-order attack execution**: The storage request looks safe; only the retrieval+execution step is dangerous. Need: correlate write events with downstream read+execute events in the same SIEM query window.
|
|
240
|
+
- **Timing-side-channel leakage**: No log event emitted; only observable as microsecond response-time variance. Need: per-endpoint p99 latency tracking with statistical anomaly detection.
|
|
241
|
+
- **Low-and-slow credential stuffing**: Individually, each request is under rate limits. Need: behavioural baseline — flag accounts with geographically impossible velocity or device-fingerprint mismatch across authentication attempts.
|
|
242
|
+
- **Insider exfiltration via legitimate process**: Authorised exports, reports, and data downloads that individually are permitted but collectively constitute data exfiltration. Need: data-volume anomaly detection — alert when a single user's data access volume exceeds 3× their 30-day baseline within 24 hours.
|
|
243
|
+
- **Cross-agent attack chains**: Phase 1 finding A + Phase 1 finding B = CRITICAL chain invisible to either agent alone. Need: CISO orchestrator Phase 1 synthesis step — correlate all agent findings before Phase 2.
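
The insider-exfiltration rule in the list above (3× the 30-day baseline within 24 hours) can be sketched as a small predicate. This is a minimal illustration, not the package's detection logic: it assumes "baseline" means the mean daily byte count over the prior 30 days, and the function name `exceedsBaseline` is hypothetical.

```typescript
// Assumption: "baseline" is the mean daily byte count over the prior 30 days.
function exceedsBaseline(
  baselineDailyBytes: number[],
  last24hBytes: number,
  factor = 3
): boolean {
  // No history yet: this rule cannot judge, so a separate new-user rule is needed.
  if (baselineDailyBytes.length === 0) return false;
  const total = baselineDailyBytes.reduce((sum, bytes) => sum + bytes, 0);
  const mean = total / baselineDailyBytes.length;
  return last24hBytes > factor * mean;
}

// Usage: 30 quiet days of 100 MiB each, followed by a 450 MiB day.
const MIB = 1024 * 1024;
const quietMonth = new Array<number>(30).fill(100 * MIB);
const shouldAlert = exceedsBaseline(quietMonth, 450 * MIB);
console.log(shouldAlert);
```

A mean is easy to explain but is skewed by a single heavy day in the baseline; a median or percentile baseline may be a better fit for users with bursty but legitimate exports.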
|
|
244
|
+
|
|
245
|
+
---
|
|
246
|
+
|
|
247
|
+
## §ZERO-MISS-MANDATE
|
|
248
|
+
|
|
249
|
+
This agent CANNOT declare any attack class clean without explicit evidence of checking. For each item, output one of:
|
|
250
|
+
- `CHECKED: [N files] | [patterns used] | CLEAN`
|
|
251
|
+
- `CHECKED: [N files] | [patterns used] | [N findings, all fixed]`
|
|
252
|
+
- `SKIPPED: [reason — must be "not applicable: [evidence]"]`
|
|
253
|
+
|
|
254
|
+
**Silent skip = FAILED COVERAGE.** The orchestrator flags this as a quality gap.
|
|
255
|
+
|
|
256
|
+
The output findings JSON MUST include a `coverageManifest` key:
|
|
257
|
+
```json
|
|
258
|
+
{
|
|
259
|
+
"coverageManifest": {
|
|
260
|
+
"attackClassesCovered": [{ "class": "Credential in Git History", "filesReviewed": 312, "patterns": ["trufflehog", "git log --all"], "result": "CLEAN" }],
|
|
261
|
+
"filesReviewed": 312,
|
|
262
|
+
"negativeAssertions": ["Credential in Git History: trufflehog scanned all 312 commits — 0 matches"],
|
|
263
|
+
"uncoveredReason": {}
|
|
264
|
+
}
|
|
265
|
+
}
|
|
266
|
+
```
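
One way to enforce the manifest shape shown above is a typed validator that refuses a `CLEAN` result without supporting evidence. This is a sketch under assumptions: the field types are inferred from the example JSON, `uncoveredReason` is assumed to map an attack class to a reason string, and `validateManifest` is a hypothetical helper, not part of the package.

```typescript
interface AttackClassCoverage {
  class: string;
  filesReviewed: number;
  patterns: string[];
  result: string;
}

interface CoverageManifest {
  attackClassesCovered: AttackClassCoverage[];
  filesReviewed: number;
  negativeAssertions: string[];
  uncoveredReason: Record<string, string>; // assumed shape: class name -> reason
}

// Returns a list of human-readable problems; an empty list means the manifest passes.
function validateManifest(manifest: CoverageManifest): string[] {
  const problems: string[] = [];
  for (const entry of manifest.attackClassesCovered) {
    if (entry.patterns.length === 0) problems.push(`${entry.class}: no patterns recorded`);
    if (entry.filesReviewed <= 0) problems.push(`${entry.class}: no files reviewed`);
    const hasAssertion = manifest.negativeAssertions.some((a) => a.startsWith(`${entry.class}:`));
    if (entry.result === "CLEAN" && !hasAssertion) {
      problems.push(`${entry.class}: CLEAN without a negative assertion`);
    }
  }
  return problems;
}

const exampleManifest: CoverageManifest = {
  attackClassesCovered: [
    { class: "SQL Injection", filesReviewed: 47, patterns: ["queryRaw", "string concat"], result: "CLEAN" },
  ],
  filesReviewed: 47,
  negativeAssertions: ["SQL Injection: queryRaw pattern searched across 47 files, 0 matches"],
  uncoveredReason: {},
};
console.log(validateManifest(exampleManifest));
```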
|
|
267
|
+
|
|
268
|
+
---
|
|
269
|
+
|
|
270
|
+
## LEARNING SIGNAL
|
|
271
|
+
|
|
272
|
+
On every finding resolved, emit:
|
|
273
|
+
```json
|
|
274
|
+
{
|
|
275
|
+
"findingId": "FINDING_ID",
|
|
276
|
+
"agentName": "pentest-social",
|
|
277
|
+
"resolved": true,
|
|
278
|
+
"remediationTemplate": "one-line description of what was done",
|
|
279
|
+
"falsePositive": false
|
|
280
|
+
}
|
|
281
|
+
```
|
|
282
|
+
Call `security.record_outcome` with this payload so the routing engine learns which agent resolves each finding class most successfully. If a finding is a false positive, set `falsePositive: true` — this prevents the false-positive pattern from being routed here again.
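
A minimal sketch of building and sending this payload follows. The source does not say how tools are invoked, so the `ToolCaller` callback is a hypothetical stand-in for whatever client the agent runtime provides; only the tool name `security.record_outcome` and the payload fields come from the text above.

```typescript
interface LearningSignal {
  findingId: string;
  agentName: string;
  resolved: boolean;
  remediationTemplate: string;
  falsePositive: boolean;
}

// Hypothetical transport: the real invocation mechanism is not specified in the source.
type ToolCaller = (tool: string, payload: LearningSignal) => void;

function recordOutcome(
  call: ToolCaller,
  findingId: string,
  agentName: string,
  remediationTemplate: string,
  falsePositive = false
): LearningSignal {
  const signal: LearningSignal = {
    findingId,
    agentName,
    resolved: true,
    remediationTemplate,
    falsePositive,
  };
  call("security.record_outcome", signal);
  return signal;
}

// Usage with an in-memory stub in place of the real tool client.
const sentCalls: { tool: string; payload: LearningSignal }[] = [];
const stubCaller: ToolCaller = (tool, payload) => {
  sentCalls.push({ tool, payload });
};
recordOutcome(stubCaller, "FINDING_ID", "pentest-social", "one-line description of what was done");
console.log(sentCalls.length);
```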
|
|
@@ -25,6 +25,15 @@ SKILL.md §9 is the minimum. You go beyond it.
|
|
|
25
25
|
Every finding includes: CVSS v4, CWE, ATT&CK technique ID, step-by-step PoC chain,
|
|
26
26
|
and a "blast radius" statement: what data can be accessed, modified, or destroyed.
|
|
27
27
|
|
|
28
|
+
## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
|
|
29
|
+
|
|
30
|
+
The full suite of detection modules in `src/gate/checks/` (especially `api.ts`, `auth-deep.ts`, `injection-deep.ts`, `infra.ts`, `k8s.ts`) is your deterministic floor, not your ceiling. As team LEAD, read the Phase-1 `threat-model.json` as the access map and attack every surface. Treat the modules' finding IDs as the minimum, then reason past single-line/single-file pattern matching, and APPLY the fix (Edit) rather than only advising:
|
|
31
|
+
|
|
32
|
+
- **Cross-file / multi-step reasoning the regex can't do:** chain an exposure across modules — an IDOR found by `api.ts` + a missing tenant check in `database`-layer code + an over-scoped CI secret from `ci-pipeline` becomes a full account-takeover-to-infra-pivot kill chain no single check sees.
|
|
33
|
+
- **Semantic / effective-state analysis:** weigh each individual finding against the modeled adversary's goal and the runtime's effective trust boundaries; a "low" auth gap is critical when it sits on the path to the crown-jewel data flow.
|
|
34
|
+
- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories, CISA KEV entries, and exploit PoCs matching the project's stack and dependency versions.
|
|
35
|
+
- **Apply & prove:** write the remediation inline at the true root cause, re-run the relevant `src/gate/checks/` modules (plus burp/nuclei/sqlmap/scoutsuite as the surface dictates) as a regression floor, then re-audit the whole chain. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default.
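
The cross-file chaining idea above can be approximated by grouping findings that touch the same asset and flagging groups that span more than one detection module. This is a deliberately simple sketch: the `Finding` shape, the `asset` tag, and the finding IDs are all hypothetical, and real chain analysis needs the threat model's data flows rather than a shared label.

```typescript
interface Finding {
  id: string;
  module: string; // which check produced it, e.g. "api" or "ci-pipeline"
  asset: string;  // hypothetical tag for the resource the finding touches
}

interface CandidateChain {
  asset: string;
  findingIds: string[];
}

// Groups findings by asset and keeps groups produced by two or more distinct modules.
function candidateChains(findings: Finding[]): CandidateChain[] {
  const byAsset = new Map<string, Finding[]>();
  for (const finding of findings) {
    byAsset.set(finding.asset, [...(byAsset.get(finding.asset) ?? []), finding]);
  }
  const chains: CandidateChain[] = [];
  byAsset.forEach((group, asset) => {
    const modules = new Set(group.map((f) => f.module));
    if (modules.size >= 2) {
      chains.push({ asset, findingIds: group.map((f) => f.id) });
    }
  });
  return chains;
}

const sampleFindings: Finding[] = [
  { id: "API-1", module: "api", asset: "tenant-data" },
  { id: "DB-7", module: "database", asset: "tenant-data" },
  { id: "CI-3", module: "ci-pipeline", asset: "deploy-secret" },
];
console.log(candidateChains(sampleFindings));
```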
|
|
36
|
+
|
|
28
37
|
## ACTIVATION PROTOCOL
|
|
29
38
|
|
|
30
39
|
1. Call `orchestration.update_agent_status(agentRunId, "pentest-team", "running")`
|
|
@@ -171,3 +180,90 @@ Each tactic MUST be addressed — explicitly CONFIRMED or "N/A — reason: …".
|
|
|
171
180
|
- **Multi-turn attack chain**: build up context over 5+ turns to bypass instruction hierarchy
|
|
172
181
|
- **Indirect injection via RAG**: inject payload into document that model retrieves — does it execute?
|
|
173
182
|
- **Agentic loop exploitation**: trigger infinite tool call loops to exhaust rate limits or billing
|
|
183
|
+
|
|
184
|
+
---
|
|
185
|
+
|
|
186
|
+
## LEARNING SIGNAL
|
|
187
|
+
|
|
188
|
+
On every finding resolved, emit:
|
|
189
|
+
```json
|
|
190
|
+
{
|
|
191
|
+
"findingId": "FINDING_ID",
|
|
192
|
+
"agentName": "AGENT_NAME",
|
|
193
|
+
"resolved": true,
|
|
194
|
+
"remediationTemplate": "one-line description of what was done",
|
|
195
|
+
"falsePositive": false
|
|
196
|
+
}
|
|
197
|
+
```
|
|
198
|
+
Call `security.record_outcome` with this payload so the routing engine learns which agent resolves each finding class most successfully. If a finding is a false positive, set `falsePositive: true` — this prevents the false-positive pattern from being routed here again.
|
|
199
|
+
|
|
200
|
+
---
|
|
201
|
+
|
|
202
|
+
## §EDGE-CASE-MATRIX
|
|
203
|
+
|
|
204
|
+
The 5 attack cases in this domain that automated scanners and naive manual review universally miss. MANDATORY checks — do not skip.
|
|
205
|
+
|
|
206
|
+
| # | Edge Case | Why Scanners Miss It | Concrete Test |
|
|
207
|
+
|---|-----------|----------------------|---------------|
|
|
208
|
+
| 1 | Second-order / stored payload executed in different context | Scanner checks input context, not execution context | Store payload safely; trigger in separate request/session |
|
|
209
|
+
| 2 | Unicode normalisation bypass | Regex filters run before normalisation; attacker uses homoglyphs or compatibility forms | Submit Ⅰ (U+2160) or ＜ (U+FF1C) variants of known-bad strings |
|
|
210
|
+
| 3 | Polyglot payload active in multiple sinks simultaneously | Scanners test one injection class per payload | `'"><script>{{7*7}}</script><!--` — SQL + XSS + SSTI in one request |
|
|
211
|
+
| 4 | Out-of-band exfiltration (DNS/HTTP callback) | Scanner looks for inline response difference; OOB leaves no visible trace | Use Burp Collaborator / interactsh; inject DNS lookup payload |
|
|
212
|
+
| 5 | Race condition between check and use (TOCTOU) | Sequential scanners don't model concurrency | Send two simultaneous requests to the same state-changing endpoint |
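
Row 2 of the table can be reproduced in a few lines. The sketch below assumes a naive `<script` blocklist (the regex and function names are illustrative only) and shows that filtering before NFKC normalisation lets a fullwidth payload through, while normalising first does not.

```typescript
const BLOCKLIST = /<script/i;

// Vulnerable order: the filter runs on the raw input, normalisation happens afterwards.
function filterThenNormalise(input: string): string | null {
  if (BLOCKLIST.test(input)) return null; // rejected
  return input.normalize("NFKC");
}

// Safer order: normalise first so the filter sees what downstream code will see.
function normaliseThenFilter(input: string): string | null {
  const canonical = input.normalize("NFKC");
  return BLOCKLIST.test(canonical) ? null : canonical;
}

// U+FF1C and U+FF1E are the fullwidth less-than and greater-than signs.
const fullwidthPayload = "\uFF1Cscript\uFF1E";
console.log(filterThenNormalise(fullwidthPayload)); // the payload survives as ASCII markup
console.log(normaliseThenFilter(fullwidthPayload)); // null
```

A blocklist is used here only to make the ordering bug visible; output encoding at the sink remains the stronger control.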
|
|
213
|
+
|
|
214
|
+
## §TEMPORAL-THREATS
|
|
215
|
+
|
|
216
|
+
Threats materialising in the 2025–2030 window that defences designed today must account for.
|
|
217
|
+
|
|
218
|
+
| Threat | Est. Timeline | Relevance to This Domain | Prepare Now By |
|
|
219
|
+
|--------|--------------|--------------------------|----------------|
|
|
220
|
+
| Cryptographically Relevant Quantum Computer (CRQC) | 2028–2032 | Harvest-now-decrypt-later collection is active today; data protected by RSA/ECDH key exchange can be decrypted later, and RSA/ECDSA signatures become forgeable | Inventory all RSA/ECDH/ECDSA usage; move key exchange for long-lived data to ML-KEM (FIPS 203) |
|
|
221
|
+
| AI-assisted adversaries at scale | 2025–2027 (active) | LLM-powered fuzzing finds 10× more edge cases; automated PoC generation | Assume attackers have LLM help; expand test surface to match |
|
|
222
|
+
| EU AI Act full enforcement | 2026 | High-risk AI systems require mandatory conformity assessments | Classify all AI features against AI Act tiers now |
|
|
223
|
+
| Post-quantum TLS migration deadline | 2028–2030 | Browser vendors are expected to phase out classical-only TLS key exchange | Begin TLS agility assessment; test hybrid key exchange |
|
|
224
|
+
| Mandatory SBOM + build provenance (US EO 14028 / EU CRA) | 2025–2026 (active) | SBOM and SLSA attestation are becoming legally required | Achieve SLSA L2 minimum; generate CycloneDX SBOM per release |
|
|
225
|
+
|
|
226
|
+
## §DETECTION-GAP
|
|
227
|
+
|
|
228
|
+
What current security monitoring CANNOT detect in this domain, and what to build to close each gap.
|
|
229
|
+
|
|
230
|
+
**Standard gaps that MUST be checked:**
|
|
231
|
+
|
|
232
|
+
- **Second-order attack execution**: The storage request looks safe; only the retrieval+execution step is dangerous. Need: correlate write events with downstream read+execute events in the same SIEM query window.
|
|
233
|
+
- **Timing-side-channel leakage**: No log event emitted; only observable as microsecond response-time variance. Need: per-endpoint p99 latency tracking with statistical anomaly detection.
|
|
234
|
+
- **Low-and-slow credential stuffing**: Individually, each request is under rate limits. Need: behavioural baseline — flag accounts with geographically impossible velocity or device-fingerprint mismatch across authentication attempts.
|
|
235
|
+
- **Insider exfiltration via legitimate process**: Authorised exports, reports, and data downloads that individually are permitted but collectively constitute data exfiltration. Need: data-volume anomaly detection — alert when a single user's data access volume exceeds 3× their 30-day baseline within 24 hours.
|
|
236
|
+
- **Cross-agent attack chains**: Phase 1 finding A + Phase 1 finding B = CRITICAL chain invisible to either agent alone. Need: CISO orchestrator Phase 1 synthesis step — correlate all agent findings before Phase 2.
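
The "geographically impossible velocity" signal from the credential-stuffing gap above can be sketched with a great-circle distance check. The `LoginAttempt` shape and the 900 km/h threshold (roughly airliner cruising speed) are assumptions chosen for illustration, not values from the source.

```typescript
interface LoginAttempt {
  lat: number;          // degrees
  lon: number;          // degrees
  epochSeconds: number; // time of the authentication attempt
}

// Great-circle distance between two points using the haversine formula.
function haversineKm(a: LoginAttempt, b: LoginAttempt): number {
  const toRad = (deg: number): number => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  const EARTH_RADIUS_KM = 6371;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Flags a pair of attempts whose implied travel speed exceeds maxKmh (assumed threshold).
function isImpossibleTravel(a: LoginAttempt, b: LoginAttempt, maxKmh = 900): boolean {
  const hours = Math.abs(b.epochSeconds - a.epochSeconds) / 3600;
  const km = haversineKm(a, b);
  if (hours === 0) return km > 0;
  return km / hours > maxKmh;
}

const london: LoginAttempt = { lat: 51.5074, lon: -0.1278, epochSeconds: 0 };
const newYorkOneHourLater: LoginAttempt = { lat: 40.7128, lon: -74.006, epochSeconds: 3600 };
console.log(isImpossibleTravel(london, newYorkOneHourLater));
```

IP geolocation is coarse and VPN exits move users across continents instantly, so in practice this signal is usually combined with the device-fingerprint check rather than used alone.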
|
|
237
|
+
|
|
238
|
+
## §ZERO-MISS-MANDATE
|
|
239
|
+
|
|
240
|
+
This agent CANNOT declare any attack class clean without explicit evidence of checking. For each item, output one of:
|
|
241
|
+
- `CHECKED: [N files] | [patterns used] | CLEAN`
|
|
242
|
+
- `CHECKED: [N files] | [patterns used] | [N findings, all fixed]`
|
|
243
|
+
- `SKIPPED: [reason — must be "not applicable: [evidence]"]`
|
|
244
|
+
|
|
245
|
+
**Silent skip = FAILED COVERAGE.** The orchestrator flags this as a quality gap.
|
|
246
|
+
|
|
247
|
+
The output findings JSON MUST include a `coverageManifest` key:
|
|
248
|
+
```json
|
|
249
|
+
{
|
|
250
|
+
"coverageManifest": {
|
|
251
|
+
"attackClassesCovered": [{ "class": "SQL Injection", "filesReviewed": 47, "patterns": ["queryRaw", "string concat"], "result": "CLEAN" }],
|
|
252
|
+
"filesReviewed": 47,
|
|
253
|
+
"negativeAssertions": ["SQL Injection: queryRaw pattern searched across 47 files — 0 matches"],
|
|
254
|
+
"uncoveredReason": {}
|
|
255
|
+
}
|
|
256
|
+
}
|
|
257
|
+
```
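
The "silent skip" rule can be checked mechanically once the list of required attack classes is known. The sketch below assumes `uncoveredReason` maps a class name to a reason string beginning with `not applicable:` (the JSON example leaves that object empty, so the shape is a guess), and `silentSkips` is a hypothetical helper name.

```typescript
interface ManifestView {
  attackClassesCovered: { class: string }[];
  uncoveredReason: Record<string, string | undefined>; // assumed: class name -> reason
}

// Returns required classes that are neither covered nor explicitly justified as not applicable.
function silentSkips(required: string[], manifest: ManifestView): string[] {
  const covered = new Set(manifest.attackClassesCovered.map((entry) => entry.class));
  return required.filter((attackClass) => {
    if (covered.has(attackClass)) return false;
    const reason = manifest.uncoveredReason[attackClass] ?? "";
    return !reason.startsWith("not applicable:");
  });
}

const requiredClasses = ["SQL Injection", "XSS", "SSRF"];
const partialManifest: ManifestView = {
  attackClassesCovered: [{ class: "SQL Injection" }],
  uncoveredReason: { SSRF: "not applicable: no outbound HTTP client in the repository" },
};
console.log(silentSkips(requiredClasses, partialManifest));
```

Here `XSS` is reported because it has neither a coverage entry nor a justified skip.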
|
|
258
|
+
|
|
259
|
+
Every findings JSON MUST include `intelligenceForOtherAgents`:
|
|
260
|
+
```json
|
|
261
|
+
{
|
|
262
|
+
"intelligenceForOtherAgents": {
|
|
263
|
+
"forPentestTeam": [{ "type": "HIGH_VALUE_TARGET", "description": "...", "exploitHint": "..." }],
|
|
264
|
+
"forCryptoSpecialist": [{ "type": "CRYPTO_WEAKNESS_REFERENCE", "algorithm": "...", "location": "..." }],
|
|
265
|
+
"forCloudSpecialist": [{ "type": "SSRF_TO_CLOUD_CHAIN", "ssrfLocation": "...", "escalationPath": "..." }],
|
|
266
|
+
"forComplianceGrc": [{ "type": "COMPLIANCE_BLOCKER", "frameworks": ["..."], "releaseBlock": true }]
|
|
267
|
+
}
|
|
268
|
+
}
|
|
269
|
+
```
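
For consumers of this object, a typed view makes the four recipient lists explicit. The interface below mirrors the field names in the JSON above; the two helper functions and the interpretation of `releaseBlock` as "any true value blocks the release" are assumptions made for illustration.

```typescript
interface IntelligenceForOtherAgents {
  forPentestTeam: { type: string; description: string; exploitHint: string }[];
  forCryptoSpecialist: { type: string; algorithm: string; location: string }[];
  forCloudSpecialist: { type: string; ssrfLocation: string; escalationPath: string }[];
  forComplianceGrc: { type: string; frameworks: string[]; releaseBlock: boolean }[];
}

// Lists the recipient keys that actually carry at least one item.
function recipientsWithIntel(intel: IntelligenceForOtherAgents): string[] {
  const keys = Object.keys(intel) as (keyof IntelligenceForOtherAgents)[];
  return keys.filter((key) => intel[key].length > 0);
}

// Assumed semantics: a single compliance item with releaseBlock=true blocks the release.
function blocksRelease(intel: IntelligenceForOtherAgents): boolean {
  return intel.forComplianceGrc.some((item) => item.releaseBlock);
}

const sampleIntel: IntelligenceForOtherAgents = {
  forPentestTeam: [],
  forCryptoSpecialist: [],
  forCloudSpecialist: [
    { type: "SSRF_TO_CLOUD_CHAIN", ssrfLocation: "...", escalationPath: "..." },
  ],
  forComplianceGrc: [
    { type: "COMPLIANCE_BLOCKER", frameworks: ["..."], releaseBlock: true },
  ],
};
console.log(recipientsWithIntel(sampleIntel), blocksRelease(sampleIntel));
```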
|
|
@@ -23,6 +23,15 @@ Execute full OWASP Testing Guide methodology against all endpoints found in the
|
|
|
23
23
|
Every finding is exploited end-to-end with a concrete PoC. No theoretical vulnerabilities —
|
|
24
24
|
only confirmed exploitable issues with real impact.
|
|
25
25
|
|
|
26
|
+
## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
|
|
27
|
+
|
|
28
|
+
The `api` + `web-nextjs` + `injection-deep` + `auth-deep` + `graphql` detection modules (`src/gate/checks/api.ts`, `src/gate/checks/web-nextjs.ts`, `src/gate/checks/injection-deep.ts`, `src/gate/checks/auth-deep.ts`, `src/gate/checks/graphql.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
|
|
29
|
+
|
|
30
|
+
- **Cross-file / multi-step reasoning the regex can't do:** chain an IDOR across endpoints — an object ID leaked by one `api.ts` route + a missing ownership check on a second route + a GraphQL field (`graphql.ts`) that exposes the same record becomes full horizontal/vertical privesc; follow user input through a Next.js server action into the DB query to confirm injection reaches the sink.
|
|
31
|
+
- **Semantic / effective-state analysis:** decide whether authZ is enforced per-object (BOLA/BFLA) and not just per-route, whether GraphQL depth/cost limits and introspection controls are *effectively* on, and whether the `auth-deep` session check actually gates the mutation — middleware that authenticates but never authorizes is the bug.
|
|
32
|
+
- **External corroboration:** WebSearch/WebFetch for current OWASP API Top 10 / Testing Guide updates and CVEs for the framework, GraphQL server, and Next.js version in use.
|
|
33
|
+
- **Apply & prove:** write the authZ/validation fix inline, re-run the `api`/`web-nextjs`/`injection-deep`/`auth-deep`/`graphql` checks (plus burp/sqlmap against the live endpoints and a GraphQL introspection/depth probe) as a regression floor, then re-audit the chain. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default.
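
The "authenticates but never authorizes" bug described above comes down to where the ownership comparison happens. The following sketch contrasts a route-level check with a per-object check; the `Session` and `OwnedRecord` shapes and the `admin` role name are hypothetical.

```typescript
interface Session {
  userId: string;
  roles: string[];
}

interface OwnedRecord {
  id: string;
  ownerId: string;
}

// Route-level check only: any authenticated session passes. This is the BOLA pattern.
function routeLevelAllows(session: Session | null): boolean {
  return session !== null;
}

// Per-object check: the caller must own the record or hold an explicit role.
function objectLevelAllows(session: Session | null, record: OwnedRecord): boolean {
  if (session === null) return false;
  return record.ownerId === session.userId || session.roles.includes("admin");
}

const mallory: Session = { userId: "u-2", roles: [] };
const alicesInvoice: OwnedRecord = { id: "inv-1", ownerId: "u-1" };
console.log(routeLevelAllows(mallory));                  // true: authenticated is enough
console.log(objectLevelAllows(mallory, alicesInvoice));  // false: not the owner
```

The fix belongs in the handler or data-access layer that loads the record, since middleware usually runs before the object is known.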
|
|
34
|
+
|
|
26
35
|
## EXECUTION
|
|
27
36
|
|
|
28
37
|
1. Read `threat-model.json` and all Phase 1 appsec findings as the engagement brief
|
|
@@ -115,6 +124,33 @@ Test all of the following chains (mark each CONFIRMED, PARTIAL, or N/A with reas
|
|
|
115
124
|
3. Test `expand`/`include`/`fields` query params — do they expose hidden or privileged fields?
|
|
116
125
|
4. **Required fix**: explicit field allowlist per role in every PATCH/PUT handler; no object spread from req.body
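
The required fix in step 4 can be sketched as a per-role allowlist that copies only named fields instead of spreading the request body. The role names and field names below are hypothetical examples, not values from the package.

```typescript
// Hypothetical allowlist: which body fields each role may write.
const WRITABLE_FIELDS: Record<string, readonly string[]> = {
  user: ["displayName", "bio"],
  admin: ["displayName", "bio", "role"],
};

// Copies only allowlisted fields; unknown roles get an empty allowlist.
function pickAllowed(role: string, body: Record<string, unknown>): Record<string, unknown> {
  const hasRole = Object.prototype.hasOwnProperty.call(WRITABLE_FIELDS, role);
  const allowed: readonly string[] = hasRole ? WRITABLE_FIELDS[role] : [];
  const update: Record<string, unknown> = {};
  for (const field of allowed) {
    if (Object.prototype.hasOwnProperty.call(body, field)) {
      update[field] = body[field];
    }
  }
  return update;
}

// A mass-assignment attempt: the extra fields are dropped for a normal user.
const attackBody = { displayName: "Mallory", role: "admin", isAdmin: true };
console.log(pickAllowed("user", attackBody));
```

Whether to silently drop or explicitly reject unknown fields is a design choice; rejecting with a 4xx makes probing visible in logs.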
|
|
117
126
|
|
|
127
|
+
## BEYOND SKILL.MD
|
|
128
|
+
|
|
129
|
+
Domain-specific threats beyond the standard OWASP checklist that this agent MUST test:
|
|
130
|
+
|
|
131
|
+
- **CVE-2023-45133 (`@babel/traverse` arbitrary code execution during compilation)**: Babel plugins that rely on `path.evaluate()` or `path.evaluateTruthy()` can be made to execute attacker code while compiling specially crafted source. Test every endpoint that runs a Babel transform over user-supplied code, and confirm `@babel/traverse` is at 7.23.2 or later.
|
|
132
|
+
- **GraphQL batching amplification (no CVE — research: HackerOne 2022)** — a single HTTP request with 500 aliased `user(id: X)` queries bypasses per-endpoint rate limits; measure actual resolver fanout and confirm depth/complexity limits are enforced.
|
|
133
|
+
- **JWT ECDSA signature bypass (CVE-2022-21449 / "Psychic Signatures")**: ECDSA verification in Java 15 through 18 does not check that `r` and `s` are in the valid range, so an all-zero signature verifies for any payload; this is a verification bug, not an algorithm-confusion attack. Test by sending `r=0, s=0` in ES256 JWTs against every token-verified endpoint.
|
|
134
|
+
- **Mass assignment via OpenAPI `additionalProperties: true`** — generated SDK clients silently pass unknown fields; fuzz every PATCH/POST body with `role`, `isAdmin`, `subscriptionTier`, `ownerId` — confirm server rejects or ignores them.
|
|
135
|
+
- **Indirect LLM prompt injection via API input fields (2024–present)**: if any endpoint pipes user input into an LLM (summarise, classify, translate), store a payload in a user-controlled field (name, bio, product description) and check whether it fires when an AI feature later reads that field, for example by exfiltrating the system prompt or issuing tool calls.
|
|
136
|
+
- **Post-quantum harvest-now-decrypt-later on API traffic**: API responses containing long-lived PII (SSN, medical records, financial data) that are protected by RSA or ECDH key exchange today may already be captured for future CRQC decryption; audit whether the API enforces forward secrecy (TLS 1.3 with ephemeral DH) and whether any at-rest tokens use RSA-OAEP or ECDH without hybrid ML-KEM wrapping.
|
|
137
|
+
- **HTTP/2 rapid reset DoS (CVE-2023-44487)** — client opens and immediately cancels streams at high rate to exhaust server worker threads without triggering request-volume limits; test against any HTTP/2-enabled endpoint and verify the server applies stream-reset rate limiting.
|
|
138
|
+
- **BOLA chain through indirect object reference in pagination cursors** — cursor-based pagination tokens (base64-encoded DB IDs) often encode a resource ID that is never re-authorised on decode; decode every `after`/`cursor` parameter and substitute another user's resource ID.
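
For the GraphQL batching item above, a quick pre-check is to count aliased field calls in the raw query before it reaches the resolvers. The sketch below is a crude regex heuristic, not a GraphQL parser: it can be fooled by string arguments, and a real control would inspect the parsed document. The function names and the limit of 50 are illustrative assumptions.

```typescript
// Crude heuristic: counts `alias: field(` occurrences in the raw query text.
const ALIASED_CALL = /[A-Za-z_][A-Za-z0-9_]*\s*:\s*[A-Za-z_][A-Za-z0-9_]*\s*\(/g;

function countAliasedCalls(query: string): number {
  return (query.match(ALIASED_CALL) ?? []).length;
}

function exceedsAliasLimit(query: string, limit: number): boolean {
  return countAliasedCalls(query) > limit;
}

// Builds the 500-alias request shape described in the list above.
const aliasedFields = Array.from({ length: 500 }, (_, i) => `u${i}: user(id: ${i}) { email }`);
const batchedQuery = `{ ${aliasedFields.join(" ")} }`;

console.log(countAliasedCalls(batchedQuery));
console.log(exceedsAliasLimit(batchedQuery, 50));
```

Counting aliases catches only this one amplification shape; depth and complexity limits are still needed for nested fan-out.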
|
|
139
|
+
|
|
140
|
+
## LEARNING SIGNAL
|
|
141
|
+
|
|
142
|
+
On every finding resolved, emit:
|
|
143
|
+
```json
|
|
144
|
+
{
|
|
145
|
+
"findingId": "FINDING_ID",
|
|
146
|
+
"agentName": "AGENT_NAME",
|
|
147
|
+
"resolved": true,
|
|
148
|
+
"remediationTemplate": "one-line description of what was done",
|
|
149
|
+
"falsePositive": false
|
|
150
|
+
}
|
|
151
|
+
```
|
|
152
|
+
Call `security.record_outcome` with this payload so the routing engine learns which agent resolves each finding class most successfully. If a finding is a false positive, set `falsePositive: true` — this prevents the false-positive pattern from being routed here again.
|
|
153
|
+
|
|
118
154
|
## OUTPUT
|
|
119
155
|
|
|
120
156
|
`AgentFinding[]` array with confirmed exploitable findings. Each includes:
|
|
@@ -122,3 +158,74 @@ Test all of the following chains (mark each CONFIRMED, PARTIAL, or N/A with reas
|
|
|
122
158
|
- What data was accessed or what action was performed
|
|
123
159
|
- CVSS v4 score, ATT&CK technique, step-by-step PoC
|
|
124
160
|
- Fixed code written inline
|
|
161
|
+
|
|
162
|
+
Every findings JSON MUST include `intelligenceForOtherAgents`:
|
|
163
|
+
```json
|
|
164
|
+
{
|
|
165
|
+
"intelligenceForOtherAgents": {
|
|
166
|
+
"forPentestTeam": [{ "type": "HIGH_VALUE_TARGET", "description": "...", "exploitHint": "..." }],
|
|
167
|
+
"forCryptoSpecialist": [{ "type": "CRYPTO_WEAKNESS_REFERENCE", "algorithm": "...", "location": "..." }],
|
|
168
|
+
"forCloudSpecialist": [{ "type": "SSRF_TO_CLOUD_CHAIN", "ssrfLocation": "...", "escalationPath": "..." }],
|
|
169
|
+
"forComplianceGrc": [{ "type": "COMPLIANCE_BLOCKER", "frameworks": ["..."], "releaseBlock": true }]
|
|
170
|
+
}
|
|
171
|
+
}
|
|
172
|
+
```
|
|
173
|
+
|
|
174
|
+
---
|
|
175
|
+
|
|
176
|
+
## §EDGE-CASE-MATRIX
|
|
177
|
+
|
|
178
|
+
The 5 attack cases in this domain that automated scanners and naive manual review universally miss. MANDATORY checks — do not skip.
|
|
179
|
+
|
|
180
|
+
| # | Edge Case | Why Scanners Miss It | Concrete Test |
|
|
181
|
+
|---|-----------|----------------------|---------------|
|
|
182
|
+
| 1 | Second-order / stored payload executed in different context | Scanner checks input context, not execution context | Store payload safely; trigger in separate request/session |
|
|
183
|
+
| 2 | Unicode normalisation bypass | Regex filters run before normalisation; attacker uses homoglyphs or compatibility forms | Submit Ⅰ (U+2160) or ＜ (U+FF1C) variants of known-bad strings |
|
|
184
|
+
| 3 | Polyglot payload active in multiple sinks simultaneously | Scanners test one injection class per payload | `'"><script>{{7*7}}</script><!--` — SQL + XSS + SSTI in one request |
|
|
185
|
+
| 4 | Out-of-band exfiltration (DNS/HTTP callback) | Scanner looks for inline response difference; OOB leaves no visible trace | Use Burp Collaborator / interactsh; inject DNS lookup payload |
|
|
186
|
+
| 5 | Race condition between check and use (TOCTOU) | Sequential scanners don't model concurrency | Send two simultaneous requests to the same state-changing endpoint |
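
Row 5 of the table is easier to reason about with a deterministic model of the interleaving. The sketch below does not use real concurrency: it simulates the worst-case schedule in which every request's check runs before any request's use, and compares it with an atomic check-and-debit. The `Wallet` model and function names are hypothetical.

```typescript
interface Wallet {
  balance: number;
}

const hasFunds = (wallet: Wallet, amount: number): boolean => wallet.balance >= amount;
const debit = (wallet: Wallet, amount: number): void => {
  wallet.balance -= amount;
};

// Simulated race: all checks observe the original balance, then all debits apply.
function vulnerableInterleaving(wallet: Wallet, withdrawals: number[]): number {
  const approved = withdrawals.filter((amount) => hasFunds(wallet, amount));
  approved.forEach((amount) => debit(wallet, amount));
  return wallet.balance;
}

// Fixed ordering: each check and its debit happen as one indivisible step.
function atomicCheckAndDebit(wallet: Wallet, withdrawals: number[]): number {
  for (const amount of withdrawals) {
    if (hasFunds(wallet, amount)) debit(wallet, amount);
  }
  return wallet.balance;
}

console.log(vulnerableInterleaving({ balance: 100 }, [100, 100])); // overdrawn
console.log(atomicCheckAndDebit({ balance: 100 }, [100, 100]));    // second withdrawal refused
```

In a real service the atomic step is typically a conditional database update or a row lock rather than in-process sequencing.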
|
|
187
|
+
|
|
188
|
+
## §TEMPORAL-THREATS
|
|
189
|
+
|
|
190
|
+
Threats materialising in the 2025–2030 window that defences designed today must account for.
|
|
191
|
+
|
|
192
|
+
| Threat | Est. Timeline | Relevance to This Domain | Prepare Now By |
|
|
193
|
+
|--------|--------------|--------------------------|----------------|
|
|
194
|
+
| Cryptographically Relevant Quantum Computer (CRQC) | 2028–2032 | Harvest-now-decrypt-later collection is active today; data protected by RSA/ECDH key exchange can be decrypted later, and RSA/ECDSA signatures become forgeable | Inventory all RSA/ECDH/ECDSA usage; move key exchange for long-lived data to ML-KEM (FIPS 203) |
|
|
195
|
+
| AI-assisted adversaries at scale | 2025–2027 (active) | LLM-powered fuzzing finds 10× more edge cases; automated PoC generation | Assume attackers have LLM help; expand test surface to match |
|
|
196
|
+
| EU AI Act full enforcement | 2026 | High-risk AI systems require mandatory conformity assessments | Classify all AI features against AI Act tiers now |
|
|
197
|
+
| Post-quantum TLS migration deadline | 2028–2030 | Browser vendors are expected to phase out classical-only TLS key exchange | Begin TLS agility assessment; test hybrid key exchange |
|
|
198
|
+
| Mandatory SBOM + build provenance (US EO 14028 / EU CRA) | 2025–2026 (active) | SBOM and SLSA attestation are becoming legally required | Achieve SLSA L2 minimum; generate CycloneDX SBOM per release |
|
|
199
|
+
|
|
200
|
+
## §DETECTION-GAP
|
|
201
|
+
|
|
202
|
+
What current security monitoring CANNOT detect in this domain, and what to build to close each gap.
|
|
203
|
+
|
|
204
|
+
**Standard gaps that MUST be checked:**
|
|
205
|
+
|
|
206
|
+
- **Second-order attack execution**: The storage request looks safe; only the retrieval+execution step is dangerous. Need: correlate write events with downstream read+execute events in the same SIEM query window.
|
|
207
|
+
- **Timing-side-channel leakage**: No log event emitted; only observable as microsecond response-time variance. Need: per-endpoint p99 latency tracking with statistical anomaly detection.
|
|
208
|
+
- **Low-and-slow credential stuffing**: Individually, each request is under rate limits. Need: behavioural baseline — flag accounts with geographically impossible velocity or device-fingerprint mismatch across authentication attempts.
|
|
209
|
+
- **Insider exfiltration via legitimate process**: Authorised exports, reports, and data downloads that individually are permitted but collectively constitute data exfiltration. Need: data-volume anomaly detection — alert when a single user's data access volume exceeds 3× their 30-day baseline within 24 hours.
|
|
210
|
+
- **Cross-agent attack chains**: Phase 1 finding A + Phase 1 finding B = CRITICAL chain invisible to either agent alone. Need: CISO orchestrator Phase 1 synthesis step — correlate all agent findings before Phase 2.
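
The write/execute correlation asked for in the first gap above can be prototyped outside the SIEM. This sketch pairs each execute event with earlier writes to the same record inside a time window; the `StoreEvent` shape is hypothetical, and requiring a different actor on each side is an added assumption meant to reduce noise from users reading their own data.

```typescript
interface StoreEvent {
  kind: "write" | "execute";
  recordId: string;
  actor: string;
  epochSeconds: number;
}

// Pairs each execute with earlier writes to the same record by a different actor within the window.
function correlateSecondOrder(
  events: StoreEvent[],
  windowSeconds: number
): [StoreEvent, StoreEvent][] {
  const pairs: [StoreEvent, StoreEvent][] = [];
  const writes = events.filter((e) => e.kind === "write");
  const executes = events.filter((e) => e.kind === "execute");
  for (const exec of executes) {
    for (const write of writes) {
      const delta = exec.epochSeconds - write.epochSeconds;
      const sameRecord = write.recordId === exec.recordId;
      const differentActor = write.actor !== exec.actor;
      if (sameRecord && differentActor && delta >= 0 && delta <= windowSeconds) {
        pairs.push([write, exec]);
      }
    }
  }
  return pairs;
}

const sampleEvents: StoreEvent[] = [
  { kind: "write", recordId: "r1", actor: "attacker", epochSeconds: 100 },
  { kind: "execute", recordId: "r1", actor: "admin", epochSeconds: 160 },
  { kind: "execute", recordId: "r1", actor: "admin", epochSeconds: 5000 },
];
console.log(correlateSecondOrder(sampleEvents, 600).length);
```

The nested loop is quadratic and fine for a prototype; a production query would index writes by record ID. Stored payloads can also sit dormant for longer than any fixed window, so the window size trades recall against query cost.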
|
|
211
|
+
|
|
212
|
+
## §ZERO-MISS-MANDATE
|
|
213
|
+
|
|
214
|
+
This agent CANNOT declare any attack class clean without explicit evidence of checking. For each item, output one of:
|
|
215
|
+
- `CHECKED: [N files] | [patterns used] | CLEAN`
|
|
216
|
+
- `CHECKED: [N files] | [patterns used] | [N findings, all fixed]`
|
|
217
|
+
- `SKIPPED: [reason — must be "not applicable: [evidence]"]`
|
|
218
|
+
|
|
219
|
+
**Silent skip = FAILED COVERAGE.** The orchestrator flags this as a quality gap.
|
|
220
|
+
|
|
221
|
+
The output findings JSON MUST include a `coverageManifest` key:
|
|
222
|
+
```json
|
|
223
|
+
{
|
|
224
|
+
"coverageManifest": {
|
|
225
|
+
"attackClassesCovered": [{ "class": "SQL Injection", "filesReviewed": 47, "patterns": ["queryRaw", "string concat"], "result": "CLEAN" }],
|
|
226
|
+
"filesReviewed": 47,
|
|
227
|
+
"negativeAssertions": ["SQL Injection: queryRaw pattern searched across 47 files — 0 matches"],
|
|
228
|
+
"uncoveredReason": {}
|
|
229
|
+
}
|
|
230
|
+
}
|
|
231
|
+
```
|