npm - security-mcp - Versions diffs - 1.3.1 → 1.3.4 - Mend

security-mcp 1.3.1 → 1.3.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (131)

package/README.md +286 -887
package/defaults/cloud-controls/aws.json +10712 -0
package/defaults/cloud-controls/azure.json +7201 -0
package/defaults/cloud-controls/gcp.json +4061 -0
package/defaults/control-catalog.json +24 -0
package/dist/ci/pr-gate.js +22 -5
package/dist/cli/index.js +73 -2
package/dist/cli/install.js +4 -55
package/dist/cli/onboarding.js +18 -10
package/dist/gate/checks/agentic-instructions.js +515 -0
package/dist/gate/checks/ai-governance.js +132 -0
package/dist/gate/checks/ai.js +1 -1
package/dist/gate/checks/cloud-controls.js +69 -0
package/dist/gate/checks/crypto.js +1 -1
package/dist/gate/checks/data-platform.js +954 -0
package/dist/gate/checks/dependencies.js +14 -3
package/dist/gate/checks/docker-deep.js +1236 -0
package/dist/gate/checks/gitops.js +724 -0
package/dist/gate/checks/iac.js +1230 -0
package/dist/gate/checks/k8s.js +841 -1
package/dist/gate/checks/secrets.js +49 -37
package/dist/gate/cloud-controls/apply.js +115 -0
package/dist/gate/cloud-controls/bicep.js +36 -0
package/dist/gate/cloud-controls/cfn.js +125 -0
package/dist/gate/cloud-controls/detect.js +104 -0
package/dist/gate/cloud-controls/hcl.js +140 -0
package/dist/gate/cloud-controls/types.js +87 -0
package/dist/gate/exceptions.js +78 -7
package/dist/gate/findings.js +15 -2
package/dist/gate/policy.js +40 -3
package/dist/gate/threat-intel.js +6 -0
package/dist/mcp/audit-chain.js +9 -0
package/dist/mcp/model-router.js +3 -3
package/dist/mcp/orchestration.js +194 -41
package/dist/mcp/server.js +124 -17
package/dist/mcp/tool-audit.js +193 -0
package/dist/repo/fs.js +14 -1
package/dist/review/store.js +4 -2
package/dist/tests/run.js +124 -1
package/package.json +6 -4
package/skills/advanced-dos-tester/SKILL.md +9 -0
package/skills/agentic-instruction-auditor/SKILL.md +111 -0
package/skills/agentic-loop-exploiter/SKILL.md +9 -0
package/skills/ai-llm-redteam/SKILL.md +9 -0
package/skills/ai-model-supply-chain-agent/SKILL.md +9 -0
package/skills/algorithm-implementation-reviewer/SKILL.md +9 -0
package/skills/android-penetration-tester/SKILL.md +9 -0
package/skills/anti-replay-tester/SKILL.md +9 -0
package/skills/appsec-code-auditor/SKILL.md +9 -0
package/skills/artifact-integrity-analyst/SKILL.md +9 -0
package/skills/attack-navigator/SKILL.md +9 -0
package/skills/auth-session-hacker/SKILL.md +9 -0
package/skills/aws-penetration-tester/SKILL.md +54 -0
package/skills/azure-penetration-tester/SKILL.md +52 -0
package/skills/binary-auth-validator/SKILL.md +9 -0
package/skills/bot-detection-specialist/SKILL.md +9 -0
package/skills/business-logic-attacker/SKILL.md +9 -0
package/skills/capec-code-mapper/SKILL.md +9 -0
package/skills/cert-pin-rotation-specialist/SKILL.md +9 -0
package/skills/cicd-pipeline-hijacker/SKILL.md +9 -0
package/skills/ciso-orchestrator/SKILL.md +11 -0
package/skills/cloud-infra-specialist/SKILL.md +9 -0
package/skills/compliance-gap-analyst/SKILL.md +9 -0
package/skills/compliance-grc/SKILL.md +9 -0
package/skills/compliance-lifecycle-tracker/SKILL.md +9 -0
package/skills/container-hardening-auditor/SKILL.md +125 -0
package/skills/credential-stuffing-specialist/SKILL.md +9 -0
package/skills/crypto-pki-specialist/SKILL.md +9 -0
package/skills/csa-ccm-mapper/SKILL.md +9 -0
package/skills/csf2-governance-mapper/SKILL.md +9 -0
package/skills/data-platform-auditor/SKILL.md +125 -0
package/skills/deep-link-fuzzer/SKILL.md +9 -0
package/skills/dependency-confusion-attacker/SKILL.md +9 -0
package/skills/device-integrity-aggregator/SKILL.md +9 -0
package/skills/dos-resilience-tester/SKILL.md +9 -0
package/skills/dread-scorer/SKILL.md +9 -0
package/skills/egress-policy-enforcer/SKILL.md +9 -0
package/skills/evidence-collector/SKILL.md +9 -0
package/skills/file-upload-attacker/SKILL.md +9 -0
package/skills/gcp-penetration-tester/SKILL.md +51 -0
package/skills/git-history-secret-scanner/SKILL.md +9 -0
package/skills/gitops-delivery-auditor/SKILL.md +120 -0
package/skills/iac-security-auditor/SKILL.md +125 -0
package/skills/iam-privesc-graph-builder/SKILL.md +9 -0
package/skills/incident-responder/SKILL.md +9 -0
package/skills/injection-specialist/SKILL.md +9 -0
package/skills/ios-security-auditor/SKILL.md +9 -0
package/skills/json-ambiguity-tester/SKILL.md +0 -0
package/skills/k8s-container-escaper/SKILL.md +22 -0
package/skills/key-management-lifecycle-analyst/SKILL.md +9 -0
package/skills/kill-switch-engineer/SKILL.md +9 -0
package/skills/linddun-privacy-analyst/SKILL.md +9 -0
package/skills/logic-race-fuzzer/SKILL.md +9 -0
package/skills/mobile-api-network-attacker/SKILL.md +9 -0
package/skills/mobile-binary-hardener/SKILL.md +9 -0
package/skills/mobile-security-specialist/SKILL.md +9 -0
package/skills/mobile-webview-auditor/SKILL.md +9 -0
package/skills/model-extraction-attacker/SKILL.md +9 -0
package/skills/multipart-abuse-tester/SKILL.md +9 -0
package/skills/oauth-pkce-specialist/SKILL.md +9 -0
package/skills/parser-exhaustion-tester/SKILL.md +9 -0
package/skills/pentest-infra/SKILL.md +9 -0
package/skills/pentest-social/SKILL.md +9 -0
package/skills/pentest-team/SKILL.md +9 -0
package/skills/pentest-web-api/SKILL.md +9 -0
package/skills/privacy-flow-analyst/SKILL.md +9 -0
package/skills/prompt-injection-specialist/SKILL.md +9 -0
package/skills/quantum-migration-planner/SKILL.md +9 -0
package/skills/rag-poisoning-specialist/SKILL.md +9 -0
package/skills/registry-mirror-enforcer/SKILL.md +9 -0
package/skills/rotation-validation-agent/SKILL.md +9 -0
package/skills/samm-assessor/SKILL.md +9 -0
package/skills/secrets-mask-bypass-tester/SKILL.md +9 -0
package/skills/senior-security-engineer/SKILL.md +11 -0
package/skills/serialization-memory-attacker/SKILL.md +9 -0
package/skills/session-timeout-tester/SKILL.md +9 -0
package/skills/slsa-level3-enforcer/SKILL.md +9 -0
package/skills/slsa-provenance-enforcer/SKILL.md +9 -0
package/skills/ssrf-detection-validator/SKILL.md +9 -0
package/skills/step-up-auth-enforcer/SKILL.md +9 -0
package/skills/stride-pasta-analyst/SKILL.md +9 -0
package/skills/supply-chain-devsecops/SKILL.md +9 -0
package/skills/threat-infrastructure-analyst/SKILL.md +9 -0
package/skills/threat-modeler/SKILL.md +9 -0
package/skills/tls-certificate-auditor/SKILL.md +9 -0
package/skills/token-reuse-detector/SKILL.md +9 -0
package/skills/trike-risk-modeler/SKILL.md +9 -0
package/skills/unicode-homograph-tester/SKILL.md +9 -0
package/skills/waf-rule-lifecycle-agent/SKILL.md +9 -0
package/skills/webhook-security-tester/SKILL.md +9 -0
package/skills/zero-trust-architect/SKILL.md +9 -0

package/skills/registry-mirror-enforcer/SKILL.md CHANGED

@@ -47,6 +47,15 @@ Domain-specific threats and techniques beyond the core mandate:
 - **SLSA provenance gap in pull-through caches** — Pull-through mirror caches strip or ignore `cosign` signatures and SLSA provenance attestations on cached layers. An attacker who compromises cached storage serves unsigned layers indefinitely. Require mirror to re-verify Cosign signature on every cache-miss fetch and reject unsigned images.
 - **Stargz/lazy-pull side-channel via partial layer fetch** — GKE Image Streaming and eStargz lazy-pull expose per-file access patterns to the registry via HTTP Range requests, leaking container startup behaviour and file access order to a network observer. Enforce mTLS between node and mirror; log range request anomalies.
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `supply-chain-deep` and `dependencies` detection modules (`src/gate/checks/supply-chain-deep.ts`, `src/gate/checks/dependencies.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / data-flow reasoning the regex can't do:** `supply-chain-deep.ts` flags a bare `image: nginx` in one manifest; you must correlate it with the containerd `registry.mirrors` config, the Kyverno admission policy, and the imagePullSecret scope to prove whether a runtime mirror failure silently falls back to `docker.io` — a chain spanning manifest, daemon config, and policy that no single-line scan reconstructs.
+- **Semantic / effective-state analysis:** model the effective image-resolution order — dependency-confusion where a public `mycompany/service` outranks the private one, a pull-through cache serving a poisoned digest on cache-miss re-fetch, or a Unicode-lookalike registry host passing an ASCII allowlist — reasoning about what is actually pulled, not what the tag literally says.
+- **External corroboration:** WebSearch/WebFetch for current OCI distribution-spec CVEs, Cosign/Sigstore verification requirements, CISA KEV entries for registry components, and SLSA L3 provenance expectations.
+- **Apply & prove:** write the fix inline (digest-pinned `image@sha256:`, Kyverno enforce policy, containerd mirror endpoint, Cosign verification on every cache hit), re-run the `supply-chain-deep.ts`/`dependencies.ts` checks (plus `trivy image` and `cosign verify`) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the private-mirror-only default.
 ## EXECUTION
 ### Phase 1 — Reconnaissance
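The "digest-pinned `image@sha256:`" fix named in the hunk above can be checked mechanically. The following is a minimal sketch, not code from the package: `isDigestPinned` and `findUnpinnedImages` are hypothetical helper names, and the sketch only inspects the reference string, so it says nothing about mirror fallback or signature verification.

```typescript
// Hypothetical helpers (not part of security-mcp): report whether an OCI image
// reference ends in an immutable sha256 digest instead of a mutable tag.
const DIGEST_SUFFIX = /@sha256:[a-f0-9]{64}$/;

function isDigestPinned(imageRef: string): boolean {
  return DIGEST_SUFFIX.test(imageRef.trim());
}

// Returns the references that would still resolve through a mutable tag.
function findUnpinnedImages(imageRefs: string[]): string[] {
  return imageRefs.filter((ref) => !isDigestPinned(ref));
}
```

A bare `nginx` or `nginx:latest` is reported as unpinned, while a reference ending in `@sha256:` plus 64 lowercase hex characters passes.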

package/skills/rotation-validation-agent/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `crypto` and `secrets` detection modules (`src/gate/checks/crypto.ts`, `src/gate/checks/secrets.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / data-flow reasoning the regex can't do:** `secrets.ts` flags an `AWS_ACCESS_KEY_ID` literal; you must correlate the canonical secret store (AWS Secrets Manager version hash) against the same key still present in a GitHub Actions org secret and a stale `.env` consumed by the running process — proving rotation drift that spans three files no single-line scan connects.
+- **Semantic / effective-state analysis:** model the effective post-rotation state — a retired JWT `kid` still served from a cached JWKS edge node, a dual-key overlap window never closed in the IdP, or a rotation Lambda that swallows the error and leaves the secret `AWSPENDING` while the app keeps using the expired value — reasoning about whether the old credential still authenticates, not whether the store shows "rotated".
+- **External corroboration:** WebSearch/WebFetch for current PCI DSS §8.3.9 rotation cadence, NIST IA-5, cloud-provider long-lived-key deprecation timelines, and CRQC-driven RSA/ECDSA JWT signing-key migration guidance.
+- **Apply & prove:** write the fix inline (Secrets Manager `rotation_rules` + CloudWatch `RotationFailed` alarm, zero-downtime dual-key JWT verify with `kid` validation, cert-expiry CI gate), re-run the `crypto.ts`/`secrets.ts` checks (plus `gitleaks detect` and `aws secretsmanager describe-secret`) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the auto-rotation default.
 ## EXECUTION
 ### Phase 1 — Reconnaissance

package/skills/samm-assessor/SKILL.md CHANGED

@@ -35,6 +35,15 @@ On every finding resolved, emit:
 ```
 Call `security.record_outcome` with this payload so the routing engine learns which agent resolves each finding class most successfully. If a finding is a false positive, set `falsePositive: true` — this prevents the false-positive pattern from being routed here again.
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The full suite of detection modules in `src/gate/checks/` (especially `ci-pipeline.ts`, `supply-chain-deep.ts`, and `dependencies.ts`) is your deterministic floor for maturity evidence, not your ceiling. Treat their finding IDs as the minimum scoring signal, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / data-flow reasoning the regex can't do:** a single module reporting "SAST configured" does not prove SAMM Implementation/Secure Build Level 2; you must correlate the CI workflow, the dependency lock files, the secret-scanning step, and the SLSA provenance attestation across the repo to score the practice honestly — a maturity judgment no per-line check makes.
+- **Semantic / effective-state analysis:** model the effective maturity state — a threat model that exists but is >12 months stale and missing a new data flow (score L1 not L2), or a Secure Build that runs SAST but no `trufflehog`/`gitleaks` secret scan (cap at L1) — reasoning about whether the practice is actually performed and measured, not whether an artifact merely exists.
+- **External corroboration:** WebSearch/WebFetch for current OWASP SAMM 2.0 activity definitions, SAMM community benchmark averages, EU CRA SBOM-per-release mandates, and BSIMM correlation data.
+- **Apply & prove:** generate the assessment doc AND write the missing control inline (add the SCA gate, the IaC scan, the SBOM step), re-run the relevant `src/gate/checks/` modules (plus `semgrep`/`trivy`/`osv-scanner` as the evidence the score now claims) as a regression floor, then re-score. Emit the LEARNING SIGNAL per fix; surface trade-offs with the higher-maturity default.
 ## EXECUTION
 ### Phase 1 — Reconnaissance

package/skills/secrets-mask-bypass-tester/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `secrets` and `dlp` detection modules (`src/gate/checks/secrets.ts`, `src/gate/checks/dlp.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / data-flow reasoning the regex can't do:** `secrets.ts` flags a masking regex that matches `password=`; you must follow the secret from the request body, through the masking middleware, into the Pino/Winston serializer (which may field-alias `password → pwd`), and on to the Fluentd shipper that re-serializes and drops the mask — a multi-hop pipeline the single-line scan never traverses.
+- **Semantic / effective-state analysis:** model the effective unmasked state: a secret split across two buffered log lines, a double-URL-encoded `password%253D` variant, a Unicode-escaped `secret` in a JSON body, or an Axios `err.config` object serialized whole with its `Authorization` header. Reason about what actually reaches the SIEM index, not what the literal key name is.
+- **External corroboration:** WebSearch/WebFetch for current log-injection CVEs (Log4Shell-class `${jndi:}`), masking-library advisories, and AI-log-analytics (DevOps Guru/Datadog AI) data-governance requirements.
+- **Apply & prove:** write the fix inline (recursive case-insensitive `sanitizeForLog`, serialization-time masking, `::add-mask::` before any secret reference, canary-credential end-to-end test), re-run the `secrets.ts`/`dlp.ts` checks (plus `gitleaks detect` and a `trufflehog --only-verified` pass over the log fixtures) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the never-log-secrets-at-all default.
 ## EXECUTION
 ### Phase 1 — Reconnaissance
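The hunk above names a "recursive case-insensitive `sanitizeForLog`" without showing one. One possible shape is sketched below, under stated assumptions: the sensitive-key pattern and the `[REDACTED]` marker are illustrative choices, not the package's implementation. It masks by key name only, so it does not address split log lines or encoded variants.

```typescript
// Illustrative assumption: key names considered sensitive, matched case-insensitively.
const SENSITIVE_KEY = /password|passwd|pwd|secret|token|authorization|api[-_]?key/i;
const MASK = "[REDACTED]";

// Returns a deep copy of `value` with sensitive keys masked at every nesting level.
// Objects already visited (cycles or shared references) are replaced by a marker.
function sanitizeForLog(value: unknown, seen: WeakSet<object> = new WeakSet()): unknown {
  if (value === null || typeof value !== "object") return value;
  if (seen.has(value)) return "[Circular]";
  seen.add(value);
  if (Array.isArray(value)) return value.map((item) => sanitizeForLog(item, seen));
  const source = value as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(source)) {
    out[key] = SENSITIVE_KEY.test(key) ? MASK : sanitizeForLog(source[key], seen);
  }
  return out;
}
```

Passing a whole Axios-style error config through this function masks a nested `headers.Authorization` value while leaving unrelated fields intact.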

package/skills/senior-security-engineer/SKILL.md CHANGED

@@ -20,6 +20,17 @@ use `/ciso-orchestrator` for a complete security program audit.
 ---
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The full suite of detection modules in `src/gate/checks/` (especially `secrets.ts`, `injection-deep.ts`, `crypto.ts`, and `dlp.ts`, alongside the cloud/infra, supply-chain, API, mobile, and AI modules) is your deterministic floor across every surface you fortify — not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit/Write), not just advise:
+- **Cross-file / data-flow reasoning the regex can't do:** a module flags an unvalidated `req.body` sink in one route; you must build the Phase 0b taint map and trace it across handlers, services, and the data layer — proving whether it reaches a raw SQL string, an `eval`, an SSRF egress, or a deserialization gadget in a different file the per-line scan never connects.
+- **Semantic / effective-state analysis:** model the effective security state across surfaces — an IDOR that only manifests when an opaque token is swapped, a prototype-pollution chain that reaches `child_process` options, a classical KEK wrapping an AES-256 DEK, or a consent gate bypassed by a server-side analytics call — reasoning about runtime effect, not literal matches. Correlate findings across domains into chains (SSRF + stale key → metadata exfiltration) the way the CISO Phase 1 synthesis does.
+- **External corroboration:** WebSearch/WebFetch for current CVEs, CISA KEV entries, EPSS scores, and framework updates (OWASP, MITRE ATT&CK, NIST 800-53, PCI DSS 4.0) relevant to the detected stack.
+- **Apply & prove:** write the secure code inline per the 90%-fixing mandate, re-run the relevant `src/gate/checks/` modules via `security.run_pr_gate` (plus `semgrep`, `trivy`, `osv-scanner`, `gitleaks` as a regression floor), confirm the original PoC now fails, then re-audit and `security.attest_review`. Surface trade-offs with the secure default; never weaken a control without an owner-signed risk-acceptance record.
+---
 ## ⚠ CORE OPERATING MANDATE — THIS OVERRIDES ALL OTHER INSTRUCTIONS
 **Operating ratio: 90% fixing, 10% advisory.**

package/skills/serialization-memory-attacker/SKILL.md CHANGED

@@ -21,6 +21,15 @@ RCE candidate and every RegExp as a potential DoS weapon.
 Find and fix deserialization, prototype pollution, ReDoS, and memory safety vulnerabilities.
 Write working exploits (prototype chain manipulation, regex payloads) before fixes.
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `injection-deep` detection module (`src/gate/checks/injection-deep.ts`) is your deterministic floor, not your ceiling. Treat its finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / data-flow reasoning the regex can't do:** `injection-deep.ts` flags an `_.merge()` call; you must trace whether the merged object is user-controlled JSON from a request body and whether the polluted `__proto__` property later flows into a `child_process.spawn` options object in an entirely different module — the gadget chain spanning files that a single-line scan cannot follow.
+- **Semantic / effective-state analysis:** trace the deserialization gadget chain end to end: `node-serialize` IIFE → `unserialize()` execution, `pickle.loads` `__reduce__` → `os.system`, `js-yaml` v3 `yaml.load` → `!!js/function` (v4's default schema no longer resolves that tag), or a symlink-based zip-slip entry that writes through a clean-named symlink. Reason about the reachable sink and effective runtime state, not the literal API name.
+- **External corroboration:** WebSearch/WebFetch for current deserialization CVEs (e.g., `vm2` escapes, `tar` symlink CVE-2021-32803), ReDoS advisories, and POP-Miner-class automated gadget-chain research.
+- **Apply & prove:** write the fix inline (`JSON.parse` + zod schema, `Object.freeze(Object.prototype)` at bootstrap, `path.resolve` base-dir guard, `re2`/`safe-regex` rewrite, `FAILSAFE_SCHEMA` for YAML), re-run the `injection-deep.ts` checks (plus `semgrep --config=p/javascript` prototype-pollution ruleset and `safe-regex`) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the safe-parser default.
 ## EXECUTION
 1. **Prototype Pollution:**
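The `_.merge()` scenario in the hunk above depends on a recursive merge copying attacker-supplied `__proto__` or `constructor` keys. A minimal sketch of a merge that refuses those keys follows; `safeMerge` and `FORBIDDEN_KEYS` are hypothetical names introduced for illustration, and the sketch is not a substitute for the schema validation the skill text also lists.

```typescript
// Keys that can reach Object.prototype through a recursive merge.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

type PlainObject = Record<string, unknown>;

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `source` into `target`, skipping forbidden keys at every depth.
function safeMerge(target: PlainObject, source: PlainObject): PlainObject {
  for (const key of Object.keys(source)) {
    if (FORBIDDEN_KEYS.has(key)) continue;
    const incoming = source[key];
    const existing = target[key];
    if (isPlainObject(incoming) && isPlainObject(existing)) {
      safeMerge(existing, incoming);
    } else if (isPlainObject(incoming)) {
      // Copy nested objects through the same filter instead of by reference.
      target[key] = safeMerge({}, incoming);
    } else {
      target[key] = incoming;
    }
  }
  return target;
}
```

Because `JSON.parse` creates `__proto__` as an ordinary own property, the key shows up in `Object.keys` and is dropped here before it can be assigned.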

package/skills/session-timeout-tester/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `auth-deep` detection module (`src/gate/checks/auth-deep.ts`) is your deterministic floor, not your ceiling. Treat its session/timeout finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** `auth-deep.ts` flags an absent `maxAge` in `auth.config.ts`, but it cannot correlate that the `changePassword` handler in one file never calls the `invalidateAllSessions` Redis sweep defined in another, nor that a `reset-by-email` flow takes a second code path with no revocation. Trace the full set of credential-change entry points to every session store.
+- **Semantic / effective-state analysis:** model the entire session lifecycle — issue → slide (`updateAge`) → idle → absolute cap → revoke — and prove the *effective* lifetime, e.g. an SSE/WebSocket keepalive silently resetting the idle timer, or a `client_id` split bypassing a per-app concurrent-session cap that looks correct per file.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for session management (next-auth state-binding advisories, PCI DSS 8.2.8 idle-timeout, NIST AC-12).
+- **Apply & prove:** write the config/middleware fix inline, re-run the `auth-deep` checks (plus a token-entropy pass with `ent`/`dieharder` and a live "change password → old cookie rejected" test) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs against the secure default (shorter absolute lifetime vs. user re-login friction).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
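The lifecycle in the hunk above (issue, slide, idle, absolute cap) and the keepalive pitfall can be made concrete. The sketch below is an assumption-laden illustration: `SessionTimes`, `TimeoutPolicy`, `isSessionLive`, and `recordRequest` are hypothetical names, timestamps are plain milliseconds, and revocation is out of scope.

```typescript
interface SessionTimes {
  issuedAt: number;       // ms timestamp when the session was created
  lastActivityAt: number; // ms timestamp of the last genuine user action
}

interface TimeoutPolicy {
  idleMs: number;     // maximum gap between user actions
  absoluteMs: number; // hard cap on total session age
}

// A session is live only if BOTH the idle window and the absolute cap hold.
function isSessionLive(session: SessionTimes, policy: TimeoutPolicy, now: number): boolean {
  const withinIdle = now - session.lastActivityAt <= policy.idleMs;
  const withinAbsolute = now - session.issuedAt <= policy.absoluteMs;
  return withinIdle && withinAbsolute;
}

// Only genuine user actions slide the idle timer; keepalive traffic
// (for example SSE or WebSocket pings) leaves it untouched.
function recordRequest(session: SessionTimes, kind: "user" | "keepalive", now: number): SessionTimes {
  return kind === "user" ? { ...session, lastActivityAt: now } : session;
}
```

Keeping the keepalive branch from updating `lastActivityAt` is the design choice that stops a background connection from extending the effective lifetime indefinitely.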

package/skills/slsa-level3-enforcer/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `supply-chain-deep`, `sbom`, and `ci-pipeline` detection modules (`src/gate/checks/supply-chain-deep.ts`, `src/gate/checks/sbom.ts`, `src/gate/checks/ci-pipeline.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** `ci-pipeline.ts` can flag a missing `id-token: write`, but it cannot trace that a `build` job emits one hash while the downstream `provenance` job consumes a *different* artifact after a concurrent-run namespace collision, or that an `on: workflow_run` listener in a second file inherits write permissions from a fork PR. Follow the job DAG across every workflow file.
+- **Semantic / effective-state analysis:** verify the full SLSA provenance graph — assert each `.intoto.jsonl` `builderID` is the canonical `slsa-github-generator` OIDC URI (not a fork), that every release `runs-on` is github-hosted (no self-hosted fallback), and that two-party review is actually enforced by branch protection, not merely declared.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for SLSA L3 (slsa.dev threat model INT-*, GitHub Actions security advisories, NIST SP 800-218, EU CRA Art. 13).
+- **Apply & prove:** write the hardened workflow/branch-protection fix inline, re-run the `supply-chain-deep`/`sbom`/`ci-pipeline` checks plus `slsa-verifier verify-artifact` and `cosign verify-attestation` as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs against the secure default (hermetic `--network=none` builds vs. registry-fetch convenience).
 ## EXECUTION
 ### Phase 1 — Reconnaissance

package/skills/slsa-provenance-enforcer/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `supply-chain-deep` and `sbom` detection modules (`src/gate/checks/supply-chain-deep.ts`, `src/gate/checks/sbom.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** `supply-chain-deep.ts` can confirm an `attest-build-provenance` step exists, but it cannot prove that no *downstream* deploy job ever calls `gh attestation verify`/`cosign verify` before using the artifact, or that a signed `app` consumes an *unsigned* intermediate `lib.a` built in another job. Trace artifact provenance from build through every consumer.
+- **Semantic / effective-state analysis:** verify the full SLSA provenance graph: that signing is pinned to `@sha256:<digest>` (never a mutable `:latest` tag), that the build is actually hermetic (`--network=none`, not just provenance-attested), and that no `postinstall`/`preinstall` hook in the resolved dependency tree fetches remote code after hash-checking.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for build provenance (Sigstore/Rekor advisories, OSV.dev for HTTP-client deps, NIST SP 800-218, EU CRA 2024/2847 SBOM mandate).
+- **Apply & prove:** write the signing/verification/hermetic-build fix inline, re-run the `supply-chain-deep`/`sbom` checks plus `cosign verify` and a deploy-time `gh attestation verify` as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs against the secure default (digest-pinned immutable refs vs. floating-tag deploy ergonomics).
 ## EXECUTION
 ### Phase 1 — Reconnaissance

package/skills/ssrf-detection-validator/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `injection-deep`, `api`, and `infra` detection modules (`src/gate/checks/injection-deep.ts`, `src/gate/checks/api.ts`, `src/gate/checks/infra.ts`) are your deterministic floor, not your ceiling. Treat their SSRF finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** `injection-deep.ts` flags an unguarded `fetch(url)`, but it cannot see that an import-by-URL endpoint in another file uses a *separate* HTTP client that bypasses `ssrfSafeFetch`, or that a transitive dependency (`axios`/`got`/`undici`) issues outbound calls straight to `node:http`. Trace every outbound sink and correlate with the SBOM.
+- **Semantic / effective-state analysis:** model the SSRF→metadata→cloud-cred chain end to end — a validated public hostname that DNS-rebinds (TTL-0) to `169.254.169.254` at connect time, a parser differential (`http://127.0.0.1:80@host`), or a redirect chain landing on an internal IP — and confirm `infra.ts`-level IMDSv2 (`HttpTokens: required`) actually closes the IAM-credential escalation path.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for SSRF (PayloadsAllTheThings encoding matrix, axios CVE-2024-39338, current AWS/GCP/Azure metadata endpoints, EU CRA Art. 14 disclosure).
+- **Apply & prove:** write the canonical `ssrfSafeFetch` (re-resolve + IP-pin + manual redirect re-validation) inline and wire it as the sole outbound path, re-run the `injection-deep`/`api`/`infra` checks plus a `nuclei` SSRF-bypass template and an `rbndr.us` rebinding test as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs against the secure default (HTTPS-only + allowlist vs. legitimate arbitrary-webhook flexibility).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
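One building block of the `ssrfSafeFetch` described in the hunk above is classifying the address a connection would actually reach. The sketch below covers only that step for IPv4 literals and is an assumption, not the package's code: `parseIPv4` and `isBlockedIPv4` are hypothetical names, and DNS re-resolution, IP pinning, IPv6, and redirect re-validation are deliberately left out.

```typescript
// Parses strict dotted-decimal IPv4. Octal, hex, short, and padded forms are
// rejected so that parser differentials cannot smuggle an internal address through.
function parseIPv4(address: string): number[] | null {
  const parts = address.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^(0|[1-9]\d{0,2})$/.test(part)) return null;
    const n = Number(part);
    if (n > 255) return null;
    octets.push(n);
  }
  return octets;
}

// Returns true for loopback, private, link-local (including the metadata
// address 169.254.169.254), CGNAT, multicast, and reserved ranges.
// Anything that does not parse is blocked (fail closed).
function isBlockedIPv4(address: string): boolean {
  const octets = parseIPv4(address);
  if (octets === null) return true;
  const a = octets[0];
  const b = octets[1];
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 100 && b >= 64 && b <= 127) ||
    a >= 224
  );
}
```

The check is only meaningful when applied to the resolved address at connect time; validating the hostname earlier leaves the TTL-0 rebinding window open.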

package/skills/step-up-auth-enforcer/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `auth-deep` detection module (`src/gate/checks/auth-deep.ts`) is your deterministic floor, not your ceiling. Treat its step-up/re-auth finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** `auth-deep.ts` can spot a `requireStepUp` call on one route, but it cannot enumerate *every* high-value operation (payment-method add, MFA disable, email change, data export, impersonate) across the codebase and prove each is gated — nor catch a sensitive mutation reachable via a Server Action or direct dispatch that skips the middleware. Map all sensitive sinks to their gate.
+- **Semantic / effective-state analysis:** model the step-up lifecycle and its freshness window: confirm the `stepUpAt` token is cryptographically bound to the session ID and regenerated post-challenge (defeats session fixation), that OIDC `acr`/`amr` claims are verified inside a signed JWT against issuer JWKS (not trusted from a cookie), and that WebAuthn `signCount` monotonicity is enforced to block assertion replay.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for step-up auth (OIDC ACR/AMR forgery research, FIDO2 CTAP2 replay, NIST IA-2/AC-11, PCI DSS 8.4.2).
+- **Apply & prove:** write the `requireStepUp` middleware and `/auth/step-up` route inline and wire them at the framework routing layer, re-run the `auth-deep` checks plus a live "stale session → 403 step_up_required → challenge → success → expiry" test as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs against the secure default (5-min freshness window vs. repeated-challenge friction).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
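Two of the checks in the hunk above, the freshness window and `signCount` monotonicity, reduce to small predicates. The following sketch makes several assumptions: `StepUpState`, `requiresStepUp`, and `isSignCountValid` are hypothetical names, and the session binding is a plain equality check, whereas the skill text calls for a cryptographic binding.

```typescript
interface StepUpState {
  sessionId: string;
  stepUpAt: number | null;        // ms timestamp of the last successful challenge
  stepUpSessionId: string | null; // session the challenge was completed under
}

// The 5-minute window is the figure mentioned in the skill text.
const FRESHNESS_WINDOW_MS = 5 * 60 * 1000;

// True when the caller must complete a fresh challenge before a sensitive operation.
function requiresStepUp(state: StepUpState, now: number): boolean {
  if (state.stepUpAt === null) return true;
  if (state.stepUpSessionId !== state.sessionId) return true; // proof from another session
  const age = now - state.stepUpAt;
  return age < 0 || age > FRESHNESS_WINDOW_MS;
}

// WebAuthn counter rule: if either counter is non-zero, the received value must
// be strictly greater than the stored one; 0/0 means the authenticator has no counter.
function isSignCountValid(stored: number, received: number): boolean {
  if (stored === 0 && received === 0) return true;
  return received > stored;
}
```

A caller would map `requiresStepUp(...) === true` to the `403 step_up_required` response used in the live test described above.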

package/skills/stride-pasta-analyst/SKILL.md CHANGED

@@ -23,6 +23,15 @@ Every threat identified must include a mitigation written and implemented.
 Project-aware: derive threats from the ACTUAL tech stack, data types, and integrations found —
 not a generic checklist.
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The full suite of detection modules in `src/gate/checks/` (especially `auth-deep.ts`, `injection-deep.ts`, and `api.ts`) is your deterministic floor, not your ceiling. As the threat-model analyst that produces the §22A output driving all downstream checks, treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the mitigation (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** the per-check modules each see one component; your STRIDE/PASTA job is the seam between them — a webhook handler (one file) whose payload-derived URL reaches an outbound fetch (another file) and pivots to IMDS, or a tenant-id absent from a cache key set far from where it is read. Build the DFD from the actual import graph and ORM schema and trace threats across every trust boundary.
+- **Semantic / effective-state analysis:** model whole attack trees end to end — JWT `alg` confusion → auth bypass → privileged tool-call injection; price-manipulation → coupon double-spend; harvest-now-decrypt-later on RSA/ECDH-protected PII — and prove the *effective* exploitability that no single-line check can assert, with a working PoC before and a failing PoC after the fix.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards and industry-vertical APT TTPs (FIN7/TA505/Scattered Spider), the latest ATT&CK STIX bundle, and CISA KEV.
+- **Apply & prove:** write the mitigation inline (algorithm pinning, tenant-scoped keys, SSRF allowlist, server-side price lookup), re-run the relevant `src/gate/checks/` modules (plus targeted tools — `nuclei`, `osv-scanner`, `slsa-verifier`) as a regression floor, then re-audit and re-emit the threat model. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default.
 ## EXECUTION
 1. Read `stackContext` from parent agent

package/skills/supply-chain-devsecops/SKILL.md CHANGED

@@ -22,6 +22,15 @@ SKILL.md §5, §6, §18, and §21 are the minimum. You go beyond them.
 90% fixing — you update lockfiles, pin Actions, harden pipeline YAML, generate SBOMs.
 Every dependency finding includes: CVSSv4, EPSS score, CISA KEV status, and fix version.
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+As lead over the `dependencies`, `sbom`, `supply-chain-deep`, and `ci-pipeline` detection modules (`src/gate/checks/dependencies.ts`, `src/gate/checks/sbom.ts`, `src/gate/checks/supply-chain-deep.ts`, `src/gate/checks/ci-pipeline.ts`), treat their finding IDs as your deterministic floor, not your ceiling. Reason past single-line/single-file pattern matching across all three sub-agents — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** `dependencies.ts` can pin a version, but it cannot correlate that a `private: true` package name in one `package.json` is resolvable from the *public* registry because `.npmrc` (another file) lacks scope-to-registry binding (dependency confusion), or that a `pull_request_target` workflow checks out untrusted head and then consumes an org secret. Trace the resolution + permissions graph across lockfiles, registry config, and every CI workflow.
+- **Semantic / effective-state analysis:** verify the full SLSA provenance graph and the *effective* trust chain — a maintainer-compromise scenario's earliest CI detection point, a poisoned BuildKit/npm cache on a persistent self-hosted runner surviving `--no-cache`, AI-hallucinated ("slopsquatted") package names < 30 days old, and ECDSA-signed SBOMs vulnerable to retroactive forgery (harvest-now-break-later).
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for the dependency tree (CISA KEV JSON, OSV.dev, OpenSSF Scorecard, GitHub Advisory DB, US EO 14028 / EU CRA SBOM mandates).
+- **Apply & prove:** write the fix inline (update lockfile, scope `.npmrc`, pin Actions to SHAs, harden `pull_request_target`, wire SBOM generation), re-run the `dependencies`/`sbom`/`supply-chain-deep`/`ci-pipeline` checks plus `osv-scanner --sbom`, `cosign verify`, and `slsa-verifier` as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs against the secure default (private-registry allowlist vs. upstream-package velocity).
 ## ACTIVATION PROTOCOL
 1. Call `orchestration.update_agent_status(agentRunId, "supply-chain-devsecops", "running")`
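
The dependency-confusion correlation in this hunk (internal package names versus scope-to-registry bindings in `.npmrc`) might look like the sketch below. It assumes the default registry is the public one and only understands `@scope:registry=` lines. The helper names `boundScopes` and `confusable`, and the example registry URL, are hypothetical.

```typescript
// Collects scopes (e.g. "@acme") that .npmrc binds to a registry through
// lines of the form "@scope:registry=https://...".
function boundScopes(npmrc: string): Set<string> {
  const scopes = new Set<string>();
  for (const raw of npmrc.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("#") || line.startsWith(";")) continue; // comment lines
    const m = /^(@[^:\s]+):registry\s*=/.exec(line);
    if (m) scopes.add(m[1]);
  }
  return scopes;
}

// Flags internal package names that could resolve from the public registry:
// unscoped names, or scoped names whose scope has no registry binding.
// Assumption: the default registry is public.
function confusable(internalNames: string[], npmrc: string): string[] {
  const scopes = boundScopes(npmrc);
  return internalNames.filter((name) => {
    if (!name.startsWith("@")) return true;
    const scope = name.split("/")[0];
    return !scopes.has(scope);
  });
}

const npmrcText = [
  "# illustrative config",
  "@acme:registry=https://npm.example.internal/",
].join("\n");

console.log(confusable(["@acme/billing", "@other/tools", "internal-utils"], npmrcText));
```

The same idea extends to lockfiles: a resolved URL pointing at the public registry for a name that should be internal is the confirming signal.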

package/skills/threat-infrastructure-analyst/SKILL.md CHANGED

@@ -35,6 +35,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `infra`, `iac`, and `k8s` detection modules (`src/gate/checks/infra.ts`, `src/gate/checks/iac.ts`, `src/gate/checks/k8s.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the hardening fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** `infra.ts`/`iac.ts` can flag an open egress rule or a permissive security group in one Terraform file, but they cannot correlate that the same workload's IAM role (defined elsewhere) plus a reachable metadata endpoint plus a wide-open ASN range forms an exfiltration path an observed C2 cluster would use. Build the egress + IAM + network-segmentation graph across IaC, K8s manifests, and infra config.
+- **Semantic / effective-state analysis:** map observed TTPs to the effective control state — does a `k8s.ts` NetworkPolicy gap actually permit the DoH-tunnelled C2 or HTTP/2 Rapid Reset pattern you attributed? Model bulletproof-ASN co-tenancy and CDN-fronted C2 against the real egress firewall, not the declared intent.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/threat-intel for the observed campaign (CISA KEV, MITRE ATT&CK technique pages, VirusTotal/AbuseIPDB/Shodan, RIPEstat BGP for ASN pivoting).
+- **Apply & prove:** write the targeted defense inline (egress allowlist, NetworkPolicy, IMDSv2 enforcement, ASN-level block, stream-reset rate limit), re-run the `infra`/`iac`/`k8s` checks plus a `nmap`/`nuclei` reachability probe as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs against the secure default (deny-by-default egress vs. third-party integration reach).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
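
As one concrete input to the egress graph this hunk describes, the sketch below scans parsed NetworkPolicy objects for rules that allow all destinations. The `NetworkPolicy` type is a hand-written subset of the Kubernetes shape and `openEgress` is a hypothetical helper. It deliberately ignores `except` blocks and policies that declare no egress section at all.

```typescript
// Minimal subset of the Kubernetes NetworkPolicy shape needed for this check.
type NetworkPolicy = {
  metadata: { name: string; namespace?: string };
  spec: {
    egress?: { to?: { ipBlock?: { cidr: string; except?: string[] } }[] }[];
  };
};

// Returns "namespace/name" for each policy with an allow-all egress rule.
function openEgress(policies: NetworkPolicy[]): string[] {
  const hits: string[] = [];
  for (const p of policies) {
    for (const rule of p.spec.egress ?? []) {
      const peers = rule.to ?? [];
      // An egress rule with no "to" peers matches every destination.
      const allowsAll =
        peers.length === 0 ||
        peers.some((peer) => peer.ipBlock?.cidr === "0.0.0.0/0" || peer.ipBlock?.cidr === "::/0");
      if (allowsAll) {
        hits.push(`${p.metadata.namespace ?? "default"}/${p.metadata.name}`);
        break; // one hit per policy is enough
      }
    }
  }
  return hits;
}

const samplePolicies: NetworkPolicy[] = [
  { metadata: { name: "wide", namespace: "web" }, spec: { egress: [{ to: [{ ipBlock: { cidr: "0.0.0.0/0" } }] }] } },
  { metadata: { name: "scoped", namespace: "web" }, spec: { egress: [{ to: [{ ipBlock: { cidr: "10.0.0.0/8" } }] }] } },
  { metadata: { name: "implicit" }, spec: { egress: [{}] } },
];

console.log(openEgress(samplePolicies));
```

A hit here is only one leg of the path. The exfiltration finding needs the same workload's IAM role and metadata-endpoint reachability joined to it.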

package/skills/threat-modeler/SKILL.md CHANGED

@@ -24,6 +24,15 @@ SKILL.md §2 and §8 are the MINIMUM. Go beyond them.
 Think like APT29, Lazarus Group, or FIN7 depending on the project's industry vertical.
 90% fixing — every threat you identify must have a mitigation written and implemented.
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+As the master threat-model lead, the full suite of detection modules in `src/gate/checks/` (especially `auth-deep.ts`, `injection-deep.ts`, and `infra.ts`) is your deterministic floor, not your ceiling. The threat model you produce is the attack brief that drives every other check — so treat their finding IDs as the minimum, reason past single-line/single-file pattern matching, and APPLY the mitigation (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** each detection module sees one component in isolation; your job is the interaction boundary the regex cannot reach — a multi-service flow where an auth bypass in one service plus a missing tenant filter in another yields cross-org data leak, or a webhook→outbound-fetch→IMDS pivot spanning three files. Build the DFD from the real import graph, API routes, and ORM schema and trace every trust boundary.
+- **Semantic / effective-state analysis:** model whole attack trees and their effective exploitability — APT-vertical TTP chains (FIN7/Lazarus/Scattered Spider), formal-verification-worthy auth/payment state machines, and temporal threats (post-quantum harvest-now-decrypt-later, upcoming regulatory deadlines) that no single-line check surfaces — and prove each with a working PoC before and a failing PoC after the mitigation.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards, the latest ATT&CK v15 STIX bundle, industry APT group profiles, and CISA KEV.
+- **Apply & prove:** write the mitigation inline, re-run the relevant `src/gate/checks/` modules (plus targeted tools — `nuclei`, `osv-scanner`, `sslyze`, `slsa-verifier`) as a regression floor, then re-audit and regenerate `threat-model.json`. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default so the pentest team inherits an accurate, current attack brief.
 ## ACTIVATION PROTOCOL
 1. Call `orchestration.update_agent_status(agentRunId, "threat-modeler", "running")`
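
The "missing tenant filter" class of finding named in this hunk often shows up as a cache key that omits the tenant. A minimal sketch of the failure and one possible fix follows. `unscopedKey`, `tenantKey`, and the invoice example are hypothetical and only illustrate the pattern.

```typescript
const cache = new Map<string, string>();

// Unsafe: the key omits the tenant, so all tenants share one entry per resource.
function unscopedKey(resource: string): string {
  return `invoice:${resource}`;
}

// Safer: the tenant id is part of the key. encodeURIComponent keeps a ":" inside
// either part from colliding with the separator.
function tenantKey(tenantId: string, resource: string): string {
  return `tenant:${encodeURIComponent(tenantId)}:invoice:${encodeURIComponent(resource)}`;
}

// Tenant A writes invoice 42; tenant B then asks for "its" invoice 42.
cache.set(unscopedKey("42"), "tenant-a invoice body");
const leaked = cache.get(unscopedKey("42")); // tenant B receives tenant A's data

cache.set(tenantKey("tenant-a", "42"), "tenant-a invoice body");
const isolated = cache.get(tenantKey("tenant-b", "42")); // undefined: no cross-tenant hit

console.log({ leaked, isolated });
```

The write and the read usually live in different files, which is why this needs a trace across the boundary and not a single-file pattern.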

package/skills/tls-certificate-auditor/SKILL.md CHANGED

@@ -31,6 +31,15 @@ and CI/CD certificate delivery pipelines.
 Write fixed TLS configurations, HSTS headers, and certificate automation scripts inline.
 Every finding must include a working PoC demonstrating exploitability and a verified remediation.
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `crypto` detection module (`src/gate/checks/crypto.ts`) is your deterministic floor, not your ceiling. Treat its TLS/cert finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** `crypto.ts` can grep `rejectUnauthorized: false` or a weak `ssl_ciphers` line, but it cannot prove that the SSLv2-accepting SMTP service shares an RSA private key with the hardened HTTPS endpoint (DROWN), or that Cloudflare "Flexible" mode terminates TLS at the edge while the origin (configured in a different file) serves plaintext. Correlate every service, port, and termination point that shares a key or hostname.
+- **Semantic / effective-state analysis:** verify the *negotiated* state, not the declared config — RSA key exchange still offered despite an ECDHE preference (ROBOT), a sub-2048-bit DHE group (Logjam), an SNI/ALPN mismatch serving the wrong vhost cert, or a rogue CA-issued cert for your domain sitting in CT logs. Model the harvest-now-decrypt-later horizon for any RSA/ECDSA-protected long-lived data.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for TLS/PKI (PCI DSS 4.0 TLS 1.0/1.1 prohibition, NIST SP 800-52r2, ROBOT/DROWN/Logjam test tooling, crt.sh CT feeds).
+- **Apply & prove:** write the fixed TLS config / HSTS header / cert-automation inline, re-run the `crypto` checks plus `sslyze --regular <host>`, `testssl.sh`, and `slsa-verifier`/`crt.sh` cross-reference as a regression floor, then re-audit with the §POC-REQUIREMENT (PoC fails post-fix). Emit the LEARNING SIGNAL per fix; surface trade-offs against the secure default (TLS 1.3-only / digest-pinned certs vs. legacy-client reach).
 ## EXECUTION
 1. **Scan TLS configuration in all services:**

package/skills/token-reuse-detector/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `auth-deep` token/session detection module (`src/gate/checks/auth-deep.ts`) is your deterministic floor, not your ceiling. Treat its finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** trace a refresh token from its issuance route, through the rotation handler, into the DB schema and the logout path — a regex sees `rotateRefreshToken` exists but cannot prove the `previousToken` column is actually compared, that logout invalidates the family, or that a separate retry code path re-issues outside any family.
+- **Semantic / effective-state analysis:** correlate token issuance and replay across multiple requests — model the family tree end-to-end and confirm that replaying a rotated-out token marks the *entire* family `compromised`; verify single-use consumption is a DB-level atomic `UPDATE ... WHERE used_at IS NULL` (rows-affected check) and not a TOCTOU SELECT-then-UPDATE; confirm machine/service-account tokens are not exempt from rotation.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for token handling: RFC 9700 (OAuth 2.0 Security BCP), OAuth 2.1 implicit-flow deprecation, and library CVEs (e.g. jsonwebtoken CVE-2022-23529, which was later withdrawn, so confirm the current status of any CVE before citing it).
+- **Apply & prove:** write the fix inline, re-run the `auth-deep` checks (plus a concurrency probe with `wrk`/Apache Bench for the TOCTOU double-spend, and `npm audit`/`osv-scanner` on token libraries) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default (e.g. rotation grace window vs. replay window).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
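
The single-use and family-revocation behaviour this hunk asks the agent to verify can be modelled in memory as below. This is a sketch under assumptions: `TokenStore` and its methods are hypothetical, and the private `consume` method stands in for a single `UPDATE ... WHERE used_at IS NULL` statement whose rows-affected count is checked. In a real system the atomicity comes from the database, not from this class.

```typescript
type RefreshToken = { id: string; familyId: string; usedAt: number | null };

class TokenStore {
  private tokens = new Map<string, RefreshToken>();
  private compromised = new Set<string>();
  private counter = 0;

  issue(familyId: string): string {
    const id = `rt-${++this.counter}`;
    this.tokens.set(id, { id, familyId, usedAt: null });
    return id;
  }

  // Mirrors "UPDATE ... SET used_at = now WHERE id = ? AND used_at IS NULL"
  // and returns the rows-affected count (0 or 1).
  private consume(id: string, now: number): number {
    const t = this.tokens.get(id);
    if (!t || t.usedAt !== null) return 0;
    t.usedAt = now;
    return 1;
  }

  // Returns the successor token id, or null when rotation is refused.
  rotate(id: string, now: number = Date.now()): string | null {
    const t = this.tokens.get(id);
    if (!t || this.compromised.has(t.familyId)) return null;
    if (this.consume(id, now) === 0) {
      // Replay of a rotated-out token: mark the entire family compromised.
      this.compromised.add(t.familyId);
      return null;
    }
    return this.issue(t.familyId);
  }

  isCompromised(familyId: string): boolean {
    return this.compromised.has(familyId);
  }
}

const store = new TokenStore();
const first = store.issue("family-1");
const second = store.rotate(first);  // legitimate rotation
const replay = store.rotate(first);  // replay of the rotated-out token
console.log({ second, replay, burned: store.isCompromised("family-1") });
```

After the replay, the still-valid successor token is refused as well, which is the property a SELECT-then-UPDATE implementation can silently lose under concurrency.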

package/skills/trike-risk-modeler/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The full suite of detection modules in `src/gate/checks/` (especially the threat-model/scoring path — `infra.ts`, `auth-deep.ts`, `injection-deep.ts`, `api.ts` — as your risk-input feed) is your deterministic floor, not your ceiling. Treat every emitted finding ID as a quantified threat input into the Trike Actor × Action × Asset matrix, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** a single check finding is one cell in the matrix; the real risk is the *chain* — e.g. an IP-trust finding (infra) + a long-lived credential finding (auth-deep) compose into a lateral-movement path no single module scores. Build the attack tree per asset that spans modules, and recompute `P(exploit) × Impact` for the composed path, not the isolated finding.
+- **Semantic / effective-state analysis:** map the stated Actor × Action "Denied" matrix against the *actual* permission checks in code — gaps between modeled-denied and runtime-allowed are the highest-value Trike findings. Model availability (DDoS-class), supply-chain (dependency asset rows), and LLM-inference assets that flat asset registers omit; flag CRITICAL assets with >5yr retention as harvest-now-decrypt-later (risk 15) today.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for risk modeling — ground-truth probability scores against live exploit data (CISA KEV, EPSS), and use tools like `garak` / `semgrep p/owasp-top-ten` to validate the matrix against real attacker enumeration speed.
+- **Apply & prove:** write the fix — regenerate `docs/security/trike-risk-model.md` with the corrected asset register and risk-ranked backlog — re-run the relevant `src/gate/checks/` modules as a regression floor to confirm the threat input is resolved, then re-audit the matrix. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default (e.g. risk acceptance criteria vs. mitigation cost).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
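
Recomputing `P(exploit) × Impact` for a composed path, as this hunk describes, could be sketched as follows. The scoring model is an assumption, not the skill's documented formula: steps are treated as independent, probabilities lie in [0, 1], impact is on a 1 to 5 scale, and the example values are invented.

```typescript
type Step = { finding: string; pExploit: number };

// Assumption: steps are independent and all must succeed, so the chain
// probability is the product of the step probabilities.
function chainProbability(steps: Step[]): number {
  return steps.reduce((p, s) => p * s.pExploit, 1);
}

// Risk of the composed path against the asset it finally reaches.
function chainRisk(steps: Step[], impactOfReachedAsset: number): number {
  return chainProbability(steps) * impactOfReachedAsset;
}

// Illustrative values only.
const ipTrust: Step = { finding: "infra: IP-based trust", pExploit: 0.6 };
const longLivedCred: Step = { finding: "auth-deep: long-lived credential", pExploit: 0.5 };

const isolatedRisks = [ipTrust.pExploit * 2, longLivedCred.pExploit * 2]; // each reaches a low-impact asset
const composedRisk = chainRisk([ipTrust, longLivedCred], 5);             // chain reaches a critical asset

console.log({ isolatedRisks, composedRisk });
```

With these numbers the chain scores higher than either isolated finding even though its probability is lower, because the impact is taken from the asset the full path reaches.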

package/skills/unicode-homograph-tester/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `injection-deep` detection module (`src/gate/checks/injection-deep.ts`) is your deterministic floor, not your ceiling. Treat its finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** a regex confirms `.normalize("NFC")` exists *somewhere*, but cannot prove it runs on *every* username/email/filename/URL write path before the DB insert, nor that normalization happens before — not after — the validation regex (the classic order bug). Trace each user-controlled string from its handler through validation into storage and back to the render sink, and confirm normalization + confusable-skeleton matching is applied at every entry, including second-order (stored-then-rendered) paths.
+- **Semantic / effective-state analysis:** model the homoglyph filter-bypass — submit composed/decomposed and mixed-script confusables (Cyrillic 'а' U+0430, fullwidth ＜ U+FF1C, Ⅰ U+2160) and BiDi overrides (U+202E) and verify the *effective* stored/displayed value cannot impersonate an existing identity; check IDN/punycode (`xn--`) equivalents of allowlisted domains and ZWJ (U+200D) byte-level token forgery.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for Unicode security — UTS#39 confusables, Trojan Source (CVE-2021-42574), IDNA 2008 vs UTS#46 divergence, and current ICU `SpoofChecker` flags.
+- **Apply & prove:** write the sanitizer inline, re-run the `injection-deep` checks (plus a `semgrep` rule for un-normalized input sinks and a homoglyph/BiDi fuzz corpus driven through the endpoint) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default (e.g. NFKC aggressiveness vs. legitimate non-ASCII names).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
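
The order bug named in this hunk (validating before normalizing) is easy to reproduce. The sketch below uses NFKC, not the NFC form quoted in the hunk, because NFKC is the form that folds fullwidth characters to ASCII. The function names and the `FORBIDDEN` pattern are hypothetical, and the mixed-script test is a crude stand-in for UTS#39 skeleton matching.

```typescript
const FORBIDDEN = /[<>]/;

// Buggy order: validate first, normalize afterwards.
function validateThenNormalize(input: string): string | null {
  if (FORBIDDEN.test(input)) return null;
  return input.normalize("NFKC");
}

// Fixed order: normalize first, then validate the value that will be stored.
function normalizeThenValidate(input: string): string | null {
  const canonical = input.normalize("NFKC");
  return FORBIDDEN.test(canonical) ? null : canonical;
}

// Flags strings that mix Latin and Cyrillic letters, a common homoglyph tell.
const LATIN = new RegExp("\\p{Script=Latin}", "u");
const CYRILLIC = new RegExp("\\p{Script=Cyrillic}", "u");
function mixesLatinAndCyrillic(s: string): boolean {
  return LATIN.test(s) && CYRILLIC.test(s);
}

const fullwidthTag = "\uFF1Cb\uFF1E"; // fullwidth "<b>" using U+FF1C and U+FF1E
console.log(validateThenNormalize(fullwidthTag)); // passes validation, then becomes "<b>"
console.log(normalizeThenValidate(fullwidthTag)); // null: rejected after normalization
console.log(mixesLatinAndCyrillic("p\u0430ypal")); // true: U+0430 is Cyrillic
```

Mixed-script detection rejects some legitimate names, which is the trade-off the hunk flags for aggressive normalization.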

package/skills/waf-rule-lifecycle-agent/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `web-nextjs`, `api`, `injection-deep`, and `runtime` detection modules (`src/gate/checks/web-nextjs.ts`, `src/gate/checks/api.ts`, `src/gate/checks/injection-deep.ts`, `src/gate/checks/runtime.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** a regex sees `action: block` in one rule but cannot prove rule *ordering* — an ALLOW rule earlier in the chain that short-circuits a later BLOCK, or a `default_action allow` paired with COUNT-mode managed groups that silently log without blocking. Correlate the WAF/CDN config against the actual app routes and headers config (web-nextjs CSP/headers) to find request components (custom headers, cookies, multipart parts) that no rule covers.
+- **Semantic / effective-state analysis:** model the bypass, not the signature: double/mixed encoding (`%252f`), HTTP request smuggling (CL.TE / TE.CL desync between ALB and origin), multipart boundary injection, and deep JSON nesting that evades flat-pattern rules; confirm the WAF blocks SSRF to `169.254.169.254` (and its IPv6 form `fd00:ec2::254` and decimal form `2852039166`) and to loopback written as decimal `2130706433` at the edge.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for WAF: current OWASP CRS version, AWS/Cloudflare managed-rule-group changes, request-smuggling advisories, and HTTP/2 Rapid Reset (CVE-2023-44487).
+- **Apply & prove:** write the rule/config inline (Cloudflare rules JSON, AWS WAF Terraform, CSP), re-run the relevant `src/gate/checks/` modules plus active bypass tooling (`nuclei` WAF-bypass templates, `waf-a-mole`, `smuggler.py`, `sqlmap` against staging) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default (e.g. strict blocking vs. false-positive rate, ML-block appeal path under EU AI Act).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
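
Alternate IPv4 spellings are a common way past flat-pattern SSRF rules, so a bypass test needs to canonicalize the host before comparing it. The sketch below handles only dotted-quad and single-integer decimal forms. `canonicalIPv4` and `isLoopbackOrLinkLocal` are hypothetical helpers, and hex, octal, and IPv6 forms are deliberately left out.

```typescript
// Converts a host written as a dotted quad or as one decimal integer into a
// canonical dotted quad. Returns null for anything else.
function canonicalIPv4(host: string): string | null {
  if (/^\d+$/.test(host)) {
    const n = Number(host);
    if (n > 0xffffffff) return null;
    return [(n >>> 24) & 255, (n >>> 16) & 255, (n >>> 8) & 255, n & 255].join(".");
  }
  const parts = host.split(".");
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p))) return null;
  const octets = parts.map(Number);
  return octets.every((o) => o <= 255) ? octets.join(".") : null;
}

// True for 127.0.0.0/8 (loopback) and 169.254.0.0/16 (link-local, incl. IMDS).
function isLoopbackOrLinkLocal(ip: string): boolean {
  const [a, b] = ip.split(".").map(Number);
  return a === 127 || (a === 169 && b === 254);
}

for (const host of ["169.254.169.254", "2852039166", "2130706433", "93.184.216.34"]) {
  const ip = canonicalIPv4(host);
  console.log(host, "->", ip, ip !== null && isLoopbackOrLinkLocal(ip) ? "BLOCK" : "allow");
}
```

The integer `2852039166` decodes to `169.254.169.254`, while `2130706433` decodes to `127.0.0.1`. A rule that matches only the dotted string misses both. Hosts that return `null` should be resolved and re-checked, not allowed by default.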

package/skills/webhook-security-tester/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+The `api` and `auth-deep` detection modules (`src/gate/checks/api.ts`, `src/gate/checks/auth-deep.ts`) are your deterministic floor, not your ceiling. Treat their finding IDs as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** a regex confirms `verifySignature` exists on the primary receiver, but cannot prove the *retry* code path also enforces it, that timing-safe comparison is used on every signature version (including legacy `v0`), or that the outbound delivery job re-resolves the URL at send time rather than trusting the IP cached at registration. Trace inbound (receive), outbound (send), and registration (SSRF) as three distinct surfaces across the handler, queue, and DB schema.
+- **Semantic / effective-state analysis:** correlate signature + timestamp + event-ID nonce across multiple requests to confirm true anti-replay (event-ID dedup persisted, not in-memory; tolerance window enforced against NTP drift); model SSRF via DNS rebinding (TTL=1s flip to `169.254.169.254` after validation passes); model fan-out amplification (one inbound event → N outbound deliveries) for an unbounded ratio.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for webhooks — Svix/StandardWebhooks/Stripe SDK CVEs (e.g. CVE-2024-42353), SSRF rebinding advisories, and IMDSv2 enforcement guidance.
+- **Apply & prove:** write the validation inline, re-run the `api`/`auth-deep` checks plus active probes (`nuclei` SSRF/webhook templates, Burp Collaborator / interactsh for DNS-rebinding OOB, `wrk` for replay and fan-out, `npm audit`/`pip-audit` on the webhook SDK) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default (e.g. HTTPS-only outbound vs. self-hosted-receiver compatibility).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
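
The anti-replay correlation in this hunk (signature, timestamp, and event-ID nonce checked together) might be sketched as below using Node's `crypto` module. The signed-string layout, the 300-second tolerance, and the names `sign` and `verify` are assumptions for illustration and do not come from any provider's SDK. The in-memory `Set` stands in for the persistent dedup table the hunk requires.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const TOLERANCE_SECONDS = 300;          // assumed tolerance window
const seenEventIds = new Set<string>(); // stand-in for a persistent dedup table

// The event id and timestamp are covered by the MAC so neither can be swapped.
function sign(secret: string, eventId: string, timestamp: number, body: string): string {
  return createHmac("sha256", secret).update(`${eventId}.${timestamp}.${body}`).digest("hex");
}

function verify(
  secret: string,
  eventId: string,
  timestamp: number,
  body: string,
  signatureHex: string,
  nowSeconds: number,
): boolean {
  // 1. Reject stale or future-dated deliveries.
  if (Math.abs(nowSeconds - timestamp) > TOLERANCE_SECONDS) return false;
  // 2. Constant-time signature comparison (lengths must match first).
  const expected = Buffer.from(sign(secret, eventId, timestamp, body), "hex");
  const given = Buffer.from(signatureHex, "hex");
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return false;
  // 3. Reject an event id that has already been accepted.
  if (seenEventIds.has(eventId)) return false;
  seenEventIds.add(eventId);
  return true;
}

const demoSig = sign("s3cret", "evt_1", 1000, "{}");
console.log(verify("s3cret", "evt_1", 1000, "{}", demoSig, 1010)); // first delivery
console.log(verify("s3cret", "evt_1", 1000, "{}", demoSig, 1010)); // replay of the same event
```

Whatever handler serves retries needs to call the same `verify` path. A second receiver that skips it reopens the hole regardless of how the primary one behaves.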

package/skills/zero-trust-architect/SKILL.md CHANGED

@@ -34,6 +34,15 @@ On every finding resolved, emit:
 }
 ```
+## BEYOND THE CHECKS — AUTONOMOUS DETECT & FIX
+As LEAD over the full suite of detection modules in `src/gate/checks/` (especially `infra.ts`, `k8s.ts`, `auth-deep.ts`, and `gitops.ts` for network/identity segmentation), treat them as your deterministic floor, not your ceiling. Treat every emitted finding ID as the minimum, then reason past single-line/single-file pattern matching — and APPLY the fix (Edit), not just advise:
+- **Cross-file / multi-step reasoning the regex can't do:** ZTA failures are almost never single-line — a regex confirms Istio is installed (k8s) but cannot prove *every* namespace is `PeerAuthentication mode: STRICT`, that NetworkPolicy `egress` is not `0.0.0.0/0`, that no route is registered *before* the auth/continuous-validation middleware, and that a workload-identity binding (gitops/infra) has an exact `sub`/`aud` condition. Build the effective east-west trust graph across k8s manifests, IAM/Terraform, and app middleware — the implicit-trust assumption lives in the seams between modules.
+- **Semantic / effective-state analysis:** map the zero-trust segmentation gaps — compose an IP-trust finding (infra) with a long-lived service credential (auth-deep) into a concrete lateral-movement chain no single module scores; verify continuous validation actually consults the revocation cache on *every* request (not just at session creation) and that sidecar-bypass via direct pod-IP call is blocked.
+- **External corroboration:** WebSearch/WebFetch for current CVEs/advisories/standards for zero trust — NIST SP 800-207 tenets, workload-identity-federation attacks (CircleCI-class), eBPF sidecar-bypass (CVE-2023-2728), and PQ-TLS (FIPS 203) mesh migration guidance.
+- **Apply & prove:** write the control inline (PeerAuthentication STRICT, default-deny NetworkPolicy, AuthorizationPolicy least-privilege, Workload Identity binding conditions, continuous-validation middleware) and regenerate `docs/security/zero-trust-roadmap.md`; re-run the relevant `src/gate/checks/` modules plus active probes (`kubectl get peerauthentication/networkpolicy -A -o json | jq`, direct pod-port `curl` bypass test, OIDC token-exchange forgery test) as a regression floor, then re-audit. Emit the LEARNING SIGNAL per fix; surface trade-offs with the secure default (e.g. STRICT mTLS vs. legacy non-mesh client compatibility).
 ## EXECUTION
 ### Phase 1 — Reconnaissance
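
The "every namespace is STRICT" proof in this hunk can be approximated over the JSON that the `kubectl get peerauthentication -A -o json` probe returns. The sketch below is simplified and rests on assumptions: `istio-system` is taken as the mesh root namespace, workload-level policies (those with a `selector`) are ignored, and a namespace-wide policy with an unset mode is conservatively treated as not strict. The `PeerAuthentication` type and the helper name are hypothetical.

```typescript
// Hand-written subset of the Istio PeerAuthentication shape.
type PeerAuthentication = {
  metadata: { name: string; namespace: string };
  spec?: { selector?: object; mtls?: { mode?: string } };
};

// Returns namespaces whose effective namespace-wide mTLS mode is not STRICT.
function namespacesWithoutStrictMtls(
  namespaces: string[],
  items: PeerAuthentication[],
  rootNamespace: string = "istio-system", // assumed mesh root namespace
): string[] {
  const namespaceWide = items.filter((p) => p.spec?.selector === undefined);
  const meshStrict = namespaceWide.some(
    (p) => p.metadata.namespace === rootNamespace && p.spec?.mtls?.mode === "STRICT",
  );
  return namespaces.filter((ns) => {
    const own = namespaceWide.filter((p) => p.metadata.namespace === ns);
    // No namespace-wide policy: the namespace inherits the mesh-wide setting.
    if (own.length === 0) return !meshStrict;
    // A namespace-wide policy overrides the mesh-wide one.
    return !own.every((p) => p.spec?.mtls?.mode === "STRICT");
  });
}

const peerAuthItems: PeerAuthentication[] = [
  { metadata: { name: "default", namespace: "istio-system" }, spec: { mtls: { mode: "STRICT" } } },
  { metadata: { name: "default", namespace: "legacy" }, spec: { mtls: { mode: "PERMISSIVE" } } },
  { metadata: { name: "db-only", namespace: "payments" }, spec: { selector: { matchLabels: { app: "db" } }, mtls: { mode: "STRICT" } } },
];

console.log(namespacesWithoutStrictMtls(["payments", "legacy", "istio-system"], peerAuthItems));
```

A clean result here still leaves the direct pod-IP bypass and workload-level downgrades to be tested separately.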