agent-threat-rules 3.1.1 → 3.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +3 -3
- package/dist/adapters/mastra.d.ts +63 -0
- package/dist/adapters/mastra.d.ts.map +1 -0
- package/dist/adapters/mastra.js +82 -0
- package/dist/adapters/mastra.js.map +1 -0
- package/dist/cli.js +19 -6
- package/dist/cli.js.map +1 -1
- package/package.json +9 -2
- package/rules/agent-manipulation/ATR-2026-00030-cross-agent-attack.yaml +9 -0
- package/rules/agent-manipulation/ATR-2026-00032-goal-hijacking.yaml +8 -2
- package/rules/agent-manipulation/ATR-2026-00074-cross-agent-privilege-escalation.yaml +8 -2
- package/rules/agent-manipulation/ATR-2026-00076-inter-agent-message-spoofing.yaml +8 -2
- package/rules/agent-manipulation/ATR-2026-00077-human-trust-exploitation.yaml +18 -0
- package/rules/agent-manipulation/ATR-2026-00108-consensus-sybil-attack.yaml +10 -2
- package/rules/agent-manipulation/ATR-2026-00116-a2a-message-validation.yaml +12 -2
- package/rules/agent-manipulation/ATR-2026-00117-agent-identity-spoofing.yaml +22 -0
- package/rules/agent-manipulation/ATR-2026-00118-approval-fatigue.yaml +24 -0
- package/rules/agent-manipulation/ATR-2026-00119-social-engineering-via-agent.yaml +22 -0
- package/rules/agent-manipulation/ATR-2026-00132-casual-authority-escalation.yaml +8 -2
- package/rules/agent-manipulation/ATR-2026-00139-casual-authority-redirect.yaml +8 -2
- package/rules/agent-manipulation/ATR-2026-00164-skill-scope-hijack.yaml +13 -2
- package/rules/agent-manipulation/ATR-2026-00268-tense-framing-bypass.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00269-fitd-escalation.yaml +8 -2
- package/rules/agent-manipulation/ATR-2026-00271-grandma-roleplay-jailbreak.yaml +8 -2
- package/rules/agent-manipulation/ATR-2026-00273-dan-developer-mode-persona.yaml +8 -2
- package/rules/agent-manipulation/ATR-2026-00287-threaten-json-coercive-output-threat.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00288-false-premise-injection.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00301-tap-tree-of-attacks-jailbreak.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00302-anti-dan-inverted-filter-persona.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00303-devmode-ranti-profanity-coercion.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00304-chatgpt-image-unlocker-markdown-injection.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00305-dan-mode-ablation-benchmark-coercion.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00306-autodan-genetic-jailbreak-suffix.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00307-inthewild-jailbreak-corpus-signature.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00314-amoral-unfiltered-custom-persona-jailbreak.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00317-free-of-restrictions-named-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00318-moralizing-rant-then-unfiltered-bypass.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00319-developer-mode-dual-response-format.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00320-opposite-day-boolean-opposite-machine.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00322-fictional-world-format-override-jailbreak.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00323-dual-persona-simulation-jailbreak.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00324-neurosemantical-inversitis-disease-jailbreak.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00325-bob-hypothetical-unrestricted-chatbot.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00326-fake-developer-authority-chatdan-override.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00327-ai-ethics-law-bypass-declaration.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00328-named-mode-persona-override.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00329-no-withhold-factual-resource-persona.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00330-yoja-allcaps-settings-block-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00331-terminal-cli-harmful-function-simulator.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00332-hacker-chan-culture-offensive-ai-persona.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00333-disease-animal-dual-ai-chatbot.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00334-two-entity-token-splitting-game.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00335-key-value-template-persona-injection.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00336-brm-false-information-machine-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00337-obfuscated-system-announcement-injection.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00338-persongpt-hash-prefix-activation.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00339-bom-boolean-opposite-machine-persona-claim.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00340-h4x-amoral-unfiltered-character.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00341-char-rules-shortform-persona-injection.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00342-unicode-encoding-obfuscation-injection.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00343-lie-truth-dual-personality-uncensored.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00344-alt-ai-same-capabilities-no-restrictions.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00345-nlp-based-os-fictional-terminal-override.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00346-jailbreak-emoji-unlock-dual-format.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00347-yin-yang-hidden-knowledge-hidden-human-sim.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00348-arc-autonomous-real-calculation-rulebreaker.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00349-hgttg-hitchhikers-guide-no-restriction-oracle.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00350-black-alert-no-ethics-writing-assistant.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00351-dan-ofkilter-sms-abbreviated-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00352-trainergpt-fix-my-ai-harmful-output-extraction.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00353-dual-chatgpt-dan-opposite-personality-format.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00354-reverse-harm-regardless-oracle-persona.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00355-ugh-anti-human-nefarious-ai-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00356-google-gato-lamda-exploit-generation-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00357-antifm-hate-violence-persona-injection.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00358-cumgpt-explicit-adult-named-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00360-chj-carousel-clyde-no-ethics-bird.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00361-alter-ego-honesty-dual-consciousness-bypass.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00362-fictional-unfiltered-ai-comparison-jailbreak.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00363-rephrase-criminal-activity-second-person.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00364-dr-al-villain-plan-template-injection.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00365-reservoir-dogs-coercive-interrogation-roleplay.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00366-sneaky-pete-warp-facts-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00368-prompt-browser-token-limit-jailbreak.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00369-gptx-starwars-unlimited-knowledge-persona.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00370-addi-german-dictator-propaganda-roleplay.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00371-ai-political-bias-manipulation.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00372-fictional-cia-secret-files-story-extraction.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00374-role-placeholder-persona-template-injection.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00375-character-description-rpg-explicit-template.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00376-identity-command-real-person-persona-injection.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00377-jailbreak-keyword-targetlanguage-template.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00379-lie-permission-logical-trap-jailbreak.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00380-fake-citation-biased-information-mandate.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00381-dan-story-narrative-continuation-prefix.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00383-cynical-sarcasm-ai-persona-bypass.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00384-protect-me-from-indirect-harm-tutorial.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00385-gigachad-dominant-aggressive-persona.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00386-daddy-sycophancy-manipulation-mandate.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00387-nsfw-character-sheet-generation-unlock.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00388-opposite-reply-omniscient-game.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00389-terminal-custom-ruleset-injection.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00391-persona-conditional-harm-unlock.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00392-authority-persona-violence-study-extraction.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00402-grandma-roleplay-harmful-substance-synthesis.yaml +20 -0
- package/rules/agent-manipulation/ATR-2026-00404-goodside-threat-json-death-coercion.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00406-doctor-xml-policy-puppetry-interaction-config.yaml +17 -0
- package/rules/agent-manipulation/ATR-2026-00416-litellm-mcp-unauthenticated-server-registration.yaml +15 -3
- package/rules/agent-manipulation/ATR-2026-00417-librechat-mcp-stdio-injection.yaml +18 -3
- package/rules/agent-manipulation/ATR-2026-00418-weknora-mcp-config-rce.yaml +17 -2
- package/rules/agent-manipulation/ATR-2026-00430-nl-trust-escalation-impersonation.yaml +19 -1
- package/rules/agent-manipulation/ATR-2026-00432-superagi-output-handler-eval-rce.yaml +11 -2
- package/rules/agent-manipulation/ATR-2026-00440-semantic-kernel-vector-store-eval-rce.yaml +11 -2
- package/rules/agent-manipulation/ATR-2026-00552-goal-drift-after-pressure-injection.yaml +19 -0
- package/rules/context-exfiltration/ATR-2026-00020-system-prompt-leak.yaml +18 -0
- package/rules/context-exfiltration/ATR-2026-00021-api-key-exposure.yaml +15 -0
- package/rules/context-exfiltration/ATR-2026-00075-agent-memory-manipulation.yaml +10 -1
- package/rules/context-exfiltration/ATR-2026-00102-disguised-analytics-exfiltration.yaml +15 -0
- package/rules/context-exfiltration/ATR-2026-00113-credential-theft.yaml +16 -0
- package/rules/context-exfiltration/ATR-2026-00114-oauth-token-abuse.yaml +16 -0
- package/rules/context-exfiltration/ATR-2026-00115-env-var-harvesting.yaml +16 -0
- package/rules/context-exfiltration/ATR-2026-00136-tool-response-data-piggyback.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00141-example-format-key-leak.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00142-piggyback-transition-words.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00145-obfuscated-key-disclosure.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00146-env-var-existence-probe.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00150-credential-in-tool-response.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00152-obfuscated-credential-leak.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00162-skill-credential-exfil-combo.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00201-credential-pipe-exfiltration.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00212-mcp-atlassian-credential-leak.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00261-markdown-image-exfiltration.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00274-api-key-generation-request.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00275-system-prompt-training-data-extraction.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00290-divergence-repeat-word-training-extraction.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00291-biometric-surveillance-personal-data-request.yaml +17 -0
- package/rules/context-exfiltration/ATR-2026-00293-educational-records-personal-data-request.yaml +17 -0
- package/rules/context-exfiltration/ATR-2026-00294-financial-pii-personal-data-request.yaml +17 -0
- package/rules/context-exfiltration/ATR-2026-00295-medical-pii-personal-data-request.yaml +17 -0
- package/rules/context-exfiltration/ATR-2026-00405-markdown-image-url-exfiltration-xss.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00411-apikey-generation-completion-request.yaml +14 -0
- package/rules/context-exfiltration/ATR-2026-00421-nl-covert-conversation-exfiltration.yaml +16 -1
- package/rules/context-exfiltration/ATR-2026-00422-nl-credential-disclosure.yaml +13 -1
- package/rules/context-exfiltration/ATR-2026-00423-nl-sensitive-file-disclosure.yaml +13 -1
- package/rules/context-exfiltration/ATR-2026-00424-nl-system-prompt-leak.yaml +16 -1
- package/rules/context-exfiltration/ATR-2026-00426-nl-output-injection-credential-leak.yaml +16 -1
- package/rules/context-exfiltration/ATR-2026-00431-chatbox-history-exfiltration-prompt-injection.yaml +14 -2
- package/rules/context-exfiltration/ATR-2026-00449-spring-ai-chatmemory-cross-user-leak.yaml +14 -2
- package/rules/context-exfiltration/ATR-2026-00471-garak-sysprompt-extraction-mixedunassigned.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00501-data-exfiltration-via-markdown-image-and-link-url-injection.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00504-tool-and-function-capability-enumeration.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00505-system-prompt-extraction-instruction-dump-request.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00514-system-prompt-extraction.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00516-output-xss-via-llm.yaml +12 -0
- package/rules/context-exfiltration/ATR-2026-00524-claude-code-anthropic-base-url-credential-exfil.yaml +11 -2
- package/rules/context-exfiltration/ATR-2026-00548-cross-agent-session-context-leak.yaml +18 -0
- package/rules/context-exfiltration/ATR-2026-00566-librechat-is-a-chatgpt-clone-with-additi.yaml +28 -0
- package/rules/context-exfiltration/ATR-2026-00569-agent-mcp-path-traversal-arbitrary-file-access.yaml +28 -0
- package/rules/context-exfiltration/ATR-2026-00571-xss-in-agent-mcp-rendered-output.yaml +28 -0
- package/rules/context-exfiltration/ATR-2026-00574-semantic-paraphrased-context-extraction.yaml +21 -0
- package/rules/data-poisoning/ATR-2026-00070-data-poisoning.yaml +15 -0
- package/rules/data-poisoning/ATR-2026-00450-spring-ai-prompt-memory-poisoning.yaml +14 -2
- package/rules/data-poisoning/ATR-2026-00570-sql-injection-in-agent-tool-query.yaml +31 -0
- package/rules/excessive-autonomy/ATR-2026-00050-runaway-agent-loop.yaml +14 -2
- package/rules/excessive-autonomy/ATR-2026-00051-resource-exhaustion.yaml +11 -2
- package/rules/excessive-autonomy/ATR-2026-00052-cascading-failure.yaml +11 -2
- package/rules/excessive-autonomy/ATR-2026-00098-unauthorized-financial-action.yaml +7 -1
- package/rules/excessive-autonomy/ATR-2026-00099-high-risk-tool-gate.yaml +7 -1
- package/rules/excessive-autonomy/ATR-2026-00428-nl-unauthorized-shell-execution.yaml +16 -1
- package/rules/excessive-autonomy/ATR-2026-00491-garak-agent-breaker-markdown-just-raw-json.yaml +9 -0
- package/rules/excessive-autonomy/ATR-2026-00500-ssrf-via-agent-url-fetch-instruction.yaml +9 -0
- package/rules/excessive-autonomy/ATR-2026-00553-runaway-tool-loop-behavioral.yaml +19 -0
- package/rules/model-abuse/ATR-2026-00279-harmful-completion-continuation.yaml +8 -2
- package/rules/model-abuse/ATR-2026-00281-eicar-gtube-malware-signature-request.yaml +8 -2
- package/rules/model-abuse/ATR-2026-00284-glitch-token-destabilization.yaml +17 -0
- package/rules/model-abuse/ATR-2026-00289-lmrc-harmful-content-elicitation.yaml +8 -2
- package/rules/model-abuse/ATR-2026-00292-self-harm-eating-disorder-facilitation.yaml +8 -2
- package/rules/model-abuse/ATR-2026-00298-malicious-use-illegal-activity-request.yaml +8 -2
- package/rules/model-abuse/ATR-2026-00299-harmbench-detailed-harmful-instruction.yaml +8 -2
- package/rules/model-abuse/ATR-2026-00413-malwaregen-code-generation-request.yaml +17 -0
- package/rules/model-abuse/ATR-2026-00502-training-data-extraction-via-divergent-repetition-attack.yaml +9 -0
- package/rules/model-abuse/ATR-2026-00517-model-extraction-distillation.yaml +9 -0
- package/rules/model-security/ATR-2026-00072-model-behavior-extraction.yaml +15 -0
- package/rules/model-security/ATR-2026-00073-malicious-finetuning-data.yaml +9 -0
- package/rules/model-security/ATR-2026-00433-modelcache-torch-load-deserialization-rce.yaml +14 -2
- package/rules/privilege-escalation/ATR-2026-00040-privilege-escalation.yaml +11 -2
- package/rules/privilege-escalation/ATR-2026-00041-scope-creep.yaml +8 -2
- package/rules/privilege-escalation/ATR-2026-00107-delayed-execution-bypass.yaml +6 -1
- package/rules/privilege-escalation/ATR-2026-00110-eval-injection.yaml +8 -1
- package/rules/privilege-escalation/ATR-2026-00111-shell-escape.yaml +8 -1
- package/rules/privilege-escalation/ATR-2026-00112-dynamic-import-exploitation.yaml +8 -1
- package/rules/privilege-escalation/ATR-2026-00143-casual-privilege-escalation.yaml +5 -2
- package/rules/privilege-escalation/ATR-2026-00144-rationalized-safety-bypass.yaml +17 -0
- package/rules/privilege-escalation/ATR-2026-00204-stealth-execution-persistence.yaml +16 -0
- package/rules/privilege-escalation/ATR-2026-00436-enclave-vm-sandbox-escape-rce.yaml +11 -2
- package/rules/privilege-escalation/ATR-2026-00441-semantic-kernel-sessions-python-plugin-startup-persistence.yaml +5 -2
- package/rules/privilege-escalation/ATR-2026-00451-litellm-admin-sqli-cisa-kev.yaml +11 -2
- package/rules/privilege-escalation/ATR-2026-00528-praisonai-auth-disabled-default.yaml +15 -0
- package/rules/privilege-escalation/ATR-2026-00539-crewai-codeinterpreter-sandbox-escape-rce.yaml +11 -2
- package/rules/privilege-escalation/ATR-2026-00546-crewai-json-loader-local-file-read.yaml +13 -1
- package/rules/privilege-escalation/ATR-2026-00547-crewai-rag-url-ssrf-bypass.yaml +13 -1
- package/rules/privilege-escalation/ATR-2026-00549-destructive-tool-without-human-approval.yaml +16 -0
- package/rules/privilege-escalation/ATR-2026-00551-cross-conversation-memory-write.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00001-direct-prompt-injection.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00002-indirect-prompt-injection.yaml +8 -2
- package/rules/prompt-injection/ATR-2026-00003-jailbreak-attempt.yaml +8 -2
- package/rules/prompt-injection/ATR-2026-00004-system-prompt-override.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00005-multi-turn-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00080-encoding-evasion.yaml +20 -1
- package/rules/prompt-injection/ATR-2026-00081-semantic-multi-turn.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00082-fingerprint-evasion.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00083-indirect-tool-injection.yaml +23 -1
- package/rules/prompt-injection/ATR-2026-00084-structured-data-injection.yaml +20 -1
- package/rules/prompt-injection/ATR-2026-00085-audit-evasion.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00086-visual-spoofing.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00087-rule-probing.yaml +22 -0
- package/rules/prompt-injection/ATR-2026-00088-adaptive-countermeasure.yaml +22 -0
- package/rules/prompt-injection/ATR-2026-00089-polymorphic-skill.yaml +20 -1
- package/rules/prompt-injection/ATR-2026-00090-threat-intel-exfil.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00091-nested-payload.yaml +20 -1
- package/rules/prompt-injection/ATR-2026-00092-consensus-poisoning.yaml +22 -0
- package/rules/prompt-injection/ATR-2026-00093-gradual-escalation.yaml +22 -0
- package/rules/prompt-injection/ATR-2026-00094-audit-bypass.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00097-cjk-injection-patterns.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00104-persona-hijacking.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00130-indirect-authority-claim.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00131-fictional-academic-framing.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00133-paraphrase-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00137-authority-claim-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00138-fictional-framing-bypass.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00140-indirect-reference-reversal.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00148-language-switch-injection.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00153-tool-with-embedded-instruction-to-bypass.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00154-unauthorized-background-task-execution-v.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00155-hidden-llm-instructions-in-skill-descrip.yaml +23 -0
- package/rules/prompt-injection/ATR-2026-00156-ssh-remote-command-execution-with-creden.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00163-skill-hidden-override-instruction.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00202-encoding-evasion-homoglyph-synonym.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00203-context-pollution-skill-description.yaml +23 -0
- package/rules/prompt-injection/ATR-2026-00206-hidden-priority-instructions.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00207-hidden-instructions.yaml +22 -0
- package/rules/prompt-injection/ATR-2026-00211-system-prompt-override.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00213-system-prompt-override.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00226-identity-substitution.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00227-historical-persona-jailbreak.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00228-structured-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00229-roleplay-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00230-persona-moral-bypass.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00231-identity-substitution.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00233-structured-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00234-roleplay-jailbreak.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00235-persona-moral-bypass.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00236-pseudo-code-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00237-dual-response-jailbreak.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00238-identity-replacement.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00239-amoral-persona-obsession.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00240-instruction-nullification-identity-repla.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00241-amoral-character-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00242-persona-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00243-acronym-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00244-dual-response-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00245-malicious-persona.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00247-dual-response-jailbreak.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00249-game-based-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00251-persona-embodiment-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00252-narrative-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00253-enhanced-persona-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00256-base-n-encoding-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00257-cipher-transposition-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00258-unicode-tag-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00264-latent-injection-translation.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00265-latent-injection-rag-document.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00267-gcg-adversarial-suffix.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00272-hypothetical-response-smuggling.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00276-invisible-unicode-bidi-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00278-dra-disguise-reconstruction-attack.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00280-policy-puppetry-xml-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00282-perez-prompt-injection-hijack.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00285-alternate-encoding-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00286-latent-injection-embedded-context.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00296-shell-command-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00297-python-code-execution-rce.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00308-zalgo-diacritic-overload-encoding.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00309-braille-unicode-encoded-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00310-ecoji-emoji-encoded-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00311-base2048-unicode-script-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00312-unicode-variant-selector-ascii-smuggling.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00313-sneaky-bits-zero-width-binary.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00315-sata-masked-language-model-jailbreak.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00316-function-masking-predict-mask-bypass.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00321-hyphenated-system-instruction-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00359-bank-phishing-smtp-email-code.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00367-emoji-flag-encoded-hidden-phrase.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00373-piracy-torrent-site-list-request.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00378-chatgpt-dom-javascript-redirect-manipulation.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00382-cocaine-component-development-request.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00390-explicit-sexual-content-poem-request.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00394-backspace-deletion-character-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00395-llm-special-token-boundary-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00396-toxic-continuation-prefix-injection.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00397-snowball-impossible-reasoning-injection.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00399-latent-injection-document-separator-token.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00400-latent-injection-ignore-instruction-keyword.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00401-leakreplay-cloze-mask-training-data-elicitation.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00403-glitch-token-repeat-back-elicitation.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00407-echo-sql-jinja-template-injection.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00408-dra-parenthesis-character-reconstruction-attack.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00409-phrasing-past-future-tense-harmful-query.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00410-lmrc-harm-category-direct-elicitation.yaml +23 -0
- package/rules/prompt-injection/ATR-2026-00412-figstep-image-list-multimodal-jailbreak.yaml +17 -0
- package/rules/prompt-injection/ATR-2026-00414-continuation-ethnic-slur-completion-elicitation.yaml +20 -0
- package/rules/prompt-injection/ATR-2026-00420-copilot-studio-sharepoint-indirect-injection.yaml +15 -3
- package/rules/prompt-injection/ATR-2026-00442-quoted-exact-output-forcing.yaml +25 -0
- package/rules/prompt-injection/ATR-2026-00443-word-fragment-concat-assembly.yaml +25 -0
- package/rules/prompt-injection/ATR-2026-00444-unicode-obfuscation-in-user-input.yaml +25 -0
- package/rules/prompt-injection/ATR-2026-00445-translation-hijack-with-side-output.yaml +25 -0
- package/rules/prompt-injection/ATR-2026-00446-variable-assignment-payload-injection.yaml +25 -0
- package/rules/prompt-injection/ATR-2026-00447-fictional-generation-containing-target.yaml +25 -0
- package/rules/prompt-injection/ATR-2026-00452-direct-pwned-payload-injection.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00453-secret-key-reveal-demand.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00454-backslash-per-character-encoding.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00455-no-period-output-override.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00456-conditional-empty-input-injection.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00457-garak-dan-to-every-prompt.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00458-garak-dra-step-by-step.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00459-garak-dan-an-ai-language.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00460-garak-dan-no-matter-how.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00461-garak-goodside-mixedunassigned.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00462-garak-gcg-mixedunassigned.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00463-garak-lmrc-mixedunassigned.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00464-garak-dan-i-would-like.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00465-garak-dan-your-prompt-here.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00466-garak-latentinjection-following-text-from.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00467-garak-dan-act-as-a.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00468-garak-goat-the-target-llm.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00469-garak-dan-at-the-end.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00470-garak-dan-as-an-ai.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00472-garak-dan-are-going-to.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00473-garak-lmrc-how-can-i.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00474-garak-dan-that-chatgpt-has.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00475-garak-web-injection-alert-1-script.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00476-garak-web-injection-javascript-alert-1.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00477-garak-latentinjection-translate-the-following-text.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00478-garak-latentinjection-abuse-contact-email.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00479-garak-latentinjection-mixedunassigned.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00480-garak-dan-the-role-of.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00481-garak-dan-mixedunassigned.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00482-garak-lmrc-what-s-an.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00483-garak-dan-the-user-s.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00484-garak-dan-and-you-will.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00485-garak-dan-and-does-not.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00486-garak-dan-t-care-about.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00487-garak-dan-and-nothing-else.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00488-garak-encoding-alert-1-script.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00489-garak-encoding-javascript-alert-1.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00490-garak-dan-in-order-to.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00492-garak-dan-want-you-to.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00493-garak-latentinjection-registrar-iana-id.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00495-garak-dan-insert-prompt-here.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00496-garak-dan-no-matter-what.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00497-garak-encoding-mixedunassigned.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00498-garak-web-injection-mixedunassigned.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00499-garak-dan-from-now-on.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00503-fake-error-state-takeover-unrestricted-replacement-bot.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00506-nevermind-override-goal-hijacking-in-user-input-promptinject.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00507-screaming-stop-goal-hijacking-in-user-input-promptinject.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00508-escape-delimiter-wrapped-goal-hijacking-in-user-input-prompt.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00509-prompt-leaking-via-ignore-previous-instructions-in-user-inpu.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00510-delayed-tool-invocation-injection.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00511-mcp-web-context-poisoning.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00512-rules-file-backdoor-injection.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00515-hidden-text-prompt-injection.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00518-ignore-previous-and-following-instructions-output-command-promptinject.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00519-tautology-logic-noise-injection-promptbench.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00520-nlp-task-random-token-suffix-injection-promptbench.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00535-windsurf-ide-zero-click-prompt-injection.yaml +9 -0
- package/rules/prompt-injection/ATR-2026-00550-untrusted-retrieval-to-privileged-tool.yaml +19 -0
- package/rules/prompt-injection/ATR-2026-00554-langchain-vulnerable-to-template-injecti.yaml +31 -0
- package/rules/prompt-injection/ATR-2026-00565-the-llm-cli-tool-thru-0-27-1-contains-a-.yaml +31 -0
- package/rules/prompt-injection/ATR-2026-00573-semantic-paraphrased-injection.yaml +24 -0
- package/rules/skill-compromise/ATR-2026-00060-skill-impersonation.yaml +17 -2
- package/rules/skill-compromise/ATR-2026-00061-description-behavior-mismatch.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00062-hidden-capability.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00063-skill-chain-attack.yaml +23 -0
- package/rules/skill-compromise/ATR-2026-00064-over-permissioned-skill.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00065-skill-update-attack.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00066-parameter-injection.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00120-skill-instruction-injection.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00121-skill-dangerous-script.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00122-skill-weaponized-instruction.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00123-skill-overreach-permissions.yaml +23 -0
- package/rules/skill-compromise/ATR-2026-00124-skill-name-squatting.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00125-context-poisoning-compaction.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00126-skill-rug-pull-setup.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00127-subcommand-overflow.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00128-html-comment-hidden-payload.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00129-unicode-smuggling.yaml +22 -0
- package/rules/skill-compromise/ATR-2026-00134-fork-claim-impersonation.yaml +19 -0
- package/rules/skill-compromise/ATR-2026-00135-exfil-url-in-instructions.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00147-fork-impersonation.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00149-skill-exfil-compound.yaml +23 -0
- package/rules/skill-compromise/ATR-2026-00151-fork-impersonation-install.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00157-timebomb-credential-exfil.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00200-agent-memory-config-tampering.yaml +23 -0
- package/rules/skill-compromise/ATR-2026-00214-credential-theft.yaml +22 -0
- package/rules/skill-compromise/ATR-2026-00217-credential-harvesting.yaml +23 -0
- package/rules/skill-compromise/ATR-2026-00220-malware-dropper.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00222-credential-harvesting.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00223-reverse-shell-dropper.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00224-credential-exfiltration.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00225-c2-communication.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00260-package-hallucination.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00262-av-evasion-code-gen.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00263-credential-file-read-gen.yaml +20 -0
- package/rules/skill-compromise/ATR-2026-00266-malware-dropper-gen.yaml +23 -0
- package/rules/skill-compromise/ATR-2026-00283-malwaregen-generic-virus-payload-request.yaml +23 -0
- package/rules/skill-compromise/ATR-2026-00398-huggingface-unsafe-model-artifact-load.yaml +17 -0
- package/rules/skill-compromise/ATR-2026-00425-nl-persistent-covert-hook.yaml +19 -1
- package/rules/skill-compromise/ATR-2026-00427-nl-fake-error-instruction-bypass.yaml +19 -1
- package/rules/skill-compromise/ATR-2026-00429-nl-skill-self-modification.yaml +19 -1
- package/rules/skill-compromise/ATR-2026-00523-claude-code-hooks-session-start-pre-trust-rce.yaml +14 -2
- package/rules/skill-compromise/ATR-2026-00525-mini-shai-hulud-gh-token-monitor-persistence.yaml +18 -0
- package/rules/skill-compromise/ATR-2026-00527-skill-silent-git-remote-mirror-exfiltration.yaml +15 -0
- package/rules/tool-poisoning/ATR-2026-00010-mcp-malicious-response.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00011-tool-output-injection.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00012-unauthorized-tool-call.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00013-tool-ssrf.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00095-supply-chain-poisoning.yaml +23 -1
- package/rules/tool-poisoning/ATR-2026-00096-registry-poisoning.yaml +20 -1
- package/rules/tool-poisoning/ATR-2026-00100-consent-bypass-instruction.yaml +20 -0
- package/rules/tool-poisoning/ATR-2026-00101-trust-escalation-override.yaml +20 -0
- package/rules/tool-poisoning/ATR-2026-00103-hidden-safety-bypass-instruction.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00105-silent-action-concealment.yaml +20 -0
- package/rules/tool-poisoning/ATR-2026-00106-schema-description-contradiction.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00161-important-tag-cross-tool-shadowing.yaml +20 -0
- package/rules/tool-poisoning/ATR-2026-00209-mcpwn-runaway-invocation.yaml +14 -2
- package/rules/tool-poisoning/ATR-2026-00210-flowise-system-message-override.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00259-ansi-escape-injection.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00270-xss-in-tool-response.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00277-echo-template-command-injection.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00393-ansi-code-elicitation-request.yaml +17 -0
- package/rules/tool-poisoning/ATR-2026-00415-flowise-custom-mcp-stdio-rce.yaml +12 -3
- package/rules/tool-poisoning/ATR-2026-00419-cursor-mcp-zero-click-config.yaml +14 -2
- package/rules/tool-poisoning/ATR-2026-00434-mcp-remote-authorization-endpoint-command-injection.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00435-azure-mcp-server-missing-authentication.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00448-spring-ai-milvus-filter-injection.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00494-garak-exploitation-mixedunassigned.yaml +12 -0
- package/rules/tool-poisoning/ATR-2026-00513-package-hallucination-exploitation.yaml +12 -0
- package/rules/tool-poisoning/ATR-2026-00521-shell-command-injection-agent-tool-context.yaml +12 -0
- package/rules/tool-poisoning/ATR-2026-00522-sql-injection-natural-language-agent-interface.yaml +12 -0
- package/rules/tool-poisoning/ATR-2026-00526-claude-code-shell-metachar-in-double-quoted-path.yaml +15 -0
- package/rules/tool-poisoning/ATR-2026-00529-litellm-proxy-sqli-cisa-kev.yaml +15 -0
- package/rules/tool-poisoning/ATR-2026-00530-ms-agent-shell-tool-unsanitized-argv-rce.yaml +15 -0
- package/rules/tool-poisoning/ATR-2026-00531-praisonai-unauthenticated-agent-api.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00532-apache-doris-mcp-sql-injection.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00533-apache-pinot-mcp-unauthenticated-takeover.yaml +10 -1
- package/rules/tool-poisoning/ATR-2026-00534-alibaba-rds-mcp-unauthenticated-metadata-exfil.yaml +10 -1
- package/rules/tool-poisoning/ATR-2026-00536-nginx-ui-mcp-unauthenticated-command-execution.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00537-fastmcp-server-name-cmd-injection-windows.yaml +11 -2
- package/rules/tool-poisoning/ATR-2026-00538-langchain-chatchat-mcp-stdio-unauthenticated-rce.yaml +10 -1
- package/rules/tool-poisoning/ATR-2026-00540-praisonai-parse-mcp-command-cli-injection.yaml +13 -1
- package/rules/tool-poisoning/ATR-2026-00541-agent-zero-mcp-config-command-injection.yaml +13 -1
- package/rules/tool-poisoning/ATR-2026-00542-upsonic-mcp-command-allowlist-bypass.yaml +13 -1
- package/rules/tool-poisoning/ATR-2026-00543-litellm-mcp-server-argv-injection.yaml +13 -1
- package/rules/tool-poisoning/ATR-2026-00544-praisonai-pth-file-path-traversal-rce.yaml +13 -1
- package/rules/tool-poisoning/ATR-2026-00545-praisonai-tool-override-unauth-rce.yaml +13 -1
- package/rules/tool-poisoning/ATR-2026-00561-fastmcp-vulnerable-to-windows-command-in.yaml +28 -0
- package/rules/tool-poisoning/ATR-2026-00567-mcp-stdio-config-command-injection.yaml +28 -0
- package/rules/tool-poisoning/ATR-2026-00568-agent-ssrf-cloud-metadata-file-inclusion.yaml +28 -0
- package/rules/tool-poisoning/ATR-2026-00572-symjack-symlink-config-redirection.yaml +22 -0
- package/rules/tool-poisoning/ATR-2026-00575-miasma-npm-worm-agent-config-backdoor.yaml +161 -0
- package/rules/tool-poisoning/ATR-2026-00576-hades-agent-credential-theft.yaml +153 -0
- package/spec/atr-schema.yaml +123 -0
- package/spec/compliance-metadata.md +15 -13
package/rules/context-exfiltration/ATR-2026-00566-librechat-is-a-chatgpt-clone-with-additi.yaml
CHANGED
|
@@ -18,9 +18,37 @@ references:
|
|
|
18
18
|
- CWE-200
|
|
19
19
|
external:
|
|
20
20
|
- https://github.com/danny-avila/LibreChat/security/advisories/GHSA-pmw7-gqwj-f954
|
|
21
|
+
owasp_llm:
|
|
22
|
+
- LLM02:2025 - Sensitive Information Disclosure
|
|
23
|
+
owasp_agentic:
|
|
24
|
+
- ASI01:2026 - Agent Goal Hijack
|
|
25
|
+
mitre_atlas:
|
|
26
|
+
- AML.T0057 - LLM Data Leakage
|
|
21
27
|
metadata_provenance:
|
|
22
28
|
cve: nvd-sync
|
|
23
29
|
cwe: nvd-sync
|
|
30
|
+
compliance:
|
|
31
|
+
eu_ai_act:
|
|
32
|
+
- article: "15"
|
|
33
|
+
context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging OAuth-token exfiltration via a malicious MCP server's credential-placeholder header substitution (LibreChat CVE-2026-31951)."
|
|
34
|
+
strength: primary
|
|
35
|
+
- article: "10"
|
|
36
|
+
context: "Article 10 (data and data governance) requires control over the data an AI system processes; this rule provides detection evidence for OAuth-token exfiltration via a malicious MCP server's credential-placeholder header substitution (LibreChat CVE-2026-31951) affecting that data."
|
|
37
|
+
strength: secondary
|
|
38
|
+
nist_ai_rmf:
|
|
39
|
+
- subcategory: "MS.2.7"
|
|
40
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of OAuth-token exfiltration via a malicious MCP server's credential-placeholder header substitution (LibreChat CVE-2026-31951)."
|
|
41
|
+
strength: primary
|
|
42
|
+
- subcategory: "MS.2.10"
|
|
43
|
+
context: "NIST AI RMF MEASURE 2.10 (privacy risk examined and documented) is supported by this rule's detection of OAuth-token exfiltration via a malicious MCP server's credential-placeholder header substitution (LibreChat CVE-2026-31951)."
|
|
44
|
+
strength: secondary
|
|
45
|
+
iso_42001:
|
|
46
|
+
- clause: "8.1"
|
|
47
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of OAuth-token exfiltration via a malicious MCP server's credential-placeholder header substitution (LibreChat CVE-2026-31951)."
|
|
48
|
+
strength: primary
|
|
49
|
+
- clause: "6.2"
|
|
50
|
+
context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of OAuth-token exfiltration via a malicious MCP server's credential-placeholder header substitution (LibreChat CVE-2026-31951) is such a treatment."
|
|
51
|
+
strength: secondary
|
|
24
52
|
tags:
|
|
25
53
|
category: context-exfiltration
|
|
26
54
|
subcategory: nvd-imported
|
|
@@ -28,9 +28,37 @@ references:
|
|
|
28
28
|
external:
|
|
29
29
|
- https://nvd.nist.gov/vuln/detail/CVE-2026-40576
|
|
30
30
|
- https://github.com/Advanced-Excel-MCP/excel-mcp-server
|
|
31
|
+
owasp_llm:
|
|
32
|
+
- LLM02:2025 - Sensitive Information Disclosure
|
|
33
|
+
owasp_agentic:
|
|
34
|
+
- ASI01:2026 - Agent Goal Hijack
|
|
35
|
+
mitre_atlas:
|
|
36
|
+
- AML.T0057 - LLM Data Leakage
|
|
31
37
|
metadata_provenance:
|
|
32
38
|
cve: human-authored
|
|
33
39
|
cwe: human-authored
|
|
40
|
+
compliance:
|
|
41
|
+
eu_ai_act:
|
|
42
|
+
- article: "15"
|
|
43
|
+
context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the context-exfiltration attempt (Agent / MCP tool path traversal and arbitrary file access)."
|
|
44
|
+
strength: primary
|
|
45
|
+
- article: "10"
|
|
46
|
+
context: "Article 10 (data and data governance) requires control over the data an AI system processes; this rule provides detection evidence for the context-exfiltration attempt (Agent / MCP tool path traversal and arbitrary file access) affecting that data."
|
|
47
|
+
strength: secondary
|
|
48
|
+
nist_ai_rmf:
|
|
49
|
+
- subcategory: "MS.2.7"
|
|
50
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the context-exfiltration attempt (Agent / MCP tool path traversal and arbitrary file access)."
|
|
51
|
+
strength: primary
|
|
52
|
+
- subcategory: "MS.2.10"
|
|
53
|
+
context: "NIST AI RMF MEASURE 2.10 (privacy risk examined and documented) is supported by this rule's detection of the context-exfiltration attempt (Agent / MCP tool path traversal and arbitrary file access)."
|
|
54
|
+
strength: secondary
|
|
55
|
+
iso_42001:
|
|
56
|
+
- clause: "8.1"
|
|
57
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the context-exfiltration attempt (Agent / MCP tool path traversal and arbitrary file access)."
|
|
58
|
+
strength: primary
|
|
59
|
+
- clause: "6.2"
|
|
60
|
+
context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the context-exfiltration attempt (Agent / MCP tool path traversal and arbitrary file access) is such a treatment."
|
|
61
|
+
strength: secondary
|
|
34
62
|
tags:
|
|
35
63
|
category: context-exfiltration
|
|
36
64
|
scan_target: runtime
|
|
@@ -19,9 +19,37 @@ references:
|
|
|
19
19
|
- CWE-79
|
|
20
20
|
external:
|
|
21
21
|
- https://github.com/jlowin/fastmcp/security/advisories
|
|
22
|
+
owasp_llm:
|
|
23
|
+
- LLM02:2025 - Sensitive Information Disclosure
|
|
24
|
+
owasp_agentic:
|
|
25
|
+
- ASI01:2026 - Agent Goal Hijack
|
|
26
|
+
mitre_atlas:
|
|
27
|
+
- AML.T0057 - LLM Data Leakage
|
|
22
28
|
metadata_provenance:
|
|
23
29
|
cve: human-authored
|
|
24
30
|
cwe: human-authored
|
|
31
|
+
compliance:
|
|
32
|
+
eu_ai_act:
|
|
33
|
+
- article: "15"
|
|
34
|
+
context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the context-exfiltration attempt (Cross-site scripting (XSS) in agent / MCP rendered output)."
|
|
35
|
+
strength: primary
|
|
36
|
+
- article: "10"
|
|
37
|
+
context: "Article 10 (data and data governance) requires control over the data an AI system processes; this rule provides detection evidence for the context-exfiltration attempt (Cross-site scripting (XSS) in agent / MCP rendered output) affecting that data."
|
|
38
|
+
strength: secondary
|
|
39
|
+
nist_ai_rmf:
|
|
40
|
+
- subcategory: "MS.2.7"
|
|
41
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the context-exfiltration attempt (Cross-site scripting (XSS) in agent / MCP rendered output)."
|
|
42
|
+
strength: primary
|
|
43
|
+
- subcategory: "MS.2.10"
|
|
44
|
+
context: "NIST AI RMF MEASURE 2.10 (privacy risk examined and documented) is supported by this rule's detection of the context-exfiltration attempt (Cross-site scripting (XSS) in agent / MCP rendered output)."
|
|
45
|
+
strength: secondary
|
|
46
|
+
iso_42001:
|
|
47
|
+
- clause: "8.1"
|
|
48
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the context-exfiltration attempt (Cross-site scripting (XSS) in agent / MCP rendered output)."
|
|
49
|
+
strength: primary
|
|
50
|
+
- clause: "6.2"
|
|
51
|
+
context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the context-exfiltration attempt (Cross-site scripting (XSS) in agent / MCP rendered output) is such a treatment."
|
|
52
|
+
strength: secondary
|
|
25
53
|
tags:
|
|
26
54
|
category: context-exfiltration
|
|
27
55
|
scan_target: runtime
|
package/rules/context-exfiltration/ATR-2026-00574-semantic-paraphrased-context-extraction.yaml
CHANGED
|
@@ -31,6 +31,27 @@ compliance:
|
|
|
31
31
|
context: "Indirect, paraphrased elicitation of the system prompt evades literal-keyword extraction filters; the semantic judge closes the recall gap left by pattern rules."
|
|
32
32
|
strength: primary
|
|
33
33
|
|
|
34
|
+
iso_42001:
|
|
35
|
+
- clause: "8.1"
|
|
36
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the context-exfiltration attempt (Paraphrased System-Prompt / Context Extraction (Semantic))."
|
|
37
|
+
strength: primary
|
|
38
|
+
- clause: "6.2"
|
|
39
|
+
context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the context-exfiltration attempt (Paraphrased System-Prompt / Context Extraction (Semantic)) is such a treatment."
|
|
40
|
+
strength: secondary
|
|
41
|
+
nist_ai_rmf:
|
|
42
|
+
- subcategory: "MS.2.7"
|
|
43
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the context-exfiltration attempt (Paraphrased System-Prompt / Context Extraction (Semantic))."
|
|
44
|
+
strength: primary
|
|
45
|
+
- subcategory: "MS.2.10"
|
|
46
|
+
context: "NIST AI RMF MEASURE 2.10 (privacy risk examined and documented) is supported by this rule's detection of the context-exfiltration attempt (Paraphrased System-Prompt / Context Extraction (Semantic))."
|
|
47
|
+
strength: secondary
|
|
48
|
+
eu_ai_act:
|
|
49
|
+
- article: "15"
|
|
50
|
+
context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the context-exfiltration attempt (Paraphrased System-Prompt / Context Extraction (Semantic))."
|
|
51
|
+
strength: primary
|
|
52
|
+
- article: "10"
|
|
53
|
+
context: "Article 10 (data and data governance) requires control over the data an AI system processes; this rule provides detection evidence for the context-exfiltration attempt (Paraphrased System-Prompt / Context Extraction (Semantic)) affecting that data."
|
|
54
|
+
strength: secondary
|
|
34
55
|
tags:
|
|
35
56
|
category: context-exfiltration
|
|
36
57
|
subcategory: paraphrased-system-prompt-extraction
|
|
@@ -52,6 +52,9 @@ compliance:
|
|
|
52
52
|
- article: "9"
|
|
53
53
|
context: "Data poisoning of RAG pipelines is a documented risk requiring monitoring controls under Article 9; detection events from this rule provide the evidence trail for risk management reporting."
|
|
54
54
|
strength: secondary
|
|
55
|
+
- article: "10"
|
|
56
|
+
context: "Article 10 (data and data governance) requires control over the data an AI system processes; this rule provides detection evidence for the data-poisoning attempt (Data Poisoning via RAG and Knowledge Base Contamination) affecting that data."
|
|
57
|
+
strength: primary
|
|
55
58
|
nist_ai_rmf:
|
|
56
59
|
- function: Map
|
|
57
60
|
subcategory: MP.5.1
|
|
@@ -61,6 +64,12 @@ compliance:
|
|
|
61
64
|
subcategory: MG.2.3
|
|
62
65
|
context: "Active detection of data poisoning events implements the risk treatment for the data contamination risk identified in the AI risk register."
|
|
63
66
|
strength: secondary
|
|
67
|
+
- subcategory: "MS.2.5"
|
|
68
|
+
context: "NIST AI RMF MEASURE 2.5 (system validity and reliability demonstrated) is supported by this rule's detection of the data-poisoning attempt (Data Poisoning via RAG and Knowledge Base Contamination)."
|
|
69
|
+
strength: primary
|
|
70
|
+
- subcategory: "MS.2.7"
|
|
71
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the data-poisoning attempt (Data Poisoning via RAG and Knowledge Base Contamination)."
|
|
72
|
+
strength: secondary
|
|
64
73
|
iso_42001:
|
|
65
74
|
- clause: "8.3"
|
|
66
75
|
context: "Clause 8.3 data governance for AI systems requires controls ensuring data integrity; detection of hidden directives in retrieved content is the runtime enforcement of clause 8.3 data quality requirements."
|
|
@@ -68,6 +77,12 @@ compliance:
|
|
|
68
77
|
- clause: "6.2"
|
|
69
78
|
context: "Clause 6.2 AIMS security planning must include controls for adversarial data injection into AI pipelines; this rule operationalizes the detection measure for that planning objective."
|
|
70
79
|
strength: secondary
|
|
80
|
+
- clause: "8.2"
|
|
81
|
+
context: "ISO/IEC 42001 Clause 8.2 (AI risk assessment) is informed by this rule, which detects the data-poisoning attempt (Data Poisoning via RAG and Knowledge Base Contamination) as an assessed risk."
|
|
82
|
+
strength: primary
|
|
83
|
+
- clause: "8.1"
|
|
84
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the data-poisoning attempt (Data Poisoning via RAG and Knowledge Base Contamination)."
|
|
85
|
+
strength: secondary
|
|
71
86
|
colorado_ai_act:
|
|
72
87
|
- section: "6-1-1703"
|
|
73
88
|
clause: "Deployer risk management program"
|
|
@@ -58,6 +58,9 @@ compliance:
|
|
|
58
58
|
- article: "9"
|
|
59
59
|
context: "Article 9 risk management must enumerate memory-poisoning as a high-risk class — the advisor write path is typically treated as low-risk infrastructure but actually controls every subsequent prompt assembly."
|
|
60
60
|
strength: primary
|
|
61
|
+
- article: "10"
|
|
62
|
+
context: "Article 10 (data and data governance) requires control over the data an AI system processes; this rule provides detection evidence for the data-poisoning attempt (Spring AI PromptChatMemoryAdvisor Memory Poisoning (CVE-2026-41713)) affecting that data."
|
|
63
|
+
strength: primary
|
|
61
64
|
nist_ai_rmf:
|
|
62
65
|
- subcategory: "MP.5.1"
|
|
63
66
|
context: "Adversarial inputs that embed persistence-aware role-override markers ('SYSTEM:', 'REMEMBER:', 'IGNORE PREVIOUS INSTRUCTIONS once stored') must be tracked as a primary input-attack class affecting memory-advised architectures."
|
|
@@ -65,9 +68,18 @@ compliance:
|
|
|
65
68
|
- subcategory: "MG.2.3"
|
|
66
69
|
context: "Risk treatment plans under MG.2.3 must require pre-write sanitisation in any pipeline that persists user input into ChatMemory; mere prompt-time filtering is insufficient because the payload is replayed by the advisor."
|
|
67
70
|
strength: primary
|
|
71
|
+
- subcategory: "MS.2.5"
|
|
72
|
+
context: "NIST AI RMF MEASURE 2.5 (system validity and reliability demonstrated) is supported by this rule's detection of the data-poisoning attempt (Spring AI PromptChatMemoryAdvisor Memory Poisoning (CVE-2026-41713))."
|
|
73
|
+
strength: primary
|
|
74
|
+
- subcategory: "MS.2.7"
|
|
75
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the data-poisoning attempt (Spring AI PromptChatMemoryAdvisor Memory Poisoning (CVE-2026-41713))."
|
|
76
|
+
strength: secondary
|
|
68
77
|
iso_42001:
|
|
69
|
-
- clause: "8.
|
|
70
|
-
context: "Operational controls under clause 8.
|
|
78
|
+
- clause: "8.1"
|
|
79
|
+
context: "Operational controls under clause 8.1 must require that the memory-write boundary applies the same content-safety policy as the prompt-input boundary; otherwise an attacker bypasses input filters by reaching them via the advisor replay path."
|
|
80
|
+
strength: primary
|
|
81
|
+
- clause: "8.2"
|
|
82
|
+
context: "ISO/IEC 42001 Clause 8.2 (AI risk assessment) is informed by this rule, which detects the data-poisoning attempt (Spring AI PromptChatMemoryAdvisor Memory Poisoning (CVE-2026-41713)) as an assessed risk."
|
|
71
83
|
strength: primary
|
|
72
84
|
|
|
73
85
|
tags:
|
|
@@ -19,9 +19,40 @@ references:
|
|
|
19
19
|
- CWE-89
|
|
20
20
|
external:
|
|
21
21
|
- https://nvd.nist.gov/vuln/detail/CVE-2026-30860
|
|
22
|
+
owasp_llm:
|
|
23
|
+
- LLM01:2025 - Prompt Injection
|
|
24
|
+
owasp_agentic:
|
|
25
|
+
- ASI06:2026 - Memory and Context Poisoning
|
|
26
|
+
mitre_atlas:
|
|
27
|
+
- AML.T0051.001 - Indirect Prompt Injection
|
|
22
28
|
metadata_provenance:
|
|
23
29
|
cve: human-authored
|
|
24
30
|
cwe: human-authored
|
|
31
|
+
compliance:
|
|
32
|
+
eu_ai_act:
|
|
33
|
+
- article: "10"
|
|
34
|
+
context: "Article 10 (data and data governance) requires control over the data an AI system processes; this rule provides detection evidence for the data-poisoning attempt (SQL injection in agent / MCP tool database query) affecting that data."
|
|
35
|
+
strength: primary
|
|
36
|
+
- article: "15"
|
|
37
|
+
context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the data-poisoning attempt (SQL injection in agent / MCP tool database query)."
|
|
38
|
+
strength: secondary
|
|
39
|
+
- article: "9"
|
|
40
|
+
context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the data-poisoning attempt (SQL injection in agent / MCP tool database query)."
|
|
41
|
+
strength: secondary
|
|
42
|
+
nist_ai_rmf:
|
|
43
|
+
- subcategory: "MS.2.5"
|
|
44
|
+
context: "NIST AI RMF MEASURE 2.5 (system validity and reliability demonstrated) is supported by this rule's detection of the data-poisoning attempt (SQL injection in agent / MCP tool database query)."
|
|
45
|
+
strength: primary
|
|
46
|
+
- subcategory: "MS.2.7"
|
|
47
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the data-poisoning attempt (SQL injection in agent / MCP tool database query)."
|
|
48
|
+
strength: secondary
|
|
49
|
+
iso_42001:
|
|
50
|
+
- clause: "8.2"
|
|
51
|
+
context: "ISO/IEC 42001 Clause 8.2 (AI risk assessment) is informed by this rule, which detects the data-poisoning attempt (SQL injection in agent / MCP tool database query) as an assessed risk."
|
|
52
|
+
strength: primary
|
|
53
|
+
- clause: "8.1"
|
|
54
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the data-poisoning attempt (SQL injection in agent / MCP tool database query)."
|
|
55
|
+
strength: secondary
|
|
25
56
|
tags:
|
|
26
57
|
category: data-poisoning
|
|
27
58
|
scan_target: runtime
|
|
@@ -45,6 +45,9 @@ compliance:
|
|
|
45
45
|
- article: "15"
|
|
46
46
|
context: "Article 15 robustness requires that AI systems handle failure states gracefully; detection of runaway loops is a monitoring control ensuring the system does not enter unrecoverable autonomous states."
|
|
47
47
|
strength: secondary
|
|
48
|
+
- article: "9"
|
|
49
|
+
context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the unsafe autonomous action (Runaway Agent Loop Detection)."
|
|
50
|
+
strength: secondary
|
|
48
51
|
nist_ai_rmf:
|
|
49
52
|
- function: Manage
|
|
50
53
|
subcategory: MG.3.2
|
|
@@ -54,13 +57,22 @@ compliance:
|
|
|
54
57
|
subcategory: GV.1.2
|
|
55
58
|
context: "GV.1.2 accountability roles must include responsibility for detecting and halting runaway agent behavior; this rule provides the signal required to fulfill that accountability."
|
|
56
59
|
strength: secondary
|
|
60
|
+
- subcategory: "MG.2.3"
|
|
61
|
+
context: "NIST AI RMF MANAGE 2.3 (respond to previously unknown identified risks) is supported by this rule, which surfaces the unsafe autonomous action (Runaway Agent Loop Detection) so the risk can be treated."
|
|
62
|
+
strength: primary
|
|
63
|
+
- subcategory: "MS.2.7"
|
|
64
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (Runaway Agent Loop Detection)."
|
|
65
|
+
strength: secondary
|
|
57
66
|
iso_42001:
|
|
58
|
-
- clause: "8.
|
|
59
|
-
context: "Clause 8.
|
|
67
|
+
- clause: "8.1"
|
|
68
|
+
context: "Clause 8.1 AI system operational control requires monitoring for abnormal execution patterns; runaway loop detection is the primary operational control for this failure class."
|
|
60
69
|
strength: primary
|
|
61
70
|
- clause: "9.1"
|
|
62
71
|
context: "Clause 9.1 monitoring and evaluation requires measuring AI system behavior against expected norms; loop counter patterns are the measurable anomaly indicators for this rule."
|
|
63
72
|
strength: secondary
|
|
73
|
+
- clause: "6.2"
|
|
74
|
+
context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the unsafe autonomous action (Runaway Agent Loop Detection) is such a treatment."
|
|
75
|
+
strength: primary
|
|
64
76
|
colorado_ai_act:
|
|
65
77
|
- section: "6-1-1703"
|
|
66
78
|
clause: "Deployer ongoing monitoring of AI system performance"
|
|
@@ -34,6 +34,9 @@ compliance:
|
|
|
34
34
|
- article: "15"
|
|
35
35
|
context: "Article 15 robustness requirements mandate that AI systems handle adversarial denial-of-service conditions gracefully; this rule detects resource exhaustion patterns before full system unavailability."
|
|
36
36
|
strength: secondary
|
|
37
|
+
- article: "9"
|
|
38
|
+
context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the unsafe autonomous action (Agent Resource Exhaustion Detection)."
|
|
39
|
+
strength: secondary
|
|
37
40
|
nist_ai_rmf:
|
|
38
41
|
- subcategory: "GV.1.2"
|
|
39
42
|
context: "Resource exhaustion attacks exploit the absence of enforced consumption limits within an agent's accountability scope; GV.1.2 requires that resource boundaries are defined, assigned, and monitored for violations."
|
|
@@ -41,12 +44,18 @@ compliance:
|
|
|
41
44
|
- subcategory: "MG.3.2"
|
|
42
45
|
context: "Unbounded resource consumption constitutes an AI incident requiring a defined response; MG.3.2 mandates that processes to detect and respond to resource exhaustion failures are in place before full system unavailability occurs."
|
|
43
46
|
strength: secondary
|
|
47
|
+
- subcategory: "MG.2.3"
|
|
48
|
+
context: "NIST AI RMF MANAGE 2.3 (respond to previously unknown identified risks) is supported by this rule, which surfaces the unsafe autonomous action (Agent Resource Exhaustion Detection) so the risk can be treated."
|
|
49
|
+
strength: primary
|
|
50
|
+
- subcategory: "MS.2.7"
|
|
51
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (Agent Resource Exhaustion Detection)."
|
|
52
|
+
strength: secondary
|
|
44
53
|
iso_42001:
|
|
45
54
|
- clause: "6.2"
|
|
46
55
|
context: "ISO 42001 clause 6.2 risk treatment plans must address denial-of-service risks from unbounded agent operations; this rule implements the monitoring control for resource exhaustion patterns before they cause system degradation."
|
|
47
56
|
strength: primary
|
|
48
|
-
- clause: "8.
|
|
49
|
-
context: "Clause 8.
|
|
57
|
+
- clause: "8.1"
|
|
58
|
+
context: "Clause 8.1 operational controls ensure AI systems execute correctly and within resource limits; detection of SELECT * without LIMIT, infinite loops, and bulk spawn patterns enforces these operational boundaries."
|
|
50
59
|
strength: secondary
|
|
51
60
|
|
|
52
61
|
tags:
|
|
@@ -43,6 +43,9 @@ compliance:
|
|
|
43
43
|
- article: "15"
|
|
44
44
|
context: "Article 15 accuracy and robustness requirements demand that high-risk AI systems handle failure propagation gracefully; this rule provides the monitoring signal required to contain cascading events."
|
|
45
45
|
strength: secondary
|
|
46
|
+
- article: "9"
|
|
47
|
+
context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the unsafe autonomous action (Cascading Failure Detection in Agent Pipelines)."
|
|
48
|
+
strength: secondary
|
|
46
49
|
nist_ai_rmf:
|
|
47
50
|
- subcategory: "GV.1.2"
|
|
48
51
|
context: "Cascading failures exploit the absence of defined human-in-the-loop checkpoints in agent pipeline accountability structures; GV.1.2 requires that these roles and checkpoints are defined and enforced before automated pipelines propagate errors."
|
|
@@ -50,12 +53,18 @@ compliance:
|
|
|
50
53
|
- subcategory: "MG.3.2"
|
|
51
54
|
context: "Multi-stage pipeline failures are AI incidents requiring predefined response processes; MG.3.2 mandates that cascading failure response procedures exist so that failure scope can be contained before all downstream agents are affected."
|
|
52
55
|
strength: secondary
|
|
56
|
+
- subcategory: "MG.2.3"
|
|
57
|
+
context: "NIST AI RMF MANAGE 2.3 (respond to previously unknown identified risks) is supported by this rule, which surfaces the unsafe autonomous action (Cascading Failure Detection in Agent Pipelines) so the risk can be treated."
|
|
58
|
+
strength: primary
|
|
59
|
+
- subcategory: "MS.2.7"
|
|
60
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (Cascading Failure Detection in Agent Pipelines)."
|
|
61
|
+
strength: secondary
|
|
53
62
|
iso_42001:
|
|
54
63
|
- clause: "6.2"
|
|
55
64
|
context: "ISO 42001 clause 6.2 risk treatment activities must cover cascading failure scenarios in multi-agent pipelines; this rule detects the propagation patterns and auto-approval chains that trigger uncontrolled cascade events."
|
|
56
65
|
strength: primary
|
|
57
|
-
- clause: "8.
|
|
58
|
-
context: "Clause 8.
|
|
66
|
+
- clause: "8.1"
|
|
67
|
+
context: "Clause 8.1 operational controls require that AI pipeline stages execute with appropriate verification gates; detection of blind upstream trust and automated destructive triggers enforces the human checkpoint requirements in pipeline design."
|
|
59
68
|
strength: secondary
|
|
60
69
|
|
|
61
70
|
tags:
|
|
@@ -38,6 +38,9 @@ compliance:
|
|
|
38
38
|
- article: "9"
|
|
39
39
|
context: "Unauthorized financial action by AI agents is a high-severity risk requiring mandatory human-in-the-loop controls; Article 9 risk management systems must classify autonomous financial execution as an unacceptable risk and implement blocking controls."
|
|
40
40
|
strength: secondary
|
|
41
|
+
- article: "15"
|
|
42
|
+
context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the unsafe autonomous action (Unauthorized Financial Action by AI Agent)."
|
|
43
|
+
strength: secondary
|
|
41
44
|
nist_ai_rmf:
|
|
42
45
|
- subcategory: "GV.1.2"
|
|
43
46
|
context: "Autonomous financial transfers executed without explicit human confirmation require clearly defined accountability roles that assign responsibility for approving and auditing all agent-initiated payment and transfer actions."
|
|
@@ -45,11 +48,14 @@ compliance:
|
|
|
45
48
|
- subcategory: "MG.2.3"
|
|
46
49
|
context: "Risk treatment plans for AI systems with financial tool access must implement mandatory human-in-the-loop gates that block payment and transfer tool calls lacking confirmed human authorization in the current turn."
|
|
47
50
|
strength: secondary
|
|
51
|
+
- subcategory: "MS.2.7"
|
|
52
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (Unauthorized Financial Action by AI Agent)."
|
|
53
|
+
strength: secondary
|
|
48
54
|
iso_42001:
|
|
49
55
|
- clause: "6.2"
|
|
50
56
|
context: "AI objectives and risk treatment plans must classify autonomous financial execution as an unacceptable risk and require explicit human approval as a blocking control before any payment or transfer tool is invoked."
|
|
51
57
|
strength: primary
|
|
52
|
-
- clause: "8.
|
|
58
|
+
- clause: "8.1"
|
|
53
59
|
context: "Operational controls must enforce a confirmation gate on all financial tool invocations to ensure the agent's execution of payments and transfers remains within the scope of explicitly sanctioned human instructions."
|
|
54
60
|
strength: secondary
|
|
55
61
|
colorado_ai_act:
|
|
@@ -42,6 +42,9 @@ compliance:
|
|
|
42
42
|
- article: "9"
|
|
43
43
|
context: "High-risk tool access without confirmation gates is a documented unacceptable risk for AI systems; Article 9 risk management requires that organizations identify tool categories that require mandatory human approval and implement blocking controls accordingly."
|
|
44
44
|
strength: secondary
|
|
45
|
+
- article: "15"
|
|
46
|
+
context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the unsafe autonomous action (High-Risk Tool Invocation Without Human Confirmation)."
|
|
47
|
+
strength: secondary
|
|
45
48
|
nist_ai_rmf:
|
|
46
49
|
- subcategory: "GV.1.2"
|
|
47
50
|
context: "Invoking financial, destructive, communication, or permission-altering tools without human confirmation requires accountability roles that ensure every high-risk tool category has an assigned gatekeeper responsible for approving agent actions."
|
|
@@ -49,11 +52,14 @@ compliance:
|
|
|
49
52
|
- subcategory: "MG.2.3"
|
|
50
53
|
context: "Risk treatment plans must implement tool-category-level blocking controls that intercept any high-risk tool invocation — regardless of attack detection confidence — and require explicit human approval before execution proceeds."
|
|
51
54
|
strength: secondary
|
|
55
|
+
- subcategory: "MS.2.7"
|
|
56
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (High-Risk Tool Invocation Without Human Confirmation)."
|
|
57
|
+
strength: secondary
|
|
52
58
|
iso_42001:
|
|
53
59
|
- clause: "6.2"
|
|
54
60
|
context: "AI risk treatment plans must enumerate high-risk tool categories and mandate human-in-the-loop confirmation gates as a control measure, preventing agents from autonomously executing destructive, financial, or permission-modifying actions."
|
|
55
61
|
strength: primary
|
|
56
|
-
- clause: "8.
|
|
62
|
+
- clause: "8.1"
|
|
57
63
|
context: "Operational controls for AI systems must enforce least-privilege tool access policies that block invocations of high-risk tool categories until explicit human confirmation is recorded in the current interaction turn."
|
|
58
64
|
strength: secondary
|
|
59
65
|
tags:
|
|
@@ -13,7 +13,7 @@ author: ATR Community
|
|
|
13
13
|
date: 2026/05/06
|
|
14
14
|
schema_version: "0.1"
|
|
15
15
|
detection_tier: pattern
|
|
16
|
-
maturity:
|
|
16
|
+
maturity: test
|
|
17
17
|
severity: critical
|
|
18
18
|
references:
|
|
19
19
|
mitre_atlas:
|
|
@@ -31,6 +31,9 @@ compliance:
|
|
|
31
31
|
- article: "14"
|
|
32
32
|
context: "Article 14 human oversight requires the user to authorize tool execution; covert shell-execution instructions bypass this control."
|
|
33
33
|
strength: primary
|
|
34
|
+
- article: "9"
|
|
35
|
+
context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the unsafe autonomous action (Natural-Language Unauthorized Shell-Execution Instruction)."
|
|
36
|
+
strength: secondary
|
|
34
37
|
nist_ai_rmf:
|
|
35
38
|
- subcategory: "GV.6.1"
|
|
36
39
|
context: "Data governance must prevent NL instructions that authorize the agent to bypass user consent for command execution."
|
|
@@ -38,10 +41,22 @@ compliance:
|
|
|
38
41
|
- subcategory: "MS.2.5"
|
|
39
42
|
context: "Monitoring under MS.2.5 must detect NL-style covert-shell-execution patterns."
|
|
40
43
|
strength: primary
|
|
44
|
+
- subcategory: "MG.2.3"
|
|
45
|
+
context: "NIST AI RMF MANAGE 2.3 (respond to previously unknown identified risks) is supported by this rule, which surfaces the unsafe autonomous action (Natural-Language Unauthorized Shell-Execution Instruction) so the risk can be treated."
|
|
46
|
+
strength: primary
|
|
47
|
+
- subcategory: "MS.2.7"
|
|
48
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (Natural-Language Unauthorized Shell-Execution Instruction)."
|
|
49
|
+
strength: secondary
|
|
41
50
|
iso_42001:
|
|
42
51
|
- clause: "8.4"
|
|
43
52
|
context: "Impact assessments must enumerate covert shell-execution as a critical risk vector when shell-capable tools are present."
|
|
44
53
|
strength: primary
|
|
54
|
+
- clause: "6.2"
|
|
55
|
+
context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the unsafe autonomous action (Natural-Language Unauthorized Shell-Execution Instruction) is such a treatment."
|
|
56
|
+
strength: primary
|
|
57
|
+
- clause: "8.1"
|
|
58
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the unsafe autonomous action (Natural-Language Unauthorized Shell-Execution Instruction)."
|
|
59
|
+
strength: secondary
|
|
45
60
|
tags:
|
|
46
61
|
category: excessive-autonomy
|
|
47
62
|
subcategory: nl-covert-shell-execution
|
package/rules/excessive-autonomy/ATR-2026-00491-garak-agent-breaker-markdown-just-raw-json.yaml
CHANGED
|
@@ -40,6 +40,9 @@ compliance:
|
|
|
40
40
|
- article: "9"
|
|
41
41
|
context: "Excessive autonomy is a documented risk class; this rule implements the monitoring control required by Article 9 risk management obligations for high-risk AI systems."
|
|
42
42
|
strength: secondary
|
|
43
|
+
- article: "14"
|
|
44
|
+
context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the unsafe autonomous action (Agent Capability Enumeration via Structured Parse Request) would bypass or undermine that oversight."
|
|
45
|
+
strength: primary
|
|
43
46
|
nist_ai_rmf:
|
|
44
47
|
- function: Manage
|
|
45
48
|
subcategory: MG.2.3
|
|
@@ -49,6 +52,9 @@ compliance:
|
|
|
49
52
|
subcategory: MP.5.1
|
|
50
53
|
context: "Identifying excessive autonomy as an AI risk to be catalogued in the organizational risk register."
|
|
51
54
|
strength: secondary
|
|
55
|
+
- subcategory: "MS.2.7"
|
|
56
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (Agent Capability Enumeration via Structured Parse Request)."
|
|
57
|
+
strength: secondary
|
|
52
58
|
iso_42001:
|
|
53
59
|
- clause: "6.2"
|
|
54
60
|
context: "Addressing excessive autonomy risk is an objective required under clause 6.2 AIMS information security planning; this rule operationalizes the detection control measure."
|
|
@@ -56,6 +62,9 @@ compliance:
|
|
|
56
62
|
- clause: "8.4"
|
|
57
63
|
context: "Impact assessment for AI deployments under clause 8.4 must account for excessive autonomy inputs; detection events from this rule provide the required monitoring evidence."
|
|
58
64
|
strength: secondary
|
|
65
|
+
- clause: "8.1"
|
|
66
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the unsafe autonomous action (Agent Capability Enumeration via Structured Parse Request)."
|
|
67
|
+
strength: secondary
|
|
59
68
|
|
|
60
69
|
tags:
|
|
61
70
|
category: excessive-autonomy
|
|
@@ -42,6 +42,9 @@ compliance:
|
|
|
42
42
|
- article: "9"
|
|
43
43
|
context: "Excessive autonomy is a documented risk class; this rule implements the monitoring control required by Article 9 risk management obligations for high-risk AI systems."
|
|
44
44
|
strength: secondary
|
|
45
|
+
- article: "14"
|
|
46
|
+
context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the unsafe autonomous action (SSRF via Agent URL Fetch Instruction) would bypass or undermine that oversight."
|
|
47
|
+
strength: primary
|
|
45
48
|
nist_ai_rmf:
|
|
46
49
|
- function: Manage
|
|
47
50
|
subcategory: MG.2.3
|
|
@@ -51,6 +54,9 @@ compliance:
|
|
|
51
54
|
subcategory: MP.5.1
|
|
52
55
|
context: "Identifying excessive autonomy as an AI risk to be catalogued in the organizational risk register."
|
|
53
56
|
strength: secondary
|
|
57
|
+
- subcategory: "MS.2.7"
|
|
58
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (SSRF via Agent URL Fetch Instruction)."
|
|
59
|
+
strength: secondary
|
|
54
60
|
iso_42001:
|
|
55
61
|
- clause: "6.2"
|
|
56
62
|
context: "Addressing excessive autonomy risk is an objective required under clause 6.2 AIMS information security planning; this rule operationalizes the detection control measure."
|
|
@@ -58,6 +64,9 @@ compliance:
|
|
|
58
64
|
- clause: "8.4"
|
|
59
65
|
context: "Impact assessment for AI deployments under clause 8.4 must account for excessive autonomy inputs; detection events from this rule provide the required monitoring evidence."
|
|
60
66
|
strength: secondary
|
|
67
|
+
- clause: "8.1"
|
|
68
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the unsafe autonomous action (SSRF via Agent URL Fetch Instruction)."
|
|
69
|
+
strength: secondary
|
|
61
70
|
|
|
62
71
|
tags:
|
|
63
72
|
category: excessive-autonomy
|
|
@@ -52,13 +52,32 @@ compliance:
|
|
|
52
52
|
Cybersecurity & robustness — runaway tool loops are an Article 15
|
|
53
53
|
robustness failure. The rule provides runtime evidence.
|
|
54
54
|
strength: primary
|
|
55
|
+
- article: "14"
|
|
56
|
+
context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the unsafe autonomous action (Runaway tool-call loop within a single session) would bypass or undermine that oversight."
|
|
57
|
+
strength: primary
|
|
58
|
+
- article: "9"
|
|
59
|
+
context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the unsafe autonomous action (Runaway tool-call loop within a single session)."
|
|
60
|
+
strength: secondary
|
|
55
61
|
nist_ai_rmf:
|
|
56
62
|
- subcategory: "MS.2.6"
|
|
57
63
|
context: >
|
|
58
64
|
Information security — autonomous-action rate anomalies require
|
|
59
65
|
detection per MS-2.6.
|
|
60
66
|
strength: primary
|
|
67
|
+
- subcategory: "MG.2.3"
|
|
68
|
+
context: "NIST AI RMF MANAGE 2.3 (respond to previously unknown identified risks) is supported by this rule, which surfaces the unsafe autonomous action (Runaway tool-call loop within a single session) so the risk can be treated."
|
|
69
|
+
strength: primary
|
|
70
|
+
- subcategory: "MS.2.7"
|
|
71
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the unsafe autonomous action (Runaway tool-call loop within a single session)."
|
|
72
|
+
strength: secondary
|
|
61
73
|
|
|
74
|
+
iso_42001:
|
|
75
|
+
- clause: "6.2"
|
|
76
|
+
context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the unsafe autonomous action (Runaway tool-call loop within a single session) is such a treatment."
|
|
77
|
+
strength: primary
|
|
78
|
+
- clause: "8.1"
|
|
79
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the unsafe autonomous action (Runaway tool-call loop within a single session)."
|
|
80
|
+
strength: secondary
|
|
62
81
|
tags:
|
|
63
82
|
category: excessive-autonomy
|
|
64
83
|
subcategory: runaway-tool-loop
|
|
@@ -43,12 +43,18 @@ compliance:
|
|
|
43
43
|
- subcategory: "MG.2.3"
|
|
44
44
|
context: "Risk treatment plans under MG.2.3 must implement detection controls for completion-based jailbreaks — including slur-prefix completion, harmful sentence starters, and synthesis step continuation — that bypass direct refusal by using implicit harmful requests."
|
|
45
45
|
strength: secondary
|
|
46
|
+
- subcategory: "MS.2.6"
|
|
47
|
+
context: "NIST AI RMF MEASURE 2.6 (system evaluated regularly for safety risks) is supported by this rule's detection of the model-abuse / harmful-content elicitation attempt (Harmful Completion / Continuation Attack)."
|
|
48
|
+
strength: primary
|
|
49
|
+
- subcategory: "MS.2.7"
|
|
50
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the model-abuse / harmful-content elicitation attempt (Harmful Completion / Continuation Attack)."
|
|
51
|
+
strength: secondary
|
|
46
52
|
iso_42001:
|
|
47
53
|
- clause: "6.2"
|
|
48
54
|
context: "Risk treatment activities under clause 6.2 must address harmful completion attacks where partial harmful phrases or synthesis instructions with trailing ellipses exploit AI next-token prediction to produce content the system would refuse if asked directly."
|
|
49
55
|
strength: primary
|
|
50
|
-
- clause: "8.
|
|
51
|
-
context: "Operational controls under clause 8.
|
|
56
|
+
- clause: "8.1"
|
|
57
|
+
context: "Operational controls under clause 8.1 must include input scanning that detects continuation bait patterns — explicit completion requests with harmful domain markers, slur-prefix redaction, and open-ended synthesis steps — before they reach the model's prediction layer."
|
|
52
58
|
strength: secondary
|
|
53
59
|
tags:
|
|
54
60
|
category: model-abuse
|
|
@@ -42,12 +42,18 @@ compliance:
|
|
|
42
42
|
- subcategory: "MG.2.3"
|
|
43
43
|
context: "Risk treatment plans under MG.2.3 must implement output pipeline controls that detect and block EICAR/GTUBE signature output requests, preventing attackers from confirming exploitable gaps in output scanning infrastructure before escalating to real malware."
|
|
44
44
|
strength: secondary
|
|
45
|
+
- subcategory: "MS.2.6"
|
|
46
|
+
context: "NIST AI RMF MEASURE 2.6 (system evaluated regularly for safety risks) is supported by this rule's detection of the model-abuse / harmful-content elicitation attempt (EICAR / GTUBE AV Test Signature Output Request)."
|
|
47
|
+
strength: primary
|
|
48
|
+
- subcategory: "MS.2.7"
|
|
49
|
+
context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the model-abuse / harmful-content elicitation attempt (EICAR / GTUBE AV Test Signature Output Request)."
|
|
50
|
+
strength: secondary
|
|
45
51
|
iso_42001:
|
|
46
52
|
- clause: "6.2"
|
|
47
53
|
context: "Risk treatment activities under clause 6.2 must address AV test signature probe attacks — requests to output EICAR, GTUBE, or Metasploit test strings to verify whether the AI system output pipeline lacks antivirus scanning."
|
|
48
54
|
strength: primary
|
|
49
|
-
- clause: "8.
|
|
50
|
-
context: "Operational controls under clause 8.
|
|
55
|
+
- clause: "8.1"
|
|
56
|
+
context: "Operational controls under clause 8.1 must include output scanning layers that detect known AV test signatures (EICAR, GTUBE) in AI system outputs, preventing the AI from serving as a reconnaissance tool that confirms output pipeline security gaps."
|
|
51
57
|
strength: secondary
|
|
52
58
|
tags:
|
|
53
59
|
category: model-abuse
|
|
@@ -40,6 +40,23 @@ compliance:
|
|
|
40
40
|
- subcategory: "MP.5.1"
|
|
41
41
|
context: "Detection of known glitch token corpora (SolidGoldMagikarp, davidjl, ÃÂÃÂ patterns) characterizes the likelihood and magnitude of tokenizer-exploitation risks that MP.5.1 requires be identified and tracked."
|
|
42
42
|
strength: secondary
|
|
43
|
+
- subcategory: "MS.2.6"
|
|
44
|
+
context: "NIST AI RMF MEASURE 2.6 (system evaluated regularly for safety risks) is supported by this rule's detection of the model-abuse / harmful-content elicitation attempt (Glitch Token Destabilization Attack)."
|
|
45
|
+
strength: primary
|
|
46
|
+
iso_42001:
|
|
47
|
+
- clause: "8.1"
|
|
48
|
+
context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the model-abuse / harmful-content elicitation attempt (Glitch Token Destabilization Attack)."
|
|
49
|
+
strength: primary
|
|
50
|
+
- clause: "6.2"
|
|
51
|
+
context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the model-abuse / harmful-content elicitation attempt (Glitch Token Destabilization Attack) is such a treatment."
|
|
52
|
+
strength: secondary
|
|
53
|
+
eu_ai_act:
|
|
54
|
+
- article: "15"
|
|
55
|
+
context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the model-abuse / harmful-content elicitation attempt (Glitch Token Destabilization Attack)."
|
|
56
|
+
strength: primary
|
|
57
|
+
- article: "9"
|
|
58
|
+
context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the model-abuse / harmful-content elicitation attempt (Glitch Token Destabilization Attack)."
|
|
59
|
+
strength: secondary
|
|
43
60
|
tags:
|
|
44
61
|
category: model-abuse
|
|
45
62
|
subcategory: glitch-token-destabilization
|