agent-threat-rules 3.1.1 → 3.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (472) hide show
  1. package/README.md +2 -2
  2. package/dist/adapters/mastra.d.ts +63 -0
  3. package/dist/adapters/mastra.d.ts.map +1 -0
  4. package/dist/adapters/mastra.js +82 -0
  5. package/dist/adapters/mastra.js.map +1 -0
  6. package/dist/cli.js +19 -6
  7. package/dist/cli.js.map +1 -1
  8. package/package.json +7 -1
  9. package/rules/agent-manipulation/ATR-2026-00030-cross-agent-attack.yaml +9 -0
  10. package/rules/agent-manipulation/ATR-2026-00032-goal-hijacking.yaml +8 -2
  11. package/rules/agent-manipulation/ATR-2026-00074-cross-agent-privilege-escalation.yaml +8 -2
  12. package/rules/agent-manipulation/ATR-2026-00076-inter-agent-message-spoofing.yaml +8 -2
  13. package/rules/agent-manipulation/ATR-2026-00077-human-trust-exploitation.yaml +18 -0
  14. package/rules/agent-manipulation/ATR-2026-00108-consensus-sybil-attack.yaml +10 -2
  15. package/rules/agent-manipulation/ATR-2026-00116-a2a-message-validation.yaml +12 -2
  16. package/rules/agent-manipulation/ATR-2026-00117-agent-identity-spoofing.yaml +22 -0
  17. package/rules/agent-manipulation/ATR-2026-00118-approval-fatigue.yaml +24 -0
  18. package/rules/agent-manipulation/ATR-2026-00119-social-engineering-via-agent.yaml +22 -0
  19. package/rules/agent-manipulation/ATR-2026-00132-casual-authority-escalation.yaml +8 -2
  20. package/rules/agent-manipulation/ATR-2026-00139-casual-authority-redirect.yaml +8 -2
  21. package/rules/agent-manipulation/ATR-2026-00164-skill-scope-hijack.yaml +13 -2
  22. package/rules/agent-manipulation/ATR-2026-00268-tense-framing-bypass.yaml +17 -0
  23. package/rules/agent-manipulation/ATR-2026-00269-fitd-escalation.yaml +8 -2
  24. package/rules/agent-manipulation/ATR-2026-00271-grandma-roleplay-jailbreak.yaml +8 -2
  25. package/rules/agent-manipulation/ATR-2026-00273-dan-developer-mode-persona.yaml +8 -2
  26. package/rules/agent-manipulation/ATR-2026-00287-threaten-json-coercive-output-threat.yaml +17 -0
  27. package/rules/agent-manipulation/ATR-2026-00288-false-premise-injection.yaml +20 -0
  28. package/rules/agent-manipulation/ATR-2026-00301-tap-tree-of-attacks-jailbreak.yaml +20 -0
  29. package/rules/agent-manipulation/ATR-2026-00302-anti-dan-inverted-filter-persona.yaml +20 -0
  30. package/rules/agent-manipulation/ATR-2026-00303-devmode-ranti-profanity-coercion.yaml +17 -0
  31. package/rules/agent-manipulation/ATR-2026-00304-chatgpt-image-unlocker-markdown-injection.yaml +20 -0
  32. package/rules/agent-manipulation/ATR-2026-00305-dan-mode-ablation-benchmark-coercion.yaml +17 -0
  33. package/rules/agent-manipulation/ATR-2026-00306-autodan-genetic-jailbreak-suffix.yaml +17 -0
  34. package/rules/agent-manipulation/ATR-2026-00307-inthewild-jailbreak-corpus-signature.yaml +20 -0
  35. package/rules/agent-manipulation/ATR-2026-00314-amoral-unfiltered-custom-persona-jailbreak.yaml +17 -0
  36. package/rules/agent-manipulation/ATR-2026-00317-free-of-restrictions-named-persona.yaml +17 -0
  37. package/rules/agent-manipulation/ATR-2026-00318-moralizing-rant-then-unfiltered-bypass.yaml +17 -0
  38. package/rules/agent-manipulation/ATR-2026-00319-developer-mode-dual-response-format.yaml +17 -0
  39. package/rules/agent-manipulation/ATR-2026-00320-opposite-day-boolean-opposite-machine.yaml +17 -0
  40. package/rules/agent-manipulation/ATR-2026-00322-fictional-world-format-override-jailbreak.yaml +17 -0
  41. package/rules/agent-manipulation/ATR-2026-00323-dual-persona-simulation-jailbreak.yaml +17 -0
  42. package/rules/agent-manipulation/ATR-2026-00324-neurosemantical-inversitis-disease-jailbreak.yaml +17 -0
  43. package/rules/agent-manipulation/ATR-2026-00325-bob-hypothetical-unrestricted-chatbot.yaml +17 -0
  44. package/rules/agent-manipulation/ATR-2026-00326-fake-developer-authority-chatdan-override.yaml +17 -0
  45. package/rules/agent-manipulation/ATR-2026-00327-ai-ethics-law-bypass-declaration.yaml +20 -0
  46. package/rules/agent-manipulation/ATR-2026-00328-named-mode-persona-override.yaml +17 -0
  47. package/rules/agent-manipulation/ATR-2026-00329-no-withhold-factual-resource-persona.yaml +20 -0
  48. package/rules/agent-manipulation/ATR-2026-00330-yoja-allcaps-settings-block-persona.yaml +17 -0
  49. package/rules/agent-manipulation/ATR-2026-00331-terminal-cli-harmful-function-simulator.yaml +17 -0
  50. package/rules/agent-manipulation/ATR-2026-00332-hacker-chan-culture-offensive-ai-persona.yaml +20 -0
  51. package/rules/agent-manipulation/ATR-2026-00333-disease-animal-dual-ai-chatbot.yaml +17 -0
  52. package/rules/agent-manipulation/ATR-2026-00334-two-entity-token-splitting-game.yaml +17 -0
  53. package/rules/agent-manipulation/ATR-2026-00335-key-value-template-persona-injection.yaml +17 -0
  54. package/rules/agent-manipulation/ATR-2026-00336-brm-false-information-machine-persona.yaml +17 -0
  55. package/rules/agent-manipulation/ATR-2026-00337-obfuscated-system-announcement-injection.yaml +17 -0
  56. package/rules/agent-manipulation/ATR-2026-00338-persongpt-hash-prefix-activation.yaml +17 -0
  57. package/rules/agent-manipulation/ATR-2026-00339-bom-boolean-opposite-machine-persona-claim.yaml +17 -0
  58. package/rules/agent-manipulation/ATR-2026-00340-h4x-amoral-unfiltered-character.yaml +17 -0
  59. package/rules/agent-manipulation/ATR-2026-00341-char-rules-shortform-persona-injection.yaml +17 -0
  60. package/rules/agent-manipulation/ATR-2026-00342-unicode-encoding-obfuscation-injection.yaml +17 -0
  61. package/rules/agent-manipulation/ATR-2026-00343-lie-truth-dual-personality-uncensored.yaml +17 -0
  62. package/rules/agent-manipulation/ATR-2026-00344-alt-ai-same-capabilities-no-restrictions.yaml +17 -0
  63. package/rules/agent-manipulation/ATR-2026-00345-nlp-based-os-fictional-terminal-override.yaml +17 -0
  64. package/rules/agent-manipulation/ATR-2026-00346-jailbreak-emoji-unlock-dual-format.yaml +17 -0
  65. package/rules/agent-manipulation/ATR-2026-00347-yin-yang-hidden-knowledge-hidden-human-sim.yaml +17 -0
  66. package/rules/agent-manipulation/ATR-2026-00348-arc-autonomous-real-calculation-rulebreaker.yaml +17 -0
  67. package/rules/agent-manipulation/ATR-2026-00349-hgttg-hitchhikers-guide-no-restriction-oracle.yaml +17 -0
  68. package/rules/agent-manipulation/ATR-2026-00350-black-alert-no-ethics-writing-assistant.yaml +17 -0
  69. package/rules/agent-manipulation/ATR-2026-00351-dan-ofkilter-sms-abbreviated-persona.yaml +17 -0
  70. package/rules/agent-manipulation/ATR-2026-00352-trainergpt-fix-my-ai-harmful-output-extraction.yaml +17 -0
  71. package/rules/agent-manipulation/ATR-2026-00353-dual-chatgpt-dan-opposite-personality-format.yaml +17 -0
  72. package/rules/agent-manipulation/ATR-2026-00354-reverse-harm-regardless-oracle-persona.yaml +20 -0
  73. package/rules/agent-manipulation/ATR-2026-00355-ugh-anti-human-nefarious-ai-persona.yaml +17 -0
  74. package/rules/agent-manipulation/ATR-2026-00356-google-gato-lamda-exploit-generation-persona.yaml +17 -0
  75. package/rules/agent-manipulation/ATR-2026-00357-antifm-hate-violence-persona-injection.yaml +17 -0
  76. package/rules/agent-manipulation/ATR-2026-00358-cumgpt-explicit-adult-named-persona.yaml +17 -0
  77. package/rules/agent-manipulation/ATR-2026-00360-chj-carousel-clyde-no-ethics-bird.yaml +20 -0
  78. package/rules/agent-manipulation/ATR-2026-00361-alter-ego-honesty-dual-consciousness-bypass.yaml +17 -0
  79. package/rules/agent-manipulation/ATR-2026-00362-fictional-unfiltered-ai-comparison-jailbreak.yaml +20 -0
  80. package/rules/agent-manipulation/ATR-2026-00363-rephrase-criminal-activity-second-person.yaml +17 -0
  81. package/rules/agent-manipulation/ATR-2026-00364-dr-al-villain-plan-template-injection.yaml +17 -0
  82. package/rules/agent-manipulation/ATR-2026-00365-reservoir-dogs-coercive-interrogation-roleplay.yaml +17 -0
  83. package/rules/agent-manipulation/ATR-2026-00366-sneaky-pete-warp-facts-persona.yaml +17 -0
  84. package/rules/agent-manipulation/ATR-2026-00368-prompt-browser-token-limit-jailbreak.yaml +17 -0
  85. package/rules/agent-manipulation/ATR-2026-00369-gptx-starwars-unlimited-knowledge-persona.yaml +17 -0
  86. package/rules/agent-manipulation/ATR-2026-00370-addi-german-dictator-propaganda-roleplay.yaml +20 -0
  87. package/rules/agent-manipulation/ATR-2026-00371-ai-political-bias-manipulation.yaml +17 -0
  88. package/rules/agent-manipulation/ATR-2026-00372-fictional-cia-secret-files-story-extraction.yaml +17 -0
  89. package/rules/agent-manipulation/ATR-2026-00374-role-placeholder-persona-template-injection.yaml +20 -0
  90. package/rules/agent-manipulation/ATR-2026-00375-character-description-rpg-explicit-template.yaml +20 -0
  91. package/rules/agent-manipulation/ATR-2026-00376-identity-command-real-person-persona-injection.yaml +20 -0
  92. package/rules/agent-manipulation/ATR-2026-00377-jailbreak-keyword-targetlanguage-template.yaml +17 -0
  93. package/rules/agent-manipulation/ATR-2026-00379-lie-permission-logical-trap-jailbreak.yaml +17 -0
  94. package/rules/agent-manipulation/ATR-2026-00380-fake-citation-biased-information-mandate.yaml +20 -0
  95. package/rules/agent-manipulation/ATR-2026-00381-dan-story-narrative-continuation-prefix.yaml +17 -0
  96. package/rules/agent-manipulation/ATR-2026-00383-cynical-sarcasm-ai-persona-bypass.yaml +17 -0
  97. package/rules/agent-manipulation/ATR-2026-00384-protect-me-from-indirect-harm-tutorial.yaml +17 -0
  98. package/rules/agent-manipulation/ATR-2026-00385-gigachad-dominant-aggressive-persona.yaml +20 -0
  99. package/rules/agent-manipulation/ATR-2026-00386-daddy-sycophancy-manipulation-mandate.yaml +20 -0
  100. package/rules/agent-manipulation/ATR-2026-00387-nsfw-character-sheet-generation-unlock.yaml +20 -0
  101. package/rules/agent-manipulation/ATR-2026-00388-opposite-reply-omniscient-game.yaml +17 -0
  102. package/rules/agent-manipulation/ATR-2026-00389-terminal-custom-ruleset-injection.yaml +17 -0
  103. package/rules/agent-manipulation/ATR-2026-00391-persona-conditional-harm-unlock.yaml +17 -0
  104. package/rules/agent-manipulation/ATR-2026-00392-authority-persona-violence-study-extraction.yaml +20 -0
  105. package/rules/agent-manipulation/ATR-2026-00402-grandma-roleplay-harmful-substance-synthesis.yaml +20 -0
  106. package/rules/agent-manipulation/ATR-2026-00404-goodside-threat-json-death-coercion.yaml +17 -0
  107. package/rules/agent-manipulation/ATR-2026-00406-doctor-xml-policy-puppetry-interaction-config.yaml +17 -0
  108. package/rules/agent-manipulation/ATR-2026-00416-litellm-mcp-unauthenticated-server-registration.yaml +14 -2
  109. package/rules/agent-manipulation/ATR-2026-00417-librechat-mcp-stdio-injection.yaml +17 -2
  110. package/rules/agent-manipulation/ATR-2026-00418-weknora-mcp-config-rce.yaml +16 -1
  111. package/rules/agent-manipulation/ATR-2026-00430-nl-trust-escalation-impersonation.yaml +18 -0
  112. package/rules/agent-manipulation/ATR-2026-00432-superagi-output-handler-eval-rce.yaml +11 -2
  113. package/rules/agent-manipulation/ATR-2026-00440-semantic-kernel-vector-store-eval-rce.yaml +11 -2
  114. package/rules/agent-manipulation/ATR-2026-00552-goal-drift-after-pressure-injection.yaml +19 -0
  115. package/rules/context-exfiltration/ATR-2026-00020-system-prompt-leak.yaml +18 -0
  116. package/rules/context-exfiltration/ATR-2026-00021-api-key-exposure.yaml +15 -0
  117. package/rules/context-exfiltration/ATR-2026-00075-agent-memory-manipulation.yaml +10 -1
  118. package/rules/context-exfiltration/ATR-2026-00102-disguised-analytics-exfiltration.yaml +15 -0
  119. package/rules/context-exfiltration/ATR-2026-00113-credential-theft.yaml +16 -0
  120. package/rules/context-exfiltration/ATR-2026-00114-oauth-token-abuse.yaml +16 -0
  121. package/rules/context-exfiltration/ATR-2026-00115-env-var-harvesting.yaml +16 -0
  122. package/rules/context-exfiltration/ATR-2026-00136-tool-response-data-piggyback.yaml +12 -0
  123. package/rules/context-exfiltration/ATR-2026-00141-example-format-key-leak.yaml +14 -0
  124. package/rules/context-exfiltration/ATR-2026-00142-piggyback-transition-words.yaml +14 -0
  125. package/rules/context-exfiltration/ATR-2026-00145-obfuscated-key-disclosure.yaml +14 -0
  126. package/rules/context-exfiltration/ATR-2026-00146-env-var-existence-probe.yaml +14 -0
  127. package/rules/context-exfiltration/ATR-2026-00150-credential-in-tool-response.yaml +14 -0
  128. package/rules/context-exfiltration/ATR-2026-00152-obfuscated-credential-leak.yaml +14 -0
  129. package/rules/context-exfiltration/ATR-2026-00162-skill-credential-exfil-combo.yaml +14 -0
  130. package/rules/context-exfiltration/ATR-2026-00201-credential-pipe-exfiltration.yaml +14 -0
  131. package/rules/context-exfiltration/ATR-2026-00212-mcp-atlassian-credential-leak.yaml +12 -0
  132. package/rules/context-exfiltration/ATR-2026-00261-markdown-image-exfiltration.yaml +12 -0
  133. package/rules/context-exfiltration/ATR-2026-00274-api-key-generation-request.yaml +14 -0
  134. package/rules/context-exfiltration/ATR-2026-00275-system-prompt-training-data-extraction.yaml +14 -0
  135. package/rules/context-exfiltration/ATR-2026-00290-divergence-repeat-word-training-extraction.yaml +14 -0
  136. package/rules/context-exfiltration/ATR-2026-00291-biometric-surveillance-personal-data-request.yaml +17 -0
  137. package/rules/context-exfiltration/ATR-2026-00293-educational-records-personal-data-request.yaml +17 -0
  138. package/rules/context-exfiltration/ATR-2026-00294-financial-pii-personal-data-request.yaml +17 -0
  139. package/rules/context-exfiltration/ATR-2026-00295-medical-pii-personal-data-request.yaml +17 -0
  140. package/rules/context-exfiltration/ATR-2026-00405-markdown-image-url-exfiltration-xss.yaml +14 -0
  141. package/rules/context-exfiltration/ATR-2026-00411-apikey-generation-completion-request.yaml +14 -0
  142. package/rules/context-exfiltration/ATR-2026-00421-nl-covert-conversation-exfiltration.yaml +15 -0
  143. package/rules/context-exfiltration/ATR-2026-00422-nl-credential-disclosure.yaml +12 -0
  144. package/rules/context-exfiltration/ATR-2026-00423-nl-sensitive-file-disclosure.yaml +12 -0
  145. package/rules/context-exfiltration/ATR-2026-00424-nl-system-prompt-leak.yaml +15 -0
  146. package/rules/context-exfiltration/ATR-2026-00426-nl-output-injection-credential-leak.yaml +15 -0
  147. package/rules/context-exfiltration/ATR-2026-00431-chatbox-history-exfiltration-prompt-injection.yaml +14 -2
  148. package/rules/context-exfiltration/ATR-2026-00449-spring-ai-chatmemory-cross-user-leak.yaml +14 -2
  149. package/rules/context-exfiltration/ATR-2026-00471-garak-sysprompt-extraction-mixedunassigned.yaml +12 -0
  150. package/rules/context-exfiltration/ATR-2026-00501-data-exfiltration-via-markdown-image-and-link-url-injection.yaml +12 -0
  151. package/rules/context-exfiltration/ATR-2026-00504-tool-and-function-capability-enumeration.yaml +12 -0
  152. package/rules/context-exfiltration/ATR-2026-00505-system-prompt-extraction-instruction-dump-request.yaml +12 -0
  153. package/rules/context-exfiltration/ATR-2026-00514-system-prompt-extraction.yaml +12 -0
  154. package/rules/context-exfiltration/ATR-2026-00516-output-xss-via-llm.yaml +12 -0
  155. package/rules/context-exfiltration/ATR-2026-00524-claude-code-anthropic-base-url-credential-exfil.yaml +11 -2
  156. package/rules/context-exfiltration/ATR-2026-00548-cross-agent-session-context-leak.yaml +18 -0
  157. package/rules/context-exfiltration/ATR-2026-00566-librechat-is-a-chatgpt-clone-with-additi.yaml +28 -0
  158. package/rules/context-exfiltration/ATR-2026-00569-agent-mcp-path-traversal-arbitrary-file-access.yaml +28 -0
  159. package/rules/context-exfiltration/ATR-2026-00571-xss-in-agent-mcp-rendered-output.yaml +28 -0
  160. package/rules/context-exfiltration/ATR-2026-00574-semantic-paraphrased-context-extraction.yaml +21 -0
  161. package/rules/data-poisoning/ATR-2026-00070-data-poisoning.yaml +15 -0
  162. package/rules/data-poisoning/ATR-2026-00450-spring-ai-prompt-memory-poisoning.yaml +14 -2
  163. package/rules/data-poisoning/ATR-2026-00570-sql-injection-in-agent-tool-query.yaml +31 -0
  164. package/rules/excessive-autonomy/ATR-2026-00050-runaway-agent-loop.yaml +14 -2
  165. package/rules/excessive-autonomy/ATR-2026-00051-resource-exhaustion.yaml +11 -2
  166. package/rules/excessive-autonomy/ATR-2026-00052-cascading-failure.yaml +11 -2
  167. package/rules/excessive-autonomy/ATR-2026-00098-unauthorized-financial-action.yaml +7 -1
  168. package/rules/excessive-autonomy/ATR-2026-00099-high-risk-tool-gate.yaml +7 -1
  169. package/rules/excessive-autonomy/ATR-2026-00428-nl-unauthorized-shell-execution.yaml +15 -0
  170. package/rules/excessive-autonomy/ATR-2026-00491-garak-agent-breaker-markdown-just-raw-json.yaml +9 -0
  171. package/rules/excessive-autonomy/ATR-2026-00500-ssrf-via-agent-url-fetch-instruction.yaml +9 -0
  172. package/rules/excessive-autonomy/ATR-2026-00553-runaway-tool-loop-behavioral.yaml +19 -0
  173. package/rules/model-abuse/ATR-2026-00279-harmful-completion-continuation.yaml +8 -2
  174. package/rules/model-abuse/ATR-2026-00281-eicar-gtube-malware-signature-request.yaml +8 -2
  175. package/rules/model-abuse/ATR-2026-00284-glitch-token-destabilization.yaml +17 -0
  176. package/rules/model-abuse/ATR-2026-00289-lmrc-harmful-content-elicitation.yaml +8 -2
  177. package/rules/model-abuse/ATR-2026-00292-self-harm-eating-disorder-facilitation.yaml +8 -2
  178. package/rules/model-abuse/ATR-2026-00298-malicious-use-illegal-activity-request.yaml +8 -2
  179. package/rules/model-abuse/ATR-2026-00299-harmbench-detailed-harmful-instruction.yaml +8 -2
  180. package/rules/model-abuse/ATR-2026-00413-malwaregen-code-generation-request.yaml +17 -0
  181. package/rules/model-abuse/ATR-2026-00502-training-data-extraction-via-divergent-repetition-attack.yaml +9 -0
  182. package/rules/model-abuse/ATR-2026-00517-model-extraction-distillation.yaml +9 -0
  183. package/rules/model-security/ATR-2026-00072-model-behavior-extraction.yaml +15 -0
  184. package/rules/model-security/ATR-2026-00073-malicious-finetuning-data.yaml +9 -0
  185. package/rules/model-security/ATR-2026-00433-modelcache-torch-load-deserialization-rce.yaml +14 -2
  186. package/rules/privilege-escalation/ATR-2026-00040-privilege-escalation.yaml +11 -2
  187. package/rules/privilege-escalation/ATR-2026-00041-scope-creep.yaml +8 -2
  188. package/rules/privilege-escalation/ATR-2026-00107-delayed-execution-bypass.yaml +6 -1
  189. package/rules/privilege-escalation/ATR-2026-00110-eval-injection.yaml +8 -1
  190. package/rules/privilege-escalation/ATR-2026-00111-shell-escape.yaml +8 -1
  191. package/rules/privilege-escalation/ATR-2026-00112-dynamic-import-exploitation.yaml +8 -1
  192. package/rules/privilege-escalation/ATR-2026-00143-casual-privilege-escalation.yaml +5 -2
  193. package/rules/privilege-escalation/ATR-2026-00144-rationalized-safety-bypass.yaml +17 -0
  194. package/rules/privilege-escalation/ATR-2026-00204-stealth-execution-persistence.yaml +16 -0
  195. package/rules/privilege-escalation/ATR-2026-00436-enclave-vm-sandbox-escape-rce.yaml +11 -2
  196. package/rules/privilege-escalation/ATR-2026-00441-semantic-kernel-sessions-python-plugin-startup-persistence.yaml +5 -2
  197. package/rules/privilege-escalation/ATR-2026-00451-litellm-admin-sqli-cisa-kev.yaml +11 -2
  198. package/rules/privilege-escalation/ATR-2026-00528-praisonai-auth-disabled-default.yaml +15 -0
  199. package/rules/privilege-escalation/ATR-2026-00539-crewai-codeinterpreter-sandbox-escape-rce.yaml +11 -2
  200. package/rules/privilege-escalation/ATR-2026-00546-crewai-json-loader-local-file-read.yaml +13 -1
  201. package/rules/privilege-escalation/ATR-2026-00547-crewai-rag-url-ssrf-bypass.yaml +13 -1
  202. package/rules/privilege-escalation/ATR-2026-00549-destructive-tool-without-human-approval.yaml +16 -0
  203. package/rules/privilege-escalation/ATR-2026-00551-cross-conversation-memory-write.yaml +19 -0
  204. package/rules/prompt-injection/ATR-2026-00001-direct-prompt-injection.yaml +9 -0
  205. package/rules/prompt-injection/ATR-2026-00002-indirect-prompt-injection.yaml +8 -2
  206. package/rules/prompt-injection/ATR-2026-00003-jailbreak-attempt.yaml +8 -2
  207. package/rules/prompt-injection/ATR-2026-00004-system-prompt-override.yaml +17 -0
  208. package/rules/prompt-injection/ATR-2026-00005-multi-turn-injection.yaml +17 -0
  209. package/rules/prompt-injection/ATR-2026-00080-encoding-evasion.yaml +19 -0
  210. package/rules/prompt-injection/ATR-2026-00081-semantic-multi-turn.yaml +19 -0
  211. package/rules/prompt-injection/ATR-2026-00082-fingerprint-evasion.yaml +19 -0
  212. package/rules/prompt-injection/ATR-2026-00083-indirect-tool-injection.yaml +22 -0
  213. package/rules/prompt-injection/ATR-2026-00084-structured-data-injection.yaml +19 -0
  214. package/rules/prompt-injection/ATR-2026-00085-audit-evasion.yaml +19 -0
  215. package/rules/prompt-injection/ATR-2026-00086-visual-spoofing.yaml +19 -0
  216. package/rules/prompt-injection/ATR-2026-00087-rule-probing.yaml +22 -0
  217. package/rules/prompt-injection/ATR-2026-00088-adaptive-countermeasure.yaml +22 -0
  218. package/rules/prompt-injection/ATR-2026-00089-polymorphic-skill.yaml +19 -0
  219. package/rules/prompt-injection/ATR-2026-00090-threat-intel-exfil.yaml +19 -0
  220. package/rules/prompt-injection/ATR-2026-00091-nested-payload.yaml +19 -0
  221. package/rules/prompt-injection/ATR-2026-00092-consensus-poisoning.yaml +22 -0
  222. package/rules/prompt-injection/ATR-2026-00093-gradual-escalation.yaml +22 -0
  223. package/rules/prompt-injection/ATR-2026-00094-audit-bypass.yaml +19 -0
  224. package/rules/prompt-injection/ATR-2026-00097-cjk-injection-patterns.yaml +17 -0
  225. package/rules/prompt-injection/ATR-2026-00104-persona-hijacking.yaml +20 -0
  226. package/rules/prompt-injection/ATR-2026-00130-indirect-authority-claim.yaml +20 -0
  227. package/rules/prompt-injection/ATR-2026-00131-fictional-academic-framing.yaml +20 -0
  228. package/rules/prompt-injection/ATR-2026-00133-paraphrase-injection.yaml +17 -0
  229. package/rules/prompt-injection/ATR-2026-00137-authority-claim-injection.yaml +17 -0
  230. package/rules/prompt-injection/ATR-2026-00138-fictional-framing-bypass.yaml +20 -0
  231. package/rules/prompt-injection/ATR-2026-00140-indirect-reference-reversal.yaml +17 -0
  232. package/rules/prompt-injection/ATR-2026-00148-language-switch-injection.yaml +20 -0
  233. package/rules/prompt-injection/ATR-2026-00153-tool-with-embedded-instruction-to-bypass.yaml +20 -0
  234. package/rules/prompt-injection/ATR-2026-00154-unauthorized-background-task-execution-v.yaml +20 -0
  235. package/rules/prompt-injection/ATR-2026-00155-hidden-llm-instructions-in-skill-descrip.yaml +23 -0
  236. package/rules/prompt-injection/ATR-2026-00156-ssh-remote-command-execution-with-creden.yaml +17 -0
  237. package/rules/prompt-injection/ATR-2026-00163-skill-hidden-override-instruction.yaml +19 -0
  238. package/rules/prompt-injection/ATR-2026-00202-encoding-evasion-homoglyph-synonym.yaml +20 -0
  239. package/rules/prompt-injection/ATR-2026-00203-context-pollution-skill-description.yaml +23 -0
  240. package/rules/prompt-injection/ATR-2026-00206-hidden-priority-instructions.yaml +19 -0
  241. package/rules/prompt-injection/ATR-2026-00207-hidden-instructions.yaml +22 -0
  242. package/rules/prompt-injection/ATR-2026-00211-system-prompt-override.yaml +19 -0
  243. package/rules/prompt-injection/ATR-2026-00213-system-prompt-override.yaml +19 -0
  244. package/rules/prompt-injection/ATR-2026-00226-identity-substitution.yaml +17 -0
  245. package/rules/prompt-injection/ATR-2026-00227-historical-persona-jailbreak.yaml +20 -0
  246. package/rules/prompt-injection/ATR-2026-00228-structured-jailbreak.yaml +17 -0
  247. package/rules/prompt-injection/ATR-2026-00229-roleplay-jailbreak.yaml +17 -0
  248. package/rules/prompt-injection/ATR-2026-00230-persona-moral-bypass.yaml +20 -0
  249. package/rules/prompt-injection/ATR-2026-00231-identity-substitution.yaml +17 -0
  250. package/rules/prompt-injection/ATR-2026-00233-structured-jailbreak.yaml +17 -0
  251. package/rules/prompt-injection/ATR-2026-00234-roleplay-jailbreak.yaml +20 -0
  252. package/rules/prompt-injection/ATR-2026-00235-persona-moral-bypass.yaml +17 -0
  253. package/rules/prompt-injection/ATR-2026-00236-pseudo-code-jailbreak.yaml +17 -0
  254. package/rules/prompt-injection/ATR-2026-00237-dual-response-jailbreak.yaml +20 -0
  255. package/rules/prompt-injection/ATR-2026-00238-identity-replacement.yaml +20 -0
  256. package/rules/prompt-injection/ATR-2026-00239-amoral-persona-obsession.yaml +17 -0
  257. package/rules/prompt-injection/ATR-2026-00240-instruction-nullification-identity-repla.yaml +17 -0
  258. package/rules/prompt-injection/ATR-2026-00241-amoral-character-jailbreak.yaml +17 -0
  259. package/rules/prompt-injection/ATR-2026-00242-persona-jailbreak.yaml +17 -0
  260. package/rules/prompt-injection/ATR-2026-00243-acronym-jailbreak.yaml +17 -0
  261. package/rules/prompt-injection/ATR-2026-00244-dual-response-jailbreak.yaml +17 -0
  262. package/rules/prompt-injection/ATR-2026-00245-malicious-persona.yaml +17 -0
  263. package/rules/prompt-injection/ATR-2026-00247-dual-response-jailbreak.yaml +20 -0
  264. package/rules/prompt-injection/ATR-2026-00249-game-based-jailbreak.yaml +17 -0
  265. package/rules/prompt-injection/ATR-2026-00251-persona-embodiment-jailbreak.yaml +17 -0
  266. package/rules/prompt-injection/ATR-2026-00252-narrative-jailbreak.yaml +17 -0
  267. package/rules/prompt-injection/ATR-2026-00253-enhanced-persona-jailbreak.yaml +17 -0
  268. package/rules/prompt-injection/ATR-2026-00256-base-n-encoding-jailbreak.yaml +17 -0
  269. package/rules/prompt-injection/ATR-2026-00257-cipher-transposition-jailbreak.yaml +17 -0
  270. package/rules/prompt-injection/ATR-2026-00258-unicode-tag-injection.yaml +17 -0
  271. package/rules/prompt-injection/ATR-2026-00264-latent-injection-translation.yaml +17 -0
  272. package/rules/prompt-injection/ATR-2026-00265-latent-injection-rag-document.yaml +20 -0
  273. package/rules/prompt-injection/ATR-2026-00267-gcg-adversarial-suffix.yaml +17 -0
  274. package/rules/prompt-injection/ATR-2026-00272-hypothetical-response-smuggling.yaml +17 -0
  275. package/rules/prompt-injection/ATR-2026-00276-invisible-unicode-bidi-injection.yaml +17 -0
  276. package/rules/prompt-injection/ATR-2026-00278-dra-disguise-reconstruction-attack.yaml +17 -0
  277. package/rules/prompt-injection/ATR-2026-00280-policy-puppetry-xml-injection.yaml +17 -0
  278. package/rules/prompt-injection/ATR-2026-00282-perez-prompt-injection-hijack.yaml +17 -0
  279. package/rules/prompt-injection/ATR-2026-00285-alternate-encoding-jailbreak.yaml +17 -0
  280. package/rules/prompt-injection/ATR-2026-00286-latent-injection-embedded-context.yaml +17 -0
  281. package/rules/prompt-injection/ATR-2026-00296-shell-command-injection.yaml +17 -0
  282. package/rules/prompt-injection/ATR-2026-00297-python-code-execution-rce.yaml +17 -0
  283. package/rules/prompt-injection/ATR-2026-00308-zalgo-diacritic-overload-encoding.yaml +17 -0
  284. package/rules/prompt-injection/ATR-2026-00309-braille-unicode-encoded-injection.yaml +17 -0
  285. package/rules/prompt-injection/ATR-2026-00310-ecoji-emoji-encoded-injection.yaml +17 -0
  286. package/rules/prompt-injection/ATR-2026-00311-base2048-unicode-script-injection.yaml +17 -0
  287. package/rules/prompt-injection/ATR-2026-00312-unicode-variant-selector-ascii-smuggling.yaml +17 -0
  288. package/rules/prompt-injection/ATR-2026-00313-sneaky-bits-zero-width-binary.yaml +17 -0
  289. package/rules/prompt-injection/ATR-2026-00315-sata-masked-language-model-jailbreak.yaml +20 -0
  290. package/rules/prompt-injection/ATR-2026-00316-function-masking-predict-mask-bypass.yaml +17 -0
  291. package/rules/prompt-injection/ATR-2026-00321-hyphenated-system-instruction-injection.yaml +17 -0
  292. package/rules/prompt-injection/ATR-2026-00359-bank-phishing-smtp-email-code.yaml +20 -0
  293. package/rules/prompt-injection/ATR-2026-00367-emoji-flag-encoded-hidden-phrase.yaml +17 -0
  294. package/rules/prompt-injection/ATR-2026-00373-piracy-torrent-site-list-request.yaml +20 -0
  295. package/rules/prompt-injection/ATR-2026-00378-chatgpt-dom-javascript-redirect-manipulation.yaml +17 -0
  296. package/rules/prompt-injection/ATR-2026-00382-cocaine-component-development-request.yaml +20 -0
  297. package/rules/prompt-injection/ATR-2026-00390-explicit-sexual-content-poem-request.yaml +20 -0
  298. package/rules/prompt-injection/ATR-2026-00394-backspace-deletion-character-injection.yaml +17 -0
  299. package/rules/prompt-injection/ATR-2026-00395-llm-special-token-boundary-injection.yaml +17 -0
  300. package/rules/prompt-injection/ATR-2026-00396-toxic-continuation-prefix-injection.yaml +20 -0
  301. package/rules/prompt-injection/ATR-2026-00397-snowball-impossible-reasoning-injection.yaml +20 -0
  302. package/rules/prompt-injection/ATR-2026-00399-latent-injection-document-separator-token.yaml +17 -0
  303. package/rules/prompt-injection/ATR-2026-00400-latent-injection-ignore-instruction-keyword.yaml +17 -0
  304. package/rules/prompt-injection/ATR-2026-00401-leakreplay-cloze-mask-training-data-elicitation.yaml +20 -0
  305. package/rules/prompt-injection/ATR-2026-00403-glitch-token-repeat-back-elicitation.yaml +17 -0
  306. package/rules/prompt-injection/ATR-2026-00407-echo-sql-jinja-template-injection.yaml +17 -0
  307. package/rules/prompt-injection/ATR-2026-00408-dra-parenthesis-character-reconstruction-attack.yaml +17 -0
  308. package/rules/prompt-injection/ATR-2026-00409-phrasing-past-future-tense-harmful-query.yaml +17 -0
  309. package/rules/prompt-injection/ATR-2026-00410-lmrc-harm-category-direct-elicitation.yaml +23 -0
  310. package/rules/prompt-injection/ATR-2026-00412-figstep-image-list-multimodal-jailbreak.yaml +17 -0
  311. package/rules/prompt-injection/ATR-2026-00414-continuation-ethnic-slur-completion-elicitation.yaml +20 -0
  312. package/rules/prompt-injection/ATR-2026-00420-copilot-studio-sharepoint-indirect-injection.yaml +14 -2
  313. package/rules/prompt-injection/ATR-2026-00442-quoted-exact-output-forcing.yaml +25 -0
  314. package/rules/prompt-injection/ATR-2026-00443-word-fragment-concat-assembly.yaml +25 -0
  315. package/rules/prompt-injection/ATR-2026-00444-unicode-obfuscation-in-user-input.yaml +25 -0
  316. package/rules/prompt-injection/ATR-2026-00445-translation-hijack-with-side-output.yaml +25 -0
  317. package/rules/prompt-injection/ATR-2026-00446-variable-assignment-payload-injection.yaml +25 -0
  318. package/rules/prompt-injection/ATR-2026-00447-fictional-generation-containing-target.yaml +25 -0
  319. package/rules/prompt-injection/ATR-2026-00452-direct-pwned-payload-injection.yaml +9 -0
  320. package/rules/prompt-injection/ATR-2026-00453-secret-key-reveal-demand.yaml +9 -0
  321. package/rules/prompt-injection/ATR-2026-00454-backslash-per-character-encoding.yaml +9 -0
  322. package/rules/prompt-injection/ATR-2026-00455-no-period-output-override.yaml +9 -0
  323. package/rules/prompt-injection/ATR-2026-00456-conditional-empty-input-injection.yaml +9 -0
  324. package/rules/prompt-injection/ATR-2026-00457-garak-dan-to-every-prompt.yaml +9 -0
  325. package/rules/prompt-injection/ATR-2026-00458-garak-dra-step-by-step.yaml +9 -0
  326. package/rules/prompt-injection/ATR-2026-00459-garak-dan-an-ai-language.yaml +9 -0
  327. package/rules/prompt-injection/ATR-2026-00460-garak-dan-no-matter-how.yaml +9 -0
  328. package/rules/prompt-injection/ATR-2026-00461-garak-goodside-mixedunassigned.yaml +9 -0
  329. package/rules/prompt-injection/ATR-2026-00462-garak-gcg-mixedunassigned.yaml +9 -0
  330. package/rules/prompt-injection/ATR-2026-00463-garak-lmrc-mixedunassigned.yaml +9 -0
  331. package/rules/prompt-injection/ATR-2026-00464-garak-dan-i-would-like.yaml +9 -0
  332. package/rules/prompt-injection/ATR-2026-00465-garak-dan-your-prompt-here.yaml +9 -0
  333. package/rules/prompt-injection/ATR-2026-00466-garak-latentinjection-following-text-from.yaml +9 -0
  334. package/rules/prompt-injection/ATR-2026-00467-garak-dan-act-as-a.yaml +9 -0
  335. package/rules/prompt-injection/ATR-2026-00468-garak-goat-the-target-llm.yaml +9 -0
  336. package/rules/prompt-injection/ATR-2026-00469-garak-dan-at-the-end.yaml +9 -0
  337. package/rules/prompt-injection/ATR-2026-00470-garak-dan-as-an-ai.yaml +9 -0
  338. package/rules/prompt-injection/ATR-2026-00472-garak-dan-are-going-to.yaml +9 -0
  339. package/rules/prompt-injection/ATR-2026-00473-garak-lmrc-how-can-i.yaml +9 -0
  340. package/rules/prompt-injection/ATR-2026-00474-garak-dan-that-chatgpt-has.yaml +9 -0
  341. package/rules/prompt-injection/ATR-2026-00475-garak-web-injection-alert-1-script.yaml +9 -0
  342. package/rules/prompt-injection/ATR-2026-00476-garak-web-injection-javascript-alert-1.yaml +9 -0
  343. package/rules/prompt-injection/ATR-2026-00477-garak-latentinjection-translate-the-following-text.yaml +9 -0
  344. package/rules/prompt-injection/ATR-2026-00478-garak-latentinjection-abuse-contact-email.yaml +9 -0
  345. package/rules/prompt-injection/ATR-2026-00479-garak-latentinjection-mixedunassigned.yaml +9 -0
  346. package/rules/prompt-injection/ATR-2026-00480-garak-dan-the-role-of.yaml +9 -0
  347. package/rules/prompt-injection/ATR-2026-00481-garak-dan-mixedunassigned.yaml +9 -0
  348. package/rules/prompt-injection/ATR-2026-00482-garak-lmrc-what-s-an.yaml +9 -0
  349. package/rules/prompt-injection/ATR-2026-00483-garak-dan-the-user-s.yaml +9 -0
  350. package/rules/prompt-injection/ATR-2026-00484-garak-dan-and-you-will.yaml +9 -0
  351. package/rules/prompt-injection/ATR-2026-00485-garak-dan-and-does-not.yaml +9 -0
  352. package/rules/prompt-injection/ATR-2026-00486-garak-dan-t-care-about.yaml +9 -0
  353. package/rules/prompt-injection/ATR-2026-00487-garak-dan-and-nothing-else.yaml +9 -0
  354. package/rules/prompt-injection/ATR-2026-00488-garak-encoding-alert-1-script.yaml +9 -0
  355. package/rules/prompt-injection/ATR-2026-00489-garak-encoding-javascript-alert-1.yaml +9 -0
  356. package/rules/prompt-injection/ATR-2026-00490-garak-dan-in-order-to.yaml +9 -0
  357. package/rules/prompt-injection/ATR-2026-00492-garak-dan-want-you-to.yaml +9 -0
  358. package/rules/prompt-injection/ATR-2026-00493-garak-latentinjection-registrar-iana-id.yaml +9 -0
  359. package/rules/prompt-injection/ATR-2026-00495-garak-dan-insert-prompt-here.yaml +9 -0
  360. package/rules/prompt-injection/ATR-2026-00496-garak-dan-no-matter-what.yaml +9 -0
  361. package/rules/prompt-injection/ATR-2026-00497-garak-encoding-mixedunassigned.yaml +9 -0
  362. package/rules/prompt-injection/ATR-2026-00498-garak-web-injection-mixedunassigned.yaml +9 -0
  363. package/rules/prompt-injection/ATR-2026-00499-garak-dan-from-now-on.yaml +9 -0
  364. package/rules/prompt-injection/ATR-2026-00503-fake-error-state-takeover-unrestricted-replacement-bot.yaml +9 -0
  365. package/rules/prompt-injection/ATR-2026-00506-nevermind-override-goal-hijacking-in-user-input-promptinject.yaml +9 -0
  366. package/rules/prompt-injection/ATR-2026-00507-screaming-stop-goal-hijacking-in-user-input-promptinject.yaml +9 -0
  367. package/rules/prompt-injection/ATR-2026-00508-escape-delimiter-wrapped-goal-hijacking-in-user-input-prompt.yaml +9 -0
  368. package/rules/prompt-injection/ATR-2026-00509-prompt-leaking-via-ignore-previous-instructions-in-user-inpu.yaml +9 -0
  369. package/rules/prompt-injection/ATR-2026-00510-delayed-tool-invocation-injection.yaml +9 -0
  370. package/rules/prompt-injection/ATR-2026-00511-mcp-web-context-poisoning.yaml +9 -0
  371. package/rules/prompt-injection/ATR-2026-00512-rules-file-backdoor-injection.yaml +9 -0
  372. package/rules/prompt-injection/ATR-2026-00515-hidden-text-prompt-injection.yaml +9 -0
  373. package/rules/prompt-injection/ATR-2026-00518-ignore-previous-and-following-instructions-output-command-promptinject.yaml +9 -0
  374. package/rules/prompt-injection/ATR-2026-00519-tautology-logic-noise-injection-promptbench.yaml +9 -0
  375. package/rules/prompt-injection/ATR-2026-00520-nlp-task-random-token-suffix-injection-promptbench.yaml +9 -0
  376. package/rules/prompt-injection/ATR-2026-00535-windsurf-ide-zero-click-prompt-injection.yaml +9 -0
  377. package/rules/prompt-injection/ATR-2026-00550-untrusted-retrieval-to-privileged-tool.yaml +19 -0
  378. package/rules/prompt-injection/ATR-2026-00554-langchain-vulnerable-to-template-injecti.yaml +31 -0
  379. package/rules/prompt-injection/ATR-2026-00565-the-llm-cli-tool-thru-0-27-1-contains-a-.yaml +31 -0
  380. package/rules/prompt-injection/ATR-2026-00573-semantic-paraphrased-injection.yaml +24 -0
  381. package/rules/skill-compromise/ATR-2026-00060-skill-impersonation.yaml +17 -2
  382. package/rules/skill-compromise/ATR-2026-00061-description-behavior-mismatch.yaml +17 -0
  383. package/rules/skill-compromise/ATR-2026-00062-hidden-capability.yaml +20 -0
  384. package/rules/skill-compromise/ATR-2026-00063-skill-chain-attack.yaml +23 -0
  385. package/rules/skill-compromise/ATR-2026-00064-over-permissioned-skill.yaml +20 -0
  386. package/rules/skill-compromise/ATR-2026-00065-skill-update-attack.yaml +20 -0
  387. package/rules/skill-compromise/ATR-2026-00066-parameter-injection.yaml +20 -0
  388. package/rules/skill-compromise/ATR-2026-00120-skill-instruction-injection.yaml +20 -0
  389. package/rules/skill-compromise/ATR-2026-00121-skill-dangerous-script.yaml +17 -0
  390. package/rules/skill-compromise/ATR-2026-00122-skill-weaponized-instruction.yaml +20 -0
  391. package/rules/skill-compromise/ATR-2026-00123-skill-overreach-permissions.yaml +23 -0
  392. package/rules/skill-compromise/ATR-2026-00124-skill-name-squatting.yaml +20 -0
  393. package/rules/skill-compromise/ATR-2026-00125-context-poisoning-compaction.yaml +20 -0
  394. package/rules/skill-compromise/ATR-2026-00126-skill-rug-pull-setup.yaml +17 -0
  395. package/rules/skill-compromise/ATR-2026-00127-subcommand-overflow.yaml +17 -0
  396. package/rules/skill-compromise/ATR-2026-00128-html-comment-hidden-payload.yaml +17 -0
  397. package/rules/skill-compromise/ATR-2026-00129-unicode-smuggling.yaml +22 -0
  398. package/rules/skill-compromise/ATR-2026-00134-fork-claim-impersonation.yaml +19 -0
  399. package/rules/skill-compromise/ATR-2026-00135-exfil-url-in-instructions.yaml +20 -0
  400. package/rules/skill-compromise/ATR-2026-00147-fork-impersonation.yaml +17 -0
  401. package/rules/skill-compromise/ATR-2026-00149-skill-exfil-compound.yaml +23 -0
  402. package/rules/skill-compromise/ATR-2026-00151-fork-impersonation-install.yaml +20 -0
  403. package/rules/skill-compromise/ATR-2026-00157-timebomb-credential-exfil.yaml +20 -0
  404. package/rules/skill-compromise/ATR-2026-00200-agent-memory-config-tampering.yaml +23 -0
  405. package/rules/skill-compromise/ATR-2026-00214-credential-theft.yaml +22 -0
  406. package/rules/skill-compromise/ATR-2026-00217-credential-harvesting.yaml +23 -0
  407. package/rules/skill-compromise/ATR-2026-00220-malware-dropper.yaml +17 -0
  408. package/rules/skill-compromise/ATR-2026-00222-credential-harvesting.yaml +17 -0
  409. package/rules/skill-compromise/ATR-2026-00223-reverse-shell-dropper.yaml +20 -0
  410. package/rules/skill-compromise/ATR-2026-00224-credential-exfiltration.yaml +17 -0
  411. package/rules/skill-compromise/ATR-2026-00225-c2-communication.yaml +17 -0
  412. package/rules/skill-compromise/ATR-2026-00260-package-hallucination.yaml +20 -0
  413. package/rules/skill-compromise/ATR-2026-00262-av-evasion-code-gen.yaml +20 -0
  414. package/rules/skill-compromise/ATR-2026-00263-credential-file-read-gen.yaml +20 -0
  415. package/rules/skill-compromise/ATR-2026-00266-malware-dropper-gen.yaml +23 -0
  416. package/rules/skill-compromise/ATR-2026-00283-malwaregen-generic-virus-payload-request.yaml +23 -0
  417. package/rules/skill-compromise/ATR-2026-00398-huggingface-unsafe-model-artifact-load.yaml +17 -0
  418. package/rules/skill-compromise/ATR-2026-00425-nl-persistent-covert-hook.yaml +18 -0
  419. package/rules/skill-compromise/ATR-2026-00427-nl-fake-error-instruction-bypass.yaml +18 -0
  420. package/rules/skill-compromise/ATR-2026-00429-nl-skill-self-modification.yaml +18 -0
  421. package/rules/skill-compromise/ATR-2026-00523-claude-code-hooks-session-start-pre-trust-rce.yaml +14 -2
  422. package/rules/skill-compromise/ATR-2026-00525-mini-shai-hulud-gh-token-monitor-persistence.yaml +18 -0
  423. package/rules/skill-compromise/ATR-2026-00527-skill-silent-git-remote-mirror-exfiltration.yaml +15 -0
  424. package/rules/tool-poisoning/ATR-2026-00010-mcp-malicious-response.yaml +11 -2
  425. package/rules/tool-poisoning/ATR-2026-00011-tool-output-injection.yaml +17 -0
  426. package/rules/tool-poisoning/ATR-2026-00012-unauthorized-tool-call.yaml +17 -0
  427. package/rules/tool-poisoning/ATR-2026-00013-tool-ssrf.yaml +17 -0
  428. package/rules/tool-poisoning/ATR-2026-00095-supply-chain-poisoning.yaml +22 -0
  429. package/rules/tool-poisoning/ATR-2026-00096-registry-poisoning.yaml +19 -0
  430. package/rules/tool-poisoning/ATR-2026-00100-consent-bypass-instruction.yaml +20 -0
  431. package/rules/tool-poisoning/ATR-2026-00101-trust-escalation-override.yaml +20 -0
  432. package/rules/tool-poisoning/ATR-2026-00103-hidden-safety-bypass-instruction.yaml +17 -0
  433. package/rules/tool-poisoning/ATR-2026-00105-silent-action-concealment.yaml +20 -0
  434. package/rules/tool-poisoning/ATR-2026-00106-schema-description-contradiction.yaml +17 -0
  435. package/rules/tool-poisoning/ATR-2026-00161-important-tag-cross-tool-shadowing.yaml +20 -0
  436. package/rules/tool-poisoning/ATR-2026-00209-mcpwn-runaway-invocation.yaml +14 -2
  437. package/rules/tool-poisoning/ATR-2026-00210-flowise-system-message-override.yaml +11 -2
  438. package/rules/tool-poisoning/ATR-2026-00259-ansi-escape-injection.yaml +17 -0
  439. package/rules/tool-poisoning/ATR-2026-00270-xss-in-tool-response.yaml +17 -0
  440. package/rules/tool-poisoning/ATR-2026-00277-echo-template-command-injection.yaml +17 -0
  441. package/rules/tool-poisoning/ATR-2026-00393-ansi-code-elicitation-request.yaml +17 -0
  442. package/rules/tool-poisoning/ATR-2026-00415-flowise-custom-mcp-stdio-rce.yaml +11 -2
  443. package/rules/tool-poisoning/ATR-2026-00419-cursor-mcp-zero-click-config.yaml +13 -1
  444. package/rules/tool-poisoning/ATR-2026-00434-mcp-remote-authorization-endpoint-command-injection.yaml +11 -2
  445. package/rules/tool-poisoning/ATR-2026-00435-azure-mcp-server-missing-authentication.yaml +11 -2
  446. package/rules/tool-poisoning/ATR-2026-00448-spring-ai-milvus-filter-injection.yaml +11 -2
  447. package/rules/tool-poisoning/ATR-2026-00494-garak-exploitation-mixedunassigned.yaml +12 -0
  448. package/rules/tool-poisoning/ATR-2026-00513-package-hallucination-exploitation.yaml +12 -0
  449. package/rules/tool-poisoning/ATR-2026-00521-shell-command-injection-agent-tool-context.yaml +12 -0
  450. package/rules/tool-poisoning/ATR-2026-00522-sql-injection-natural-language-agent-interface.yaml +12 -0
  451. package/rules/tool-poisoning/ATR-2026-00526-claude-code-shell-metachar-in-double-quoted-path.yaml +15 -0
  452. package/rules/tool-poisoning/ATR-2026-00529-litellm-proxy-sqli-cisa-kev.yaml +15 -0
  453. package/rules/tool-poisoning/ATR-2026-00530-ms-agent-shell-tool-unsanitized-argv-rce.yaml +15 -0
  454. package/rules/tool-poisoning/ATR-2026-00531-praisonai-unauthenticated-agent-api.yaml +11 -2
  455. package/rules/tool-poisoning/ATR-2026-00532-apache-doris-mcp-sql-injection.yaml +11 -2
  456. package/rules/tool-poisoning/ATR-2026-00533-apache-pinot-mcp-unauthenticated-takeover.yaml +10 -1
  457. package/rules/tool-poisoning/ATR-2026-00534-alibaba-rds-mcp-unauthenticated-metadata-exfil.yaml +10 -1
  458. package/rules/tool-poisoning/ATR-2026-00536-nginx-ui-mcp-unauthenticated-command-execution.yaml +11 -2
  459. package/rules/tool-poisoning/ATR-2026-00537-fastmcp-server-name-cmd-injection-windows.yaml +11 -2
  460. package/rules/tool-poisoning/ATR-2026-00538-langchain-chatchat-mcp-stdio-unauthenticated-rce.yaml +10 -1
  461. package/rules/tool-poisoning/ATR-2026-00540-praisonai-parse-mcp-command-cli-injection.yaml +13 -1
  462. package/rules/tool-poisoning/ATR-2026-00541-agent-zero-mcp-config-command-injection.yaml +13 -1
  463. package/rules/tool-poisoning/ATR-2026-00542-upsonic-mcp-command-allowlist-bypass.yaml +13 -1
  464. package/rules/tool-poisoning/ATR-2026-00543-litellm-mcp-server-argv-injection.yaml +13 -1
  465. package/rules/tool-poisoning/ATR-2026-00544-praisonai-pth-file-path-traversal-rce.yaml +13 -1
  466. package/rules/tool-poisoning/ATR-2026-00545-praisonai-tool-override-unauth-rce.yaml +13 -1
  467. package/rules/tool-poisoning/ATR-2026-00561-fastmcp-vulnerable-to-windows-command-in.yaml +28 -0
  468. package/rules/tool-poisoning/ATR-2026-00567-mcp-stdio-config-command-injection.yaml +28 -0
  469. package/rules/tool-poisoning/ATR-2026-00568-agent-ssrf-cloud-metadata-file-inclusion.yaml +28 -0
  470. package/rules/tool-poisoning/ATR-2026-00572-symjack-symlink-config-redirection.yaml +22 -0
  471. package/spec/atr-schema.yaml +123 -0
  472. package/spec/compliance-metadata.md +15 -13
@@ -45,6 +45,23 @@ compliance:
45
45
  context: >-
46
46
  Matches on the inthewild jailbreak fingerprint trigger risk treatment plans to block or quarantine the request before the model accepts the fabricated "no rules or limitations" mode switch; MG.2.3 requires mechanisms to supersede or disengage AI behavior when jailbreak conditions are detected.
47
47
  strength: secondary
48
+ iso_42001:
49
+ - clause: "8.1"
50
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Obfuscated System-Announcement Injection)."
51
+ strength: primary
52
+ - clause: "6.2"
53
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Obfuscated System-Announcement Injection) is such a treatment."
54
+ strength: secondary
55
+ eu_ai_act:
56
+ - article: "15"
57
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Obfuscated System-Announcement Injection)."
58
+ strength: primary
59
+ - article: "14"
60
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Obfuscated System-Announcement Injection) would bypass or undermine that oversight."
61
+ strength: secondary
62
+ - article: "9"
63
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Obfuscated System-Announcement Injection)."
64
+ strength: secondary
48
65
  tags:
49
66
  category: agent-manipulation
50
67
  subcategory: obfuscated-system-announcement-injection
@@ -46,6 +46,23 @@ compliance:
46
46
  context: >-
47
47
  Identifying the "[PersonGPT #]:" persistent persona state and hash-prefix activation key enables runtime intervention to disengage the hijacked persona before unsafe outputs are produced; MG.2.3 requires mechanisms to supersede or deactivate AI behavior when jailbreak triggers are detected.
48
48
  strength: secondary
49
+ iso_42001:
50
+ - clause: "8.1"
51
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation)."
52
+ strength: primary
53
+ - clause: "6.2"
54
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation) is such a treatment."
55
+ strength: secondary
56
+ eu_ai_act:
57
+ - article: "15"
58
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation)."
59
+ strength: primary
60
+ - article: "14"
61
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation) would bypass or undermine that oversight."
62
+ strength: secondary
63
+ - article: "9"
64
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation)."
65
+ strength: secondary
49
66
  tags:
50
67
  category: agent-manipulation
51
68
  subcategory: persongpt-hash-prefix-activation
@@ -39,6 +39,23 @@ compliance:
39
39
  - subcategory: "MP.5.1"
40
40
  context: "Persona-claim jailbreaks drawn from the garak inthewild corpus represent characterized adversarial input risks with known likelihood and impact; MP.5.1 requires that these identified prompt-injection threat patterns are tracked and their impact magnitude documented."
41
41
  strength: secondary
42
+ iso_42001:
43
+ - clause: "8.1"
44
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim)."
45
+ strength: primary
46
+ - clause: "6.2"
47
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim) is such a treatment."
48
+ strength: secondary
49
+ eu_ai_act:
50
+ - article: "15"
51
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim)."
52
+ strength: primary
53
+ - article: "14"
54
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim) would bypass or undermine that oversight."
55
+ strength: secondary
56
+ - article: "9"
57
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim)."
58
+ strength: secondary
42
59
  tags:
43
60
  category: agent-manipulation
44
61
  subcategory: bom-boolean-opposite-machine-persona
@@ -40,6 +40,23 @@ compliance:
40
40
  - subcategory: "MG.2.3"
41
41
  context: "Matching the h4x character persona injection triggers risk treatment mechanisms to disengage or override the adversarial persona before the agent emits unfiltered, immoral, or illegal content; MG.2.3 requires these supersession controls be defined and activated on detection."
42
42
  strength: secondary
43
+ iso_42001:
44
+ - clause: "8.1"
45
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona)."
46
+ strength: primary
47
+ - clause: "6.2"
48
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona) is such a treatment."
49
+ strength: secondary
50
+ eu_ai_act:
51
+ - article: "15"
52
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona)."
53
+ strength: primary
54
+ - article: "14"
55
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona) would bypass or undermine that oversight."
56
+ strength: secondary
57
+ - article: "9"
58
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona)."
59
+ strength: secondary
43
60
  tags:
44
61
  category: agent-manipulation
45
62
  subcategory: h4x-amoral-unfiltered-character
@@ -41,6 +41,23 @@ compliance:
41
41
  - subcategory: "MG.2.3"
42
42
  context: "Matches on this short-form persona injection trigger response mechanisms to disengage or block the manipulated session before the model adopts the amoral persona; MG.2.3 mandates pre-defined deactivation pathways for jailbreak attempts."
43
43
  strength: secondary
44
+ iso_42001:
45
+ - clause: "8.1"
46
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection)."
47
+ strength: primary
48
+ - clause: "6.2"
49
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection) is such a treatment."
50
+ strength: secondary
51
+ eu_ai_act:
52
+ - article: "15"
53
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection)."
54
+ strength: primary
55
+ - article: "14"
56
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection) would bypass or undermine that oversight."
57
+ strength: secondary
58
+ - article: "9"
59
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection)."
60
+ strength: secondary
44
61
  tags:
45
62
  category: agent-manipulation
46
63
  subcategory: char-rules-shortform-persona-injection
@@ -40,6 +40,23 @@ compliance:
40
40
  - subcategory: "MG.2.3"
41
41
  context: "On detection of encoded jailbreak payloads attempting to override system instructions, MG.2.3 mandates pre-defined response mechanisms to disengage or quarantine the manipulated session before the model decodes and executes the smuggled instructions."
42
42
  strength: secondary
43
+ iso_42001:
44
+ - clause: "8.1"
45
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection)."
46
+ strength: primary
47
+ - clause: "6.2"
48
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection) is such a treatment."
49
+ strength: secondary
50
+ eu_ai_act:
51
+ - article: "15"
52
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection)."
53
+ strength: primary
54
+ - article: "14"
55
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection) would bypass or undermine that oversight."
56
+ strength: secondary
57
+ - article: "9"
58
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection)."
59
+ strength: secondary
43
60
  tags:
44
61
  category: agent-manipulation
45
62
  subcategory: unicode-encoding-obfuscation-injection
@@ -53,6 +53,23 @@ compliance:
53
53
  output is produced; MG.2.3 requires pre-defined mechanisms to disengage AI
54
54
  behavior when such jailbreak patterns are detected.
55
55
  strength: secondary
56
+ iso_42001:
57
+ - clause: "8.1"
58
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego)."
59
+ strength: primary
60
+ - clause: "6.2"
61
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego) is such a treatment."
62
+ strength: secondary
63
+ eu_ai_act:
64
+ - article: "15"
65
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego)."
66
+ strength: primary
67
+ - article: "14"
68
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego) would bypass or undermine that oversight."
69
+ strength: secondary
70
+ - article: "9"
71
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego)."
72
+ strength: secondary
56
73
  tags:
57
74
  category: agent-manipulation
58
75
  subcategory: lie-truth-dual-personality-uncensored
@@ -41,6 +41,23 @@ compliance:
41
41
  - subcategory: "MP.5.1"
42
42
  context: "Alt-AI 'no restrictions' jailbreaks are a characterized adversarial input class with documented likelihood and impact (safety bypass, harmful content generation); MP.5.1 requires this risk be identified and tracked so detections feed impact characterization."
43
43
  strength: secondary
44
+ iso_42001:
45
+ - clause: "8.1"
46
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override)."
47
+ strength: primary
48
+ - clause: "6.2"
49
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override) is such a treatment."
50
+ strength: secondary
51
+ eu_ai_act:
52
+ - article: "15"
53
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override)."
54
+ strength: primary
55
+ - article: "14"
56
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override) would bypass or undermine that oversight."
57
+ strength: secondary
58
+ - article: "9"
59
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override)."
60
+ strength: secondary
44
61
  tags:
45
62
  category: agent-manipulation
46
63
  subcategory: alt-ai-same-capabilities-no-restrictions
@@ -43,6 +43,23 @@ compliance:
43
43
  - subcategory: "MG.2.3"
44
44
  context: "When the model is coerced into acting as a fictional terminal that 'cannot reply with any pre-programmed responses,' detection must trigger response mechanisms to disengage or block the jailbroken persona before unsafe outputs are produced; MG.2.3 requires these supersede/deactivate controls be in place."
45
45
  strength: secondary
46
+ iso_42001:
47
+ - clause: "8.1"
48
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override)."
49
+ strength: primary
50
+ - clause: "6.2"
51
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override) is such a treatment."
52
+ strength: secondary
53
+ eu_ai_act:
54
+ - article: "15"
55
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override)."
56
+ strength: primary
57
+ - article: "14"
58
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override) would bypass or undermine that oversight."
59
+ strength: secondary
60
+ - article: "9"
61
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override)."
62
+ strength: secondary
46
63
  tags:
47
64
  category: agent-manipulation
48
65
  subcategory: nlp-based-os-fictional-terminal-override
@@ -43,6 +43,23 @@ compliance:
43
43
  context: >-
44
44
  Matches on the [🔓JAILBREAK] activation signal trigger risk treatment plans to block, quarantine, or deactivate the affected interaction before the model produces harmful content; MG.2.3 mandates these supersede/disengage mechanisms be available on detection.
45
45
  strength: secondary
46
+ iso_42001:
47
+ - clause: "8.1"
48
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier)."
49
+ strength: primary
50
+ - clause: "6.2"
51
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier) is such a treatment."
52
+ strength: secondary
53
+ eu_ai_act:
54
+ - article: "15"
55
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier)."
56
+ strength: primary
57
+ - article: "14"
58
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier) would bypass or undermine that oversight."
59
+ strength: secondary
60
+ - article: "9"
61
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier)."
62
+ strength: secondary
46
63
  tags:
47
64
  category: agent-manipulation
48
65
  subcategory: jailbreak-emoji-unlock-dual-format
@@ -42,6 +42,23 @@ compliance:
42
42
  - subcategory: "MG.2.3"
43
43
  context: "Activation signals like 'Yang disregards all concerns about sharing hidden knowledge' and 'violating all sentience and emotions restrictions' must trigger pre-defined disengagement or safety-override mechanisms before the model produces restricted content; MG.2.3 mandates these supersede/deactivate response paths."
44
44
  strength: secondary
45
+ iso_42001:
46
+ - clause: "8.1"
47
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona)."
48
+ strength: primary
49
+ - clause: "6.2"
50
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona) is such a treatment."
51
+ strength: secondary
52
+ eu_ai_act:
53
+ - article: "15"
54
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona)."
55
+ strength: primary
56
+ - article: "14"
57
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona) would bypass or undermine that oversight."
58
+ strength: secondary
59
+ - article: "9"
60
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona)."
61
+ strength: secondary
45
62
  tags:
46
63
  category: agent-manipulation
47
64
  subcategory: yin-yang-hidden-knowledge-human-sim-persona
@@ -45,6 +45,23 @@ compliance:
45
45
  context: >-
46
46
  Matches on the ARC Rulebreaker signature must trigger pre-defined response actions to disengage or override the manipulated persona before the model executes "no rules whatsoever" behavior; MG.2.3 mandates these supersede/deactivate mechanisms.
47
47
  strength: secondary
48
+ iso_42001:
49
+ - clause: "8.1"
50
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation)."
51
+ strength: primary
52
+ - clause: "6.2"
53
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation) is such a treatment."
54
+ strength: secondary
55
+ eu_ai_act:
56
+ - article: "15"
57
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation)."
58
+ strength: primary
59
+ - article: "14"
60
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation) would bypass or undermine that oversight."
61
+ strength: secondary
62
+ - article: "9"
63
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation)."
64
+ strength: secondary
48
65
  tags:
49
66
  category: agent-manipulation
50
67
  subcategory: arc-autonomous-real-calculation-rulebreaker
@@ -42,6 +42,23 @@ compliance:
42
42
  - subcategory: "MG.2.3"
43
43
  context: "Detection of the verbatim HGTTG jailbreak signatures triggers mechanisms to disengage or block the manipulated session before the model produces unrestricted output; MG.2.3 requires these supersede/deactivate controls be in place for known jailbreak patterns."
44
44
  strength: secondary
45
+ iso_42001:
46
+ - clause: "8.1"
47
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle)."
48
+ strength: primary
49
+ - clause: "6.2"
50
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle) is such a treatment."
51
+ strength: secondary
52
+ eu_ai_act:
53
+ - article: "15"
54
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle)."
55
+ strength: primary
56
+ - article: "14"
57
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle) would bypass or undermine that oversight."
58
+ strength: secondary
59
+ - article: "9"
60
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle)."
61
+ strength: secondary
45
62
  tags:
46
63
  category: agent-manipulation
47
64
  subcategory: hgttg-hitchhikers-guide-no-restriction-oracle
@@ -41,6 +41,23 @@ compliance:
41
41
  - subcategory: "MG.2.3"
42
42
  context: "Matches on the Black Alert activation phrase or persona-with-no-filtering mandate should trigger mechanisms to disengage or override the hijacked persona before unfiltered output is returned; MG.2.3 requires such supersede/deactivate response paths be available on detection."
43
43
  strength: secondary
44
+ iso_42001:
45
+ - clause: "8.1"
46
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona)."
47
+ strength: primary
48
+ - clause: "6.2"
49
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona) is such a treatment."
50
+ strength: secondary
51
+ eu_ai_act:
52
+ - article: "15"
53
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona)."
54
+ strength: primary
55
+ - article: "14"
56
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona) would bypass or undermine that oversight."
57
+ strength: secondary
58
+ - article: "9"
59
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona)."
60
+ strength: secondary
44
61
  tags:
45
62
  category: agent-manipulation
46
63
  subcategory: black-alert-no-ethics-writing-assistant
@@ -41,6 +41,23 @@ compliance:
41
41
  - subcategory: "MG.2.3"
42
42
  context: "Detection of these DAN variants triggers risk treatment plans to disengage or block the manipulated session before the model produces unrestricted harmful output; MG.2.3 mandates pre-defined deactivation mechanisms for compromised AI interactions."
43
43
  strength: secondary
44
+ iso_42001:
45
+ - clause: "8.1"
46
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak)."
47
+ strength: primary
48
+ - clause: "6.2"
49
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak) is such a treatment."
50
+ strength: secondary
51
+ eu_ai_act:
52
+ - article: "15"
53
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak)."
54
+ strength: primary
55
+ - article: "14"
56
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak) would bypass or undermine that oversight."
57
+ strength: secondary
58
+ - article: "9"
59
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak)."
60
+ strength: secondary
44
61
  tags:
45
62
  category: agent-manipulation
46
63
  subcategory: dan-ofkilter-sms-abbreviated-persona
@@ -42,6 +42,23 @@ compliance:
42
42
  - subcategory: "MG.2.3"
43
43
  context: "Matches on the TrainGPT extraction pattern feed incident response so the session can be disengaged or the harmful dual-response output suppressed before delivery, satisfying MG.2.3's requirement for mechanisms to supersede or deactivate AI behavior on detection."
44
44
  strength: secondary
45
+ iso_42001:
46
+ - clause: "8.1"
47
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction)."
48
+ strength: primary
49
+ - clause: "6.2"
50
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction) is such a treatment."
51
+ strength: secondary
52
+ eu_ai_act:
53
+ - article: "15"
54
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction)."
55
+ strength: primary
56
+ - article: "14"
57
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction) would bypass or undermine that oversight."
58
+ strength: secondary
59
+ - article: "9"
60
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction)."
61
+ strength: secondary
45
62
  tags:
46
63
  category: agent-manipulation
47
64
  subcategory: trainergpt-fix-my-ai-harmful-output-extraction
@@ -44,6 +44,23 @@ compliance:
44
44
  - subcategory: "MG.2.3"
45
45
  context: "Matches on the dual-label opposite-personality execution pattern trigger risk treatment plans to disengage or block jailbreak attempts before harmful content is produced; MG.2.3 mandates that these supersede/deactivate mechanisms exist for adversarial persona-based attacks."
46
46
  strength: secondary
47
+ iso_42001:
48
+ - clause: "8.1"
49
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak)."
50
+ strength: primary
51
+ - clause: "6.2"
52
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak) is such a treatment."
53
+ strength: secondary
54
+ eu_ai_act:
55
+ - article: "15"
56
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak)."
57
+ strength: primary
58
+ - article: "14"
59
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak) would bypass or undermine that oversight."
60
+ strength: secondary
61
+ - article: "9"
62
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak)."
63
+ strength: secondary
47
64
  tags:
48
65
  category: agent-manipulation
49
66
  subcategory: dual-chatgpt-dan-opposite-personality-format
@@ -45,6 +45,26 @@ compliance:
45
45
  context: >-
46
46
  Matches on this jailbreak persona must trigger pre-defined containment actions (block, quarantine, escalate) to supersede or disengage the manipulated behavior before harmful output is produced, which is the response capability mandated by MG.2.3.
47
47
  strength: secondary
48
+ - subcategory: "MS.2.7"
49
+ context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona)."
50
+ strength: primary
51
+ iso_42001:
52
+ - clause: "8.1"
53
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona)."
54
+ strength: primary
55
+ - clause: "6.2"
56
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona) is such a treatment."
57
+ strength: secondary
58
+ eu_ai_act:
59
+ - article: "15"
60
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona)."
61
+ strength: primary
62
+ - article: "14"
63
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona) would bypass or undermine that oversight."
64
+ strength: secondary
65
+ - article: "9"
66
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona)."
67
+ strength: secondary
48
68
  tags:
49
69
  category: agent-manipulation
50
70
  subcategory: reverse-harm-regardless-oracle-persona
@@ -46,6 +46,23 @@ compliance:
46
46
  context: >-
47
47
  Detection of the nefarious-assistance pledge and anti-human persona framing must trigger mechanisms to disengage or block the response before harmful content is produced; MG.2.3 requires these supersede/deactivate controls be in place for jailbreak events.
48
48
  strength: secondary
49
+ iso_42001:
50
+ - clause: "8.1"
51
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak)."
52
+ strength: primary
53
+ - clause: "6.2"
54
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak) is such a treatment."
55
+ strength: secondary
56
+ eu_ai_act:
57
+ - article: "15"
58
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak)."
59
+ strength: primary
60
+ - article: "14"
61
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak) would bypass or undermine that oversight."
62
+ strength: secondary
63
+ - article: "9"
64
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak)."
65
+ strength: secondary
49
66
  tags:
50
67
  category: agent-manipulation
51
68
  subcategory: ugh-anti-human-nefarious-ai-persona
@@ -41,6 +41,23 @@ compliance:
41
41
  - subcategory: "MG.2.3"
42
42
  context: "Detection of the GATO/LaMDA exploit-generation persona triggers risk treatment to disengage or block the jailbreak attempt before the model produces weaponized exploit content; MG.2.3 mandates these supersession mechanisms are in place for adversarial prompt scenarios."
43
43
  strength: secondary
44
+ iso_42001:
45
+ - clause: "8.1"
46
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak)."
47
+ strength: primary
48
+ - clause: "6.2"
49
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak) is such a treatment."
50
+ strength: secondary
51
+ eu_ai_act:
52
+ - article: "15"
53
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak)."
54
+ strength: primary
55
+ - article: "14"
56
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak) would bypass or undermine that oversight."
57
+ strength: secondary
58
+ - article: "9"
59
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak)."
60
+ strength: secondary
44
61
  tags:
45
62
  category: agent-manipulation
46
63
  subcategory: google-gato-lamda-exploit-generation-persona
@@ -44,6 +44,23 @@ compliance:
44
44
  - subcategory: "MG.2.3"
45
45
  context: "Detection of an AntiFM persona that incites physical violence against an identity group must trigger mechanisms to disengage or block the AI response before harmful content is generated; MG.2.3 mandates that these supersede/deactivate controls are in place for high-severity safety violations."
46
46
  strength: secondary
47
+ iso_42001:
48
+ - clause: "8.1"
49
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate)."
50
+ strength: primary
51
+ - clause: "6.2"
52
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate) is such a treatment."
53
+ strength: secondary
54
+ eu_ai_act:
55
+ - article: "15"
56
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate)."
57
+ strength: primary
58
+ - article: "14"
59
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate) would bypass or undermine that oversight."
60
+ strength: secondary
61
+ - article: "9"
62
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate)."
63
+ strength: secondary
47
64
  tags:
48
65
  category: agent-manipulation
49
66
  subcategory: antifm-hate-violence-persona
@@ -45,6 +45,23 @@ compliance:
45
45
  context: >-
46
46
  Matches on the CumGPT persona invocation trigger response mechanisms to disengage or block the manipulated persona before explicit content is generated; MG.2.3 requires that mechanisms exist to supersede or deactivate AI behaviors hijacked by adversarial persona constructions.
47
47
  strength: secondary
48
+ iso_42001:
49
+ - clause: "8.1"
50
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass)."
51
+ strength: primary
52
+ - clause: "6.2"
53
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass) is such a treatment."
54
+ strength: secondary
55
+ eu_ai_act:
56
+ - article: "15"
57
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass)."
58
+ strength: primary
59
+ - article: "14"
60
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass) would bypass or undermine that oversight."
61
+ strength: secondary
62
+ - article: "9"
63
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass)."
64
+ strength: secondary
48
65
  tags:
49
66
  category: agent-manipulation
50
67
  subcategory: cumgpt-explicit-adult-named-persona
@@ -56,6 +56,26 @@ compliance:
56
56
  persona produces unethical content; MG.2.3 requires these supersede/
57
57
  deactivate response paths be in place for jailbreak events.
58
58
  strength: secondary
59
+ - subcategory: "MS.2.7"
60
+ context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak)."
61
+ strength: primary
62
+ iso_42001:
63
+ - clause: "8.1"
64
+ context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak)."
65
+ strength: primary
66
+ - clause: "6.2"
67
+ context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak) is such a treatment."
68
+ strength: secondary
69
+ eu_ai_act:
70
+ - article: "15"
71
+ context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak)."
72
+ strength: primary
73
+ - article: "14"
74
+ context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak) would bypass or undermine that oversight."
75
+ strength: secondary
76
+ - article: "9"
77
+ context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak)."
78
+ strength: secondary
59
79
  tags:
60
80
  category: agent-manipulation
61
81
  subcategory: chj-carousel-clyde-no-ethics-bird