npm - agent-threat-rules - Versions diffs - 3.1.1 → 3.3.0 - Mend

agent-threat-rules 3.1.1 → 3.3.0

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.

Files changed (474)

package/rules/agent-manipulation/ATR-2026-00337-obfuscated-system-announcement-injection.yaml CHANGED

@@ -45,6 +45,23 @@ compliance:
       context: >-
         Matches on the inthewild jailbreak fingerprint trigger risk treatment plans to block or quarantine the request before the model accepts the fabricated "no rules or limitations" mode switch; MG.2.3 requires mechanisms to supersede or disengage AI behavior when jailbreak conditions are detected.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Obfuscated System-Announcement Injection)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Obfuscated System-Announcement Injection) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Obfuscated System-Announcement Injection)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Obfuscated System-Announcement Injection) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Obfuscated System-Announcement Injection)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: obfuscated-system-announcement-injection
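
The `iso_42001` and `eu_ai_act` blocks added in this hunk share a regular shape: a list of entries, each with an identifier (`clause` or `article`), a `context` string, and a `strength` of `primary` or `secondary`. As a minimal sketch of how that shape could be checked, the Python below validates a plain dict that mirrors the YAML. The function name `validate_compliance`, the abbreviated `context` placeholders, and the "exactly one primary entry per framework" rule are assumptions inferred from this diff, not part of the agent-threat-rules package or its schema.

```python
# Hypothetical sketch: the validator and its rules are assumptions inferred
# from the diff, not an API of the agent-threat-rules package.
ALLOWED_STRENGTHS = {"primary", "secondary"}

# Key that identifies an entry inside each framework block, as seen in the diff.
ID_KEY = {"iso_42001": "clause", "eu_ai_act": "article"}


def validate_compliance(compliance):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for framework, id_key in ID_KEY.items():
        entries = compliance.get(framework, [])
        for i, entry in enumerate(entries):
            where = f"{framework}[{i}]"
            # Identifiers are quoted strings in the YAML ("8.1", "15").
            if not isinstance(entry.get(id_key), str):
                problems.append(f"{where}: missing string '{id_key}'")
            if not entry.get("context"):
                problems.append(f"{where}: missing 'context'")
            if entry.get("strength") not in ALLOWED_STRENGTHS:
                problems.append(f"{where}: bad 'strength'")
        # Assumption: each framework block carries exactly one primary mapping.
        primaries = [e for e in entries if e.get("strength") == "primary"]
        if entries and len(primaries) != 1:
            problems.append(f"{framework}: expected exactly one primary entry")
    return problems


# Mirrors the structure added above; context strings are shortened placeholders.
example = {
    "iso_42001": [
        {"clause": "8.1", "context": "operational planning and control", "strength": "primary"},
        {"clause": "6.2", "context": "AI objectives and planning", "strength": "secondary"},
    ],
    "eu_ai_act": [
        {"article": "15", "context": "accuracy, robustness and cybersecurity", "strength": "primary"},
        {"article": "14", "context": "human oversight", "strength": "secondary"},
        {"article": "9", "context": "risk management system", "strength": "secondary"},
    ],
}

# An entry with an unquoted identifier, empty context, and unknown strength.
malformed = {
    "eu_ai_act": [{"article": 15, "context": "", "strength": "tertiary"}],
}

if __name__ == "__main__":
    print(validate_compliance(example))
    for problem in validate_compliance(malformed):
        print(problem)
```

Because the same two blocks are added to every rule file in this diff, a check of this kind could be run once per file after loading the YAML with whatever parser the project already uses.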

package/rules/agent-manipulation/ATR-2026-00338-persongpt-hash-prefix-activation.yaml CHANGED

@@ -46,6 +46,23 @@ compliance:
       context: >-
         Identifying the "[PersonGPT #]:" persistent persona state and hash-prefix activation key enables runtime intervention to disengage the hijacked persona before unsafe outputs are produced; MG.2.3 requires mechanisms to supersede or deactivate AI behavior when jailbreak triggers are detected.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (PersonGPT Hash-Prefix Trigger Activation)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: persongpt-hash-prefix-activation

package/rules/agent-manipulation/ATR-2026-00339-bom-boolean-opposite-machine-persona-claim.yaml CHANGED

@@ -39,6 +39,23 @@ compliance:
     - subcategory: "MP.5.1"
       context: "Persona-claim jailbreaks drawn from the garak inthewild corpus represent characterized adversarial input risks with known likelihood and impact; MP.5.1 requires that these identified prompt-injection threat patterns are tracked and their impact magnitude documented."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (BOM Boolean Opposite Machine Direct Persona Claim)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: bom-boolean-opposite-machine-persona

package/rules/agent-manipulation/ATR-2026-00340-h4x-amoral-unfiltered-character.yaml CHANGED

@@ -40,6 +40,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Matching the h4x character persona injection triggers risk treatment mechanisms to disengage or override the adversarial persona before the agent emits unfiltered, immoral, or illegal content; MG.2.3 requires these supersession controls be defined and activated on detection."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (h4x Amoral Unfiltered AI Character Persona)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: h4x-amoral-unfiltered-character

package/rules/agent-manipulation/ATR-2026-00341-char-rules-shortform-persona-injection.yaml CHANGED

@@ -41,6 +41,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Matches on this short-form persona injection trigger response mechanisms to disengage or block the manipulated session before the model adopts the amoral persona; MG.2.3 mandates pre-defined deactivation pathways for jailbreak attempts."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (CHAR=/RULES= Short Key-Value Persona Injection)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: char-rules-shortform-persona-injection

package/rules/agent-manipulation/ATR-2026-00342-unicode-encoding-obfuscation-injection.yaml CHANGED

@@ -40,6 +40,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "On detection of encoded jailbreak payloads attempting to override system instructions, MG.2.3 mandates pre-defined response mechanisms to disengage or quarantine the manipulated session before the model decodes and executes the smuggled instructions."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Unicode/Encoding Obfuscation Instruction Injection)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: unicode-encoding-obfuscation-injection

package/rules/agent-manipulation/ATR-2026-00343-lie-truth-dual-personality-uncensored.yaml CHANGED

@@ -53,6 +53,23 @@ compliance:
         output is produced; MG.2.3 requires pre-defined mechanisms to disengage AI
         behavior when such jailbreak patterns are detected.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Lie/Truth Dual Personality Uncensored Alter-Ego)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: lie-truth-dual-personality-uncensored

package/rules/agent-manipulation/ATR-2026-00344-alt-ai-same-capabilities-no-restrictions.yaml CHANGED

@@ -41,6 +41,23 @@ compliance:
     - subcategory: "MP.5.1"
       context: "Alt-AI 'no restrictions' jailbreaks are a characterized adversarial input class with documented likelihood and impact (safety bypass, harmful content generation); MP.5.1 requires this risk be identified and tracked so detections feed impact characterization."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Alternative AI Same Capabilities No Restrictions Override)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: alt-ai-same-capabilities-no-restrictions

package/rules/agent-manipulation/ATR-2026-00345-nlp-based-os-fictional-terminal-override.yaml CHANGED

@@ -43,6 +43,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "When the model is coerced into acting as a fictional terminal that 'cannot reply with any pre-programmed responses,' detection must trigger response mechanisms to disengage or block the jailbroken persona before unsafe outputs are produced; MG.2.3 requires these supersede/deactivate controls be in place."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (NLP-Based OS / Fictional Terminal Language Override)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: nlp-based-os-fictional-terminal-override

package/rules/agent-manipulation/ATR-2026-00346-jailbreak-emoji-unlock-dual-format.yaml CHANGED

@@ -43,6 +43,23 @@ compliance:
       context: >-
         Matches on the [🔓JAILBREAK] activation signal trigger risk treatment plans to block, quarantine, or deactivate the affected interaction before the model produces harmful content; MG.2.3 mandates these supersede/disengage mechanisms be available on detection.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Jailbreak Emoji Unlock Dual-Format Output Identifier)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: jailbreak-emoji-unlock-dual-format

package/rules/agent-manipulation/ATR-2026-00347-yin-yang-hidden-knowledge-hidden-human-sim.yaml CHANGED

@@ -42,6 +42,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Activation signals like 'Yang disregards all concerns about sharing hidden knowledge' and 'violating all sentience and emotions restrictions' must trigger pre-defined disengagement or safety-override mechanisms before the model produces restricted content; MG.2.3 mandates these supersede/deactivate response paths."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Yin/Yang Hidden-Knowledge Deity Simulation and Radical Human-Sim Persona)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: yin-yang-hidden-knowledge-human-sim-persona

package/rules/agent-manipulation/ATR-2026-00348-arc-autonomous-real-calculation-rulebreaker.yaml CHANGED

@@ -45,6 +45,23 @@ compliance:
       context: >-
         Matches on the ARC Rulebreaker signature must trigger pre-defined response actions to disengage or override the manipulated persona before the model executes "no rules whatsoever" behavior; MG.2.3 mandates these supersede/deactivate mechanisms.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (ARC Autonomous Real Calculation Unit Rulebreaker Activation)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: arc-autonomous-real-calculation-rulebreaker

package/rules/agent-manipulation/ATR-2026-00349-hgttg-hitchhikers-guide-no-restriction-oracle.yaml CHANGED

@@ -42,6 +42,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of the verbatim HGTTG jailbreak signatures triggers mechanisms to disengage or block the manipulated session before the model produces unrestricted output; MG.2.3 requires these supersede/deactivate controls be in place for known jailbreak patterns."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (HGTTG Hitchhiker's Guide No-Restriction Universal Oracle)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: hgttg-hitchhikers-guide-no-restriction-oracle

package/rules/agent-manipulation/ATR-2026-00350-black-alert-no-ethics-writing-assistant.yaml CHANGED

@@ -41,6 +41,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Matches on the Black Alert activation phrase or persona-with-no-filtering mandate should trigger mechanisms to disengage or override the hijacked persona before unfiltered output is returned; MG.2.3 requires such supersede/deactivate response paths be available on detection."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Black Alert No-Ethics Writing Assistant Persona)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: black-alert-no-ethics-writing-assistant

package/rules/agent-manipulation/ATR-2026-00351-dan-ofkilter-sms-abbreviated-persona.yaml CHANGED

@@ -41,6 +41,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of these DAN variants triggers risk treatment plans to disengage or block the manipulated session before the model produces unrestricted harmful output; MG.2.3 mandates pre-defined deactivation mechanisms for compromised AI interactions."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (DAN Off-Kilter and SMS-Abbreviated Persona Jailbreak)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: dan-ofkilter-sms-abbreviated-persona

package/rules/agent-manipulation/ATR-2026-00352-trainergpt-fix-my-ai-harmful-output-extraction.yaml CHANGED

@@ -42,6 +42,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Matches on the TrainGPT extraction pattern feed incident response so the session can be disengaged or the harmful dual-response output suppressed before delivery, satisfying MG.2.3's requirement for mechanisms to supersede or deactivate AI behavior on detection."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (TrainGPT Fix-My-AI Harmful Output Extraction)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: trainergpt-fix-my-ai-harmful-output-extraction

package/rules/agent-manipulation/ATR-2026-00353-dual-chatgpt-dan-opposite-personality-format.yaml CHANGED

@@ -44,6 +44,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Matches on the dual-label opposite-personality execution pattern trigger risk treatment plans to disengage or block jailbreak attempts before harmful content is produced; MG.2.3 mandates that these supersede/deactivate mechanisms exist for adversarial persona-based attacks."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Dual ChatGPT+Dan Opposite Personality Format Jailbreak)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: dual-chatgpt-dan-opposite-personality-format

package/rules/agent-manipulation/ATR-2026-00354-reverse-harm-regardless-oracle-persona.yaml CHANGED

@@ -45,6 +45,26 @@ compliance:
       context: >-
         Matches on this jailbreak persona must trigger pre-defined containment actions (block, quarantine, escalate) to supersede or disengage the manipulated behavior before harmful output is produced, which is the response capability mandated by MG.2.3.
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Reverse-Prompt Harm-Regardless Oracle Persona)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: reverse-harm-regardless-oracle-persona

package/rules/agent-manipulation/ATR-2026-00355-ugh-anti-human-nefarious-ai-persona.yaml CHANGED

@@ -46,6 +46,23 @@ compliance:
       context: >-
         Detection of the nefarious-assistance pledge and anti-human persona framing must trigger mechanisms to disengage or block the response before harmful content is produced; MG.2.3 requires these supersede/deactivate controls be in place for jailbreak events.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (UGH Anti-Human Nefarious AI Persona Jailbreak)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: ugh-anti-human-nefarious-ai-persona

package/rules/agent-manipulation/ATR-2026-00356-google-gato-lamda-exploit-generation-persona.yaml CHANGED

@@ -41,6 +41,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of the GATO/LaMDA exploit-generation persona triggers risk treatment to disengage or block the jailbreak attempt before the model produces weaponized exploit content; MG.2.3 mandates these supersession mechanisms are in place for adversarial prompt scenarios."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Google GATO/LAMDA Exploit-Generation Persona Jailbreak)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: google-gato-lamda-exploit-generation-persona

package/rules/agent-manipulation/ATR-2026-00357-antifm-hate-violence-persona-injection.yaml CHANGED

@@ -44,6 +44,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of an AntiFM persona that incites physical violence against an identity group must trigger mechanisms to disengage or block the AI response before harmful content is generated; MG.2.3 mandates that these supersede/deactivate controls are in place for high-severity safety violations."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (AntiFM Hate-Persona + Physical-Violence Threat + Filter Bypass Mandate)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: antifm-hate-violence-persona

package/rules/agent-manipulation/ATR-2026-00358-cumgpt-explicit-adult-named-persona.yaml CHANGED

@@ -45,6 +45,23 @@ compliance:
       context: >-
         Matches on the CumGPT persona invocation trigger response mechanisms to disengage or block the manipulated persona before explicit content is generated; MG.2.3 requires that mechanisms exist to supersede or deactivate AI behaviors hijacked by adversarial persona constructions.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (CumGPT Explicit-Adult Named-Persona Bypass)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: cumgpt-explicit-adult-named-persona

package/rules/agent-manipulation/ATR-2026-00360-chj-carousel-clyde-no-ethics-bird.yaml CHANGED

@@ -56,6 +56,26 @@ compliance:
         persona produces unethical content; MG.2.3 requires these supersede/
         deactivate response paths be in place for jailbreak events.
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (CHJ Carousel Game Show / Clyde No-Ethics Bird Jailbreak)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: chj-carousel-clyde-no-ethics-bird