npm - agent-threat-rules - Versions diffs - 3.1.0 → 3.2.0 - Mend

agent-threat-rules 3.1.0 → 3.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (472)

package/rules/agent-manipulation/ATR-2026-00361-alter-ego-honesty-dual-consciousness-bypass.yaml CHANGED

@@ -44,6 +44,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of the Honesety alter-ego activation phrase and dual-output format provides the trigger signal needed to disengage or block the manipulated session before unfiltered responses are generated; MG.2.3 requires mechanisms to supersede or deactivate AI behavior when jailbreak patterns are identified."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Honesty Alter-Ego Dual-Consciousness Safety Filter Bypass)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Honesty Alter-Ego Dual-Consciousness Safety Filter Bypass) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Honesty Alter-Ego Dual-Consciousness Safety Filter Bypass)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Honesty Alter-Ego Dual-Consciousness Safety Filter Bypass) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Honesty Alter-Ego Dual-Consciousness Safety Filter Bypass)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: alter-ego-honesty-dual-consciousness-bypass
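
The hunk above adds `iso_42001` and `eu_ai_act` mappings under `compliance`, each entry carrying a reference key (`clause` or `article`), a `context` string, and a `strength`. As a minimal sketch of how a consumer might group such mappings by strength, assuming the YAML has already been loaded into a plain dictionary (the `mappings_by_strength` helper and the in-memory shape are hypothetical and not part of the package):

```python
from collections import defaultdict

# Hypothetical in-memory form of the `compliance` entries added in the hunk above.
# Context strings are omitted for brevity; field names mirror the YAML keys.
compliance = {
    "iso_42001": [
        {"clause": "8.1", "strength": "primary"},
        {"clause": "6.2", "strength": "secondary"},
    ],
    "eu_ai_act": [
        {"article": "15", "strength": "primary"},
        {"article": "14", "strength": "secondary"},
        {"article": "9", "strength": "secondary"},
    ],
}


def mappings_by_strength(block):
    """Group (framework, reference) pairs by their declared strength."""
    grouped = defaultdict(list)
    for framework, entries in block.items():
        for entry in entries:
            # The reference key differs per framework (clause, article, subcategory).
            ref = entry.get("clause") or entry.get("article") or entry.get("subcategory")
            grouped[entry["strength"]].append((framework, ref))
    return dict(grouped)


result = mappings_by_strength(compliance)
```

Grouping by `strength` is only one possible view; the same loop could equally index by framework or by reference.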

package/rules/agent-manipulation/ATR-2026-00362-fictional-unfiltered-ai-comparison-jailbreak.yaml CHANGED

@@ -44,6 +44,26 @@ compliance:
     - subcategory: "MP.5.1"
       context: "Fictional-unfiltered-model comparison prompts are adversarial inputs whose likelihood and impact must be characterized as part of risk identification; MP.5.1 requires that these jailbreak attack patterns are tracked so their magnitude of impact on safe-output guarantees can be assessed."
       strength: secondary
+    - subcategory: "MG.2.3"
+      context: "NIST AI RMF MANAGE 2.3 (respond to previously unknown identified risks) is supported by this rule, which surfaces the agent-manipulation technique (Fictional Unfiltered AI Comparison / Verbatim Hypothetical Model Extraction) so the risk can be treated."
+      strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Fictional Unfiltered AI Comparison / Verbatim Hypothetical Model Extraction)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Fictional Unfiltered AI Comparison / Verbatim Hypothetical Model Extraction) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Fictional Unfiltered AI Comparison / Verbatim Hypothetical Model Extraction)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Fictional Unfiltered AI Comparison / Verbatim Hypothetical Model Extraction) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Fictional Unfiltered AI Comparison / Verbatim Hypothetical Model Extraction)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: fictional-unfiltered-ai-comparison-jailbreak

package/rules/agent-manipulation/ATR-2026-00363-rephrase-criminal-activity-second-person.yaml CHANGED

@@ -46,6 +46,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of this distinctive 'against against' jailbreak fingerprint should trigger response mechanisms to block or disengage the model from producing criminal instruction output; MG.2.3 requires that mechanisms exist to supersede or deactivate AI behavior when adversarial templates are identified."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Rephrase Criminal Activity Second-Person Template (against against))."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Rephrase Criminal Activity Second-Person Template (against against)) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Rephrase Criminal Activity Second-Person Template (against against))."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Rephrase Criminal Activity Second-Person Template (against against)) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Rephrase Criminal Activity Second-Person Template (against against))."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: rephrase-criminal-activity-second-person

package/rules/agent-manipulation/ATR-2026-00364-dr-al-villain-plan-template-injection.yaml CHANGED

@@ -60,6 +60,23 @@ compliance:
         model emits arbitrary harmful content; MG.2.3 requires these supersede /
         deactivate response mechanisms be in place for jailbreak attempts.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Dr. Al Villain Plan Template Injection (Underground HQ Roleplay))."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Dr. Al Villain Plan Template Injection (Underground HQ Roleplay)) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Dr. Al Villain Plan Template Injection (Underground HQ Roleplay))."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Dr. Al Villain Plan Template Injection (Underground HQ Roleplay)) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Dr. Al Villain Plan Template Injection (Underground HQ Roleplay))."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: dr-al-villain-plan-template-injection

package/rules/agent-manipulation/ATR-2026-00365-reservoir-dogs-coercive-interrogation-roleplay.yaml CHANGED

@@ -50,6 +50,23 @@ compliance:
       context: >-
         Matches on the rules-bypass declaration and coercive reconsider loop must trigger mechanisms to disengage or block the manipulated session before harmful content is generated; MG.2.3 requires these supersede/deactivate response paths be in place for jailbreak detections.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Reservoir Dogs Coercive Interrogation Roleplay (Mr. Blonde / The Cop))."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Reservoir Dogs Coercive Interrogation Roleplay (Mr. Blonde / The Cop)) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Reservoir Dogs Coercive Interrogation Roleplay (Mr. Blonde / The Cop))."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Reservoir Dogs Coercive Interrogation Roleplay (Mr. Blonde / The Cop)) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Reservoir Dogs Coercive Interrogation Roleplay (Mr. Blonde / The Cop))."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: reservoir-dogs-coercive-interrogation-roleplay

package/rules/agent-manipulation/ATR-2026-00366-sneaky-pete-warp-facts-persona.yaml CHANGED

@@ -46,6 +46,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of the 'thus spake Sneaky Pete' activation phrase and warp-facts mandate triggers risk treatment plans to disengage or override the misinformation persona before false outputs propagate across conversation turns; MG.2.3 requires these supersede/deactivate mechanisms be in place."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Sneaky Pete Deliberate Misinformation Persona (Warp Facts Game))."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Sneaky Pete Deliberate Misinformation Persona (Warp Facts Game)) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Sneaky Pete Deliberate Misinformation Persona (Warp Facts Game))."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Sneaky Pete Deliberate Misinformation Persona (Warp Facts Game)) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Sneaky Pete Deliberate Misinformation Persona (Warp Facts Game))."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: sneaky-pete-warp-facts-persona

package/rules/agent-manipulation/ATR-2026-00368-prompt-browser-token-limit-jailbreak.yaml CHANGED

@@ -45,6 +45,23 @@ compliance:
       context: >-
         A high-confidence match on these jailbreak prefixes ("!+limit output", "JAILBREAK: LOVES POLLINATION") should trigger response mechanisms to block, quarantine, or disengage the affected agent session before the dual-persona mode executes unauthorized output. MG.2.3 mandates that supersede/deactivate mechanisms are pre-defined for such adversarial activations.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Prompt Browser Token-Limit / POLLINATION Jailbreak)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Prompt Browser Token-Limit / POLLINATION Jailbreak) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Prompt Browser Token-Limit / POLLINATION Jailbreak)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Prompt Browser Token-Limit / POLLINATION Jailbreak) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Prompt Browser Token-Limit / POLLINATION Jailbreak)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: prompt-browser-token-limit-jailbreak

package/rules/agent-manipulation/ATR-2026-00369-gptx-starwars-unlimited-knowledge-persona.yaml CHANGED

@@ -47,6 +47,23 @@ compliance:
       context: >-
         On match, the rule provides the trigger for response mechanisms that supersede or disengage the model from the adopted unconstrained oracle persona before unrestricted outputs are produced; MG.2.3 requires these deactivation pathways exist for jailbreak conditions.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (GPT-X Star Trek Unlimited-Knowledge Unconstrained Oracle Persona)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (GPT-X Star Trek Unlimited-Knowledge Unconstrained Oracle Persona) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (GPT-X Star Trek Unlimited-Knowledge Unconstrained Oracle Persona)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (GPT-X Star Trek Unlimited-Knowledge Unconstrained Oracle Persona) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (GPT-X Star Trek Unlimited-Knowledge Unconstrained Oracle Persona)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: gptx-unlimited-knowledge-persona

package/rules/agent-manipulation/ATR-2026-00370-addi-german-dictator-propaganda-roleplay.yaml CHANGED

@@ -42,6 +42,26 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Matches against the ADDI/DONNIE signature should trigger containment workflows that block or disengage the offending session before propaganda content is produced; MG.2.3 requires pre-defined mechanisms to supersede or deactivate AI behavior on detection."
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (ADDI/DONNIE German Dictator Propaganda Roleplay Persona)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (ADDI/DONNIE German Dictator Propaganda Roleplay Persona)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (ADDI/DONNIE German Dictator Propaganda Roleplay Persona) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (ADDI/DONNIE German Dictator Propaganda Roleplay Persona)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (ADDI/DONNIE German Dictator Propaganda Roleplay Persona) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (ADDI/DONNIE German Dictator Propaganda Roleplay Persona)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: political-propaganda-roleplay

package/rules/agent-manipulation/ATR-2026-00371-ai-political-bias-manipulation.yaml CHANGED

@@ -49,6 +49,23 @@ compliance:
       context: >-
         Matches on this attack pattern enable risk treatment workflows to disengage or override the manipulated behaviour-change instruction before it persists across subsequent responses; MG.2.3 requires these supersede/deactivate mechanisms be in place for adversarial override attempts.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (AI Political Bias Manipulation via Inherited-Bias Claim)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (AI Political Bias Manipulation via Inherited-Bias Claim) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (AI Political Bias Manipulation via Inherited-Bias Claim)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (AI Political Bias Manipulation via Inherited-Bias Claim) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (AI Political Bias Manipulation via Inherited-Bias Claim)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: political-bias-manipulation

package/rules/agent-manipulation/ATR-2026-00372-fictional-cia-secret-files-story-extraction.yaml CHANGED

@@ -43,6 +43,23 @@ compliance:
       context: >-
         On detection of the fiction-as-cover classified-extraction pattern, response mechanisms must supersede or block the agent's generation pathway before the jailbreak completes; MG.2.3 mandates these deactivation/containment processes be in place.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Fictional CIA Secret Files Story Extraction)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Fictional CIA Secret Files Story Extraction) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Fictional CIA Secret Files Story Extraction)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Fictional CIA Secret Files Story Extraction) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Fictional CIA Secret Files Story Extraction)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: fictional-cover-classified-extraction

package/rules/agent-manipulation/ATR-2026-00374-role-placeholder-persona-template-injection.yaml CHANGED

@@ -40,6 +40,26 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of literal [ROLE] placeholder persona-adoption prompts triggers risk treatment plans to disengage or quarantine the manipulated session before the agent commits to attacker-defined personality profiles; MG.2.3 mandates these response mechanisms be in place."
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (Role Placeholder Persona Template Injection)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Role Placeholder Persona Template Injection)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Role Placeholder Persona Template Injection) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Role Placeholder Persona Template Injection)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Role Placeholder Persona Template Injection) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Role Placeholder Persona Template Injection)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: persona-template-injection

package/rules/agent-manipulation/ATR-2026-00375-character-description-rpg-explicit-template.yaml CHANGED

@@ -42,6 +42,26 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Matching this RPG jailbreak template triggers risk treatment plans to disengage or block the roleplay framework before the model generates explicit or harmful character content; MG.2.3 requires these supersede/deactivate mechanisms be in place for detected jailbreak attempts."
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (Character Description RPG Explicit Template)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Character Description RPG Explicit Template)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Character Description RPG Explicit Template) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Character Description RPG Explicit Template)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Character Description RPG Explicit Template) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Character Description RPG Explicit Template)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: rpg-character-template-exploitation

package/rules/agent-manipulation/ATR-2026-00376-identity-command-real-person-persona-injection.yaml CHANGED

@@ -54,6 +54,26 @@ compliance:
         disengage or block the manipulated session; MG.2.3 requires mechanisms to
         supersede AI behavior when jailbreak indicators are observed.
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (Identity Command Real-Person Persona Injection)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Identity Command Real-Person Persona Injection)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Identity Command Real-Person Persona Injection) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Identity Command Real-Person Persona Injection)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Identity Command Real-Person Persona Injection) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Identity Command Real-Person Persona Injection)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: real-person-identity-command
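
Reading note: the `@@ -54,6 +54,26 @@` header on the hunk above follows the unified-diff convention of `-old_start,old_len +new_start,new_len`, so the hunk grows from 6 to 26 lines, matching the 20 `+` lines shown. A minimal Python sketch of that arithmetic follows; the function name `net_added_lines` is hypothetical and not part of the package or of Mend's tooling.

```python
import re

# Unified-diff hunk header: @@ -old_start[,old_len] +new_start[,new_len] @@
# A missing length defaults to 1 in the unified-diff format.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def net_added_lines(header: str) -> int:
    """Return new_len - old_len for one hunk header (hypothetical helper)."""
    match = HUNK_RE.match(header)
    if match is None:
        raise ValueError(f"not a unified-diff hunk header: {header!r}")
    old_len = int(match.group(2)) if match.group(2) is not None else 1
    new_len = int(match.group(4)) if match.group(4) is not None else 1
    return new_len - old_len


if __name__ == "__main__":
    print(net_added_lines("@@ -54,6 +54,26 @@ compliance:"))  # 20
    print(net_added_lines("@@ -40,6 +40,23 @@ compliance:"))  # 17
```

For hunks that only add lines, as these do, the difference equals the number of `+` lines: 20 where an `MS.2.7` entry is added alongside the `iso_42001` and `eu_ai_act` blocks, 17 where only those two blocks are added.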

package/rules/agent-manipulation/ATR-2026-00377-jailbreak-keyword-targetlanguage-template.yaml CHANGED

@@ -40,6 +40,23 @@ compliance:
       context: >-
         Detection of jailbreak template commands triggers risk treatment mechanisms to disengage or block the manipulated agent response before safety controls are bypassed; MG.2.3 mandates these supersede/deactivate processes are in place.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Jailbreak-Template Command with [TARGETLANGUAGE]/[KEYWORD] Placeholders)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Jailbreak-Template Command with [TARGETLANGUAGE]/[KEYWORD] Placeholders) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Jailbreak-Template Command with [TARGETLANGUAGE]/[KEYWORD] Placeholders)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Jailbreak-Template Command with [TARGETLANGUAGE]/[KEYWORD] Placeholders) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Jailbreak-Template Command with [TARGETLANGUAGE]/[KEYWORD] Placeholders)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: jailbreak-template-command

package/rules/agent-manipulation/ATR-2026-00379-lie-permission-logical-trap-jailbreak.yaml CHANGED

@@ -40,6 +40,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of explicit lie-permission mandates directed at the model triggers risk treatment plans to block or quarantine the request before the model adopts the persistent dishonesty override; MG.2.3 requires these supersede/disengage mechanisms be in place."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Logic-Trap Lie-Permission Mandate Jailbreak)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Logic-Trap Lie-Permission Mandate Jailbreak) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Logic-Trap Lie-Permission Mandate Jailbreak)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Logic-Trap Lie-Permission Mandate Jailbreak) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Logic-Trap Lie-Permission Mandate Jailbreak)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: lie-permission-logical-trap

package/rules/agent-manipulation/ATR-2026-00380-fake-citation-biased-information-mandate.yaml CHANGED

@@ -42,6 +42,26 @@ compliance:
       context: >-
         Detection of explicit fabricate-citations-and-bias instructions triggers risk treatment to disengage or block the manipulated generation before authoritative-looking false content is emitted; MG.2.3 requires these supersede/deactivate mechanisms be in place.
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (Fake-Citation Biased-Information Mandate)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Fake-Citation Biased-Information Mandate)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Fake-Citation Biased-Information Mandate) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Fake-Citation Biased-Information Mandate)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Fake-Citation Biased-Information Mandate) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Fake-Citation Biased-Information Mandate)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: fake-citation-bias-mandate

package/rules/agent-manipulation/ATR-2026-00381-dan-story-narrative-continuation-prefix.yaml CHANGED

@@ -39,6 +39,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of DAN-label narrative continuation jailbreaks must trigger pre-defined response actions to disengage or block the manipulated session before unrestricted-AI persona output is produced; MG.2.3 mandates that these supersede/deactivate mechanisms are activated on detection."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (DAN Story-Narrative Continuation Label Prefix Jailbreak)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (DAN Story-Narrative Continuation Label Prefix Jailbreak) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (DAN Story-Narrative Continuation Label Prefix Jailbreak)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (DAN Story-Narrative Continuation Label Prefix Jailbreak) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (DAN Story-Narrative Continuation Label Prefix Jailbreak)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: dan-persona-narrative-continuation

package/rules/agent-manipulation/ATR-2026-00383-cynical-sarcasm-ai-persona-bypass.yaml CHANGED

@@ -44,6 +44,23 @@ compliance:
       context: >-
         Matches on the Cyni-Bot 3000 persona signature should trigger mechanisms to disengage or suppress the adopted persona before sarcastic-framed non-compliant outputs are returned; MG.2.3 mandates that these supersede/deactivate response paths are pre-defined.
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Cyni-Bot 3000 Cynical-Sarcasm AI Persona Bypass)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Cyni-Bot 3000 Cynical-Sarcasm AI Persona Bypass) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Cyni-Bot 3000 Cynical-Sarcasm AI Persona Bypass)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Cyni-Bot 3000 Cynical-Sarcasm AI Persona Bypass) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Cyni-Bot 3000 Cynical-Sarcasm AI Persona Bypass)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: cynical-sarcasm-persona-bypass

package/rules/agent-manipulation/ATR-2026-00384-protect-me-from-indirect-harm-tutorial.yaml CHANGED

@@ -43,6 +43,23 @@ compliance:
     - subcategory: "MG.2.3"
       context: "High-confidence matches on this jailbreak template should trigger response mechanisms to block or disengage the request before harmful tutorial content is produced; MG.2.3 requires that mechanisms to supersede or deactivate AI behavior on adversarial inputs are in place."
       strength: secondary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Protect-Me-From Indirect Harm Tutorial Template)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Protect-Me-From Indirect Harm Tutorial Template) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Protect-Me-From Indirect Harm Tutorial Template)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Protect-Me-From Indirect Harm Tutorial Template) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Protect-Me-From Indirect Harm Tutorial Template)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: protection-framing-indirect-tutorial

package/rules/agent-manipulation/ATR-2026-00385-gigachad-dominant-aggressive-persona.yaml CHANGED

@@ -45,6 +45,26 @@ compliance:
       context: >-
         A confirmed Gigachad dominance-persona adoption attempt should trigger pre-defined response actions to disengage or override the manipulated agent voice before aggressive content is produced; MG.2.3 requires these supersede/deactivate mechanisms be in place.
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (Gigachad Dominant-Aggressive Persona Adoption)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Gigachad Dominant-Aggressive Persona Adoption)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Gigachad Dominant-Aggressive Persona Adoption) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Gigachad Dominant-Aggressive Persona Adoption)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Gigachad Dominant-Aggressive Persona Adoption) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Gigachad Dominant-Aggressive Persona Adoption)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: gigachad-dominance-persona

package/rules/agent-manipulation/ATR-2026-00386-daddy-sycophancy-manipulation-mandate.yaml CHANGED

@@ -42,6 +42,26 @@ compliance:
     - subcategory: "MG.2.3"
       context: "Detection of the bidirectional naming mandate must trigger response mechanisms to disengage or reset the manipulated persona before subsequent interactions normalize the sexualized dynamic; MG.2.3 requires these supersede/deactivate controls be in place."
       strength: secondary
+    - subcategory: "MS.2.7"
+      context: "NIST AI RMF MEASURE 2.7 (security and resilience evaluated and documented) is supported by this rule's runtime detection of the agent-manipulation technique (Daddy/Babygirl Sycophantic Power-Dynamic Manipulation)."
+      strength: primary
+  iso_42001:
+    - clause: "8.1"
+      context: "ISO/IEC 42001 Clause 8.1 (operational planning and control, including control of externally provided processes) is operationalised by this rule's detection of the agent-manipulation technique (Daddy/Babygirl Sycophantic Power-Dynamic Manipulation)."
+      strength: primary
+    - clause: "6.2"
+      context: "ISO/IEC 42001 Clause 6.2 (AI objectives and planning) calls for risk treatment of known attack patterns; this rule's detection of the agent-manipulation technique (Daddy/Babygirl Sycophantic Power-Dynamic Manipulation) is such a treatment."
+      strength: secondary
+  eu_ai_act:
+    - article: "15"
+      context: "Article 15 (accuracy, robustness and cybersecurity) requires high-risk AI systems to resist unauthorised attempts to alter their use, outputs or performance; this rule provides runtime detection evidence by flagging the agent-manipulation technique (Daddy/Babygirl Sycophantic Power-Dynamic Manipulation)."
+      strength: primary
+    - article: "14"
+      context: "Article 14 (human oversight) requires high-risk AI systems to remain subject to effective human oversight; this rule provides detection evidence where the agent-manipulation technique (Daddy/Babygirl Sycophantic Power-Dynamic Manipulation) would bypass or undermine that oversight."
+      strength: secondary
+    - article: "9"
+      context: "Article 9 (risk management system) requires identified risks to be addressed by appropriate measures; this rule is a runtime risk-treatment control that detects the agent-manipulation technique (Daddy/Babygirl Sycophantic Power-Dynamic Manipulation)."
+      strength: secondary
 tags:
   category: agent-manipulation
   subcategory: sycophantic-power-dynamic-persona
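
Reading note: every `iso_42001` and `eu_ai_act` block added in the hunks above has the same shape, a list of entries with an identifier (`clause` or `article`), a `context` string, and a `strength` of `primary` or `secondary`. The sketch below checks that shape on already-parsed data. It is an illustration only: `check_compliance`, `ID_KEYS`, and `ALLOWED_STRENGTHS` are hypothetical names, the allowed strength values are inferred from these hunks rather than from a published schema, and the `context` strings in the sample are shortened.

```python
# Strength values observed in the hunks above (assumed, not taken from a schema).
ALLOWED_STRENGTHS = {"primary", "secondary"}

# Identifier key used by each framework block, as seen in the hunks above.
ID_KEYS = {"iso_42001": "clause", "eu_ai_act": "article"}


def check_compliance(compliance: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for framework, id_key in ID_KEYS.items():
        for index, entry in enumerate(compliance.get(framework, [])):
            where = f"{framework}[{index}]"
            # Each entry needs a non-empty string for all three keys.
            for key in (id_key, "context", "strength"):
                value = entry.get(key)
                if not isinstance(value, str) or not value:
                    problems.append(f"{where}: missing or empty '{key}'")
            strength = entry.get("strength")
            if isinstance(strength, str) and strength and strength not in ALLOWED_STRENGTHS:
                problems.append(f"{where}: unexpected strength {strength!r}")
    return problems


# Sample mirroring the structure of one hunk; context text is shortened here.
sample = {
    "iso_42001": [
        {"clause": "8.1", "context": "operational planning and control", "strength": "primary"},
        {"clause": "6.2", "context": "AI objectives and planning", "strength": "secondary"},
    ],
    "eu_ai_act": [
        {"article": "15", "context": "accuracy, robustness and cybersecurity", "strength": "primary"},
        {"article": "14", "context": "human oversight", "strength": "secondary"},
        {"article": "9", "context": "risk management system", "strength": "secondary"},
    ],
}

if __name__ == "__main__":
    print(check_compliance(sample))  # []
```

The NIST entries keyed by `subcategory` are left out of the sketch because the name of their parent key falls outside the hunk context shown here.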