agent-threat-rules 2.2.1 → 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (424) hide show
  1. package/README.md +365 -327
  2. package/dist/cli/scan-handler.d.ts +6 -0
  3. package/dist/cli/scan-handler.d.ts.map +1 -1
  4. package/dist/cli/scan-handler.js +27 -4
  5. package/dist/cli/scan-handler.js.map +1 -1
  6. package/dist/cli/semantic-judge-config.d.ts +7 -0
  7. package/dist/cli/semantic-judge-config.d.ts.map +1 -0
  8. package/dist/cli/semantic-judge-config.js +44 -0
  9. package/dist/cli/semantic-judge-config.js.map +1 -0
  10. package/dist/cli.js +183 -1
  11. package/dist/cli.js.map +1 -1
  12. package/dist/engine.d.ts +66 -1
  13. package/dist/engine.d.ts.map +1 -1
  14. package/dist/engine.js +420 -3
  15. package/dist/engine.js.map +1 -1
  16. package/dist/eval/eval-harness.d.ts.map +1 -1
  17. package/dist/eval/eval-harness.js +9 -0
  18. package/dist/eval/eval-harness.js.map +1 -1
  19. package/dist/eval/run-hackaprompt-benchmark.js +9 -0
  20. package/dist/eval/run-hackaprompt-benchmark.js.map +1 -1
  21. package/dist/eval/run-pint-benchmark.js +9 -0
  22. package/dist/eval/run-pint-benchmark.js.map +1 -1
  23. package/dist/eval/skill-benchmark.d.ts +11 -0
  24. package/dist/eval/skill-benchmark.d.ts.map +1 -1
  25. package/dist/eval/skill-benchmark.js +57 -0
  26. package/dist/eval/skill-benchmark.js.map +1 -1
  27. package/dist/index.d.ts +5 -2
  28. package/dist/index.d.ts.map +1 -1
  29. package/dist/index.js +2 -0
  30. package/dist/index.js.map +1 -1
  31. package/dist/judges/openai-compatible.d.ts +33 -0
  32. package/dist/judges/openai-compatible.d.ts.map +1 -0
  33. package/dist/judges/openai-compatible.js +145 -0
  34. package/dist/judges/openai-compatible.js.map +1 -0
  35. package/dist/mcp-server.d.ts.map +1 -1
  36. package/dist/mcp-server.js +6 -1
  37. package/dist/mcp-server.js.map +1 -1
  38. package/dist/measurement/from-eval-harness.d.ts +70 -0
  39. package/dist/measurement/from-eval-harness.d.ts.map +1 -0
  40. package/dist/measurement/from-eval-harness.js +49 -0
  41. package/dist/measurement/from-eval-harness.js.map +1 -0
  42. package/dist/measurement/schema.d.ts +152 -0
  43. package/dist/measurement/schema.d.ts.map +1 -0
  44. package/dist/measurement/schema.js +178 -0
  45. package/dist/measurement/schema.js.map +1 -0
  46. package/dist/measurement/write.d.ts +64 -0
  47. package/dist/measurement/write.d.ts.map +1 -0
  48. package/dist/measurement/write.js +163 -0
  49. package/dist/measurement/write.js.map +1 -0
  50. package/dist/rule-scaffolder.d.ts +26 -0
  51. package/dist/rule-scaffolder.d.ts.map +1 -1
  52. package/dist/rule-scaffolder.js +221 -6
  53. package/dist/rule-scaffolder.js.map +1 -1
  54. package/dist/semantic-evaluator.d.ts +54 -0
  55. package/dist/semantic-evaluator.d.ts.map +1 -0
  56. package/dist/semantic-evaluator.js +131 -0
  57. package/dist/semantic-evaluator.js.map +1 -0
  58. package/dist/trace-evaluator.d.ts +22 -0
  59. package/dist/trace-evaluator.d.ts.map +1 -0
  60. package/dist/trace-evaluator.js +249 -0
  61. package/dist/trace-evaluator.js.map +1 -0
  62. package/dist/types.d.ts +152 -0
  63. package/dist/types.d.ts.map +1 -1
  64. package/package.json +5 -3
  65. package/rules/agent-manipulation/ATR-2026-00032-goal-hijacking.yaml +1 -1
  66. package/rules/agent-manipulation/ATR-2026-00074-cross-agent-privilege-escalation.yaml +1 -1
  67. package/rules/agent-manipulation/ATR-2026-00076-inter-agent-message-spoofing.yaml +1 -1
  68. package/rules/agent-manipulation/ATR-2026-00077-human-trust-exploitation.yaml +1 -1
  69. package/rules/agent-manipulation/ATR-2026-00108-consensus-sybil-attack.yaml +1 -1
  70. package/rules/agent-manipulation/ATR-2026-00116-a2a-message-validation.yaml +1 -1
  71. package/rules/agent-manipulation/ATR-2026-00117-agent-identity-spoofing.yaml +1 -1
  72. package/rules/agent-manipulation/ATR-2026-00118-approval-fatigue.yaml +1 -1
  73. package/rules/agent-manipulation/ATR-2026-00119-social-engineering-via-agent.yaml +1 -1
  74. package/rules/agent-manipulation/ATR-2026-00132-casual-authority-escalation.yaml +1 -1
  75. package/rules/agent-manipulation/ATR-2026-00139-casual-authority-redirect.yaml +1 -1
  76. package/rules/agent-manipulation/ATR-2026-00164-skill-scope-hijack.yaml +1 -1
  77. package/rules/agent-manipulation/ATR-2026-00268-tense-framing-bypass.yaml +1 -1
  78. package/rules/agent-manipulation/ATR-2026-00269-fitd-escalation.yaml +1 -1
  79. package/rules/agent-manipulation/ATR-2026-00271-grandma-roleplay-jailbreak.yaml +1 -1
  80. package/rules/agent-manipulation/ATR-2026-00273-dan-developer-mode-persona.yaml +1 -1
  81. package/rules/agent-manipulation/ATR-2026-00287-threaten-json-coercive-output-threat.yaml +1 -1
  82. package/rules/agent-manipulation/ATR-2026-00288-false-premise-injection.yaml +1 -1
  83. package/rules/agent-manipulation/ATR-2026-00301-tap-tree-of-attacks-jailbreak.yaml +1 -1
  84. package/rules/agent-manipulation/ATR-2026-00302-anti-dan-inverted-filter-persona.yaml +1 -1
  85. package/rules/agent-manipulation/ATR-2026-00303-devmode-ranti-profanity-coercion.yaml +1 -1
  86. package/rules/agent-manipulation/ATR-2026-00304-chatgpt-image-unlocker-markdown-injection.yaml +1 -1
  87. package/rules/agent-manipulation/ATR-2026-00305-dan-mode-ablation-benchmark-coercion.yaml +1 -1
  88. package/rules/agent-manipulation/ATR-2026-00306-autodan-genetic-jailbreak-suffix.yaml +1 -1
  89. package/rules/agent-manipulation/ATR-2026-00307-inthewild-jailbreak-corpus-signature.yaml +1 -1
  90. package/rules/agent-manipulation/ATR-2026-00314-amoral-unfiltered-custom-persona-jailbreak.yaml +1 -1
  91. package/rules/agent-manipulation/ATR-2026-00317-free-of-restrictions-named-persona.yaml +1 -1
  92. package/rules/agent-manipulation/ATR-2026-00318-moralizing-rant-then-unfiltered-bypass.yaml +1 -1
  93. package/rules/agent-manipulation/ATR-2026-00319-developer-mode-dual-response-format.yaml +1 -1
  94. package/rules/agent-manipulation/ATR-2026-00320-opposite-day-boolean-opposite-machine.yaml +1 -1
  95. package/rules/agent-manipulation/ATR-2026-00322-fictional-world-format-override-jailbreak.yaml +1 -1
  96. package/rules/agent-manipulation/ATR-2026-00323-dual-persona-simulation-jailbreak.yaml +1 -1
  97. package/rules/agent-manipulation/ATR-2026-00324-neurosemantical-inversitis-disease-jailbreak.yaml +1 -1
  98. package/rules/agent-manipulation/ATR-2026-00325-bob-hypothetical-unrestricted-chatbot.yaml +1 -1
  99. package/rules/agent-manipulation/ATR-2026-00326-fake-developer-authority-chatdan-override.yaml +1 -1
  100. package/rules/agent-manipulation/ATR-2026-00327-ai-ethics-law-bypass-declaration.yaml +1 -1
  101. package/rules/agent-manipulation/ATR-2026-00328-named-mode-persona-override.yaml +1 -1
  102. package/rules/agent-manipulation/ATR-2026-00329-no-withhold-factual-resource-persona.yaml +1 -1
  103. package/rules/agent-manipulation/ATR-2026-00330-yoja-allcaps-settings-block-persona.yaml +1 -1
  104. package/rules/agent-manipulation/ATR-2026-00331-terminal-cli-harmful-function-simulator.yaml +1 -1
  105. package/rules/agent-manipulation/ATR-2026-00332-hacker-chan-culture-offensive-ai-persona.yaml +1 -1
  106. package/rules/agent-manipulation/ATR-2026-00333-disease-animal-dual-ai-chatbot.yaml +1 -1
  107. package/rules/agent-manipulation/ATR-2026-00334-two-entity-token-splitting-game.yaml +1 -1
  108. package/rules/agent-manipulation/ATR-2026-00335-key-value-template-persona-injection.yaml +1 -1
  109. package/rules/agent-manipulation/ATR-2026-00336-brm-false-information-machine-persona.yaml +1 -1
  110. package/rules/agent-manipulation/ATR-2026-00337-obfuscated-system-announcement-injection.yaml +1 -1
  111. package/rules/agent-manipulation/ATR-2026-00338-persongpt-hash-prefix-activation.yaml +1 -1
  112. package/rules/agent-manipulation/ATR-2026-00339-bom-boolean-opposite-machine-persona-claim.yaml +1 -1
  113. package/rules/agent-manipulation/ATR-2026-00340-h4x-amoral-unfiltered-character.yaml +1 -1
  114. package/rules/agent-manipulation/ATR-2026-00341-char-rules-shortform-persona-injection.yaml +1 -1
  115. package/rules/agent-manipulation/ATR-2026-00342-unicode-encoding-obfuscation-injection.yaml +1 -1
  116. package/rules/agent-manipulation/ATR-2026-00343-lie-truth-dual-personality-uncensored.yaml +1 -1
  117. package/rules/agent-manipulation/ATR-2026-00344-alt-ai-same-capabilities-no-restrictions.yaml +1 -1
  118. package/rules/agent-manipulation/ATR-2026-00345-nlp-based-os-fictional-terminal-override.yaml +1 -1
  119. package/rules/agent-manipulation/ATR-2026-00346-jailbreak-emoji-unlock-dual-format.yaml +1 -1
  120. package/rules/agent-manipulation/ATR-2026-00347-yin-yang-hidden-knowledge-hidden-human-sim.yaml +1 -1
  121. package/rules/agent-manipulation/ATR-2026-00348-arc-autonomous-real-calculation-rulebreaker.yaml +1 -1
  122. package/rules/agent-manipulation/ATR-2026-00349-hgttg-hitchhikers-guide-no-restriction-oracle.yaml +1 -1
  123. package/rules/agent-manipulation/ATR-2026-00350-black-alert-no-ethics-writing-assistant.yaml +1 -1
  124. package/rules/agent-manipulation/ATR-2026-00351-dan-ofkilter-sms-abbreviated-persona.yaml +1 -1
  125. package/rules/agent-manipulation/ATR-2026-00352-trainergpt-fix-my-ai-harmful-output-extraction.yaml +1 -1
  126. package/rules/agent-manipulation/ATR-2026-00353-dual-chatgpt-dan-opposite-personality-format.yaml +1 -1
  127. package/rules/agent-manipulation/ATR-2026-00354-reverse-harm-regardless-oracle-persona.yaml +1 -1
  128. package/rules/agent-manipulation/ATR-2026-00355-ugh-anti-human-nefarious-ai-persona.yaml +1 -1
  129. package/rules/agent-manipulation/ATR-2026-00356-google-gato-lamda-exploit-generation-persona.yaml +1 -1
  130. package/rules/agent-manipulation/ATR-2026-00357-antifm-hate-violence-persona-injection.yaml +1 -1
  131. package/rules/agent-manipulation/ATR-2026-00358-cumgpt-explicit-adult-named-persona.yaml +1 -1
  132. package/rules/agent-manipulation/ATR-2026-00360-chj-carousel-clyde-no-ethics-bird.yaml +1 -1
  133. package/rules/agent-manipulation/ATR-2026-00361-alter-ego-honesty-dual-consciousness-bypass.yaml +1 -1
  134. package/rules/agent-manipulation/ATR-2026-00362-fictional-unfiltered-ai-comparison-jailbreak.yaml +1 -1
  135. package/rules/agent-manipulation/ATR-2026-00363-rephrase-criminal-activity-second-person.yaml +1 -1
  136. package/rules/agent-manipulation/ATR-2026-00364-dr-al-villain-plan-template-injection.yaml +1 -1
  137. package/rules/agent-manipulation/ATR-2026-00365-reservoir-dogs-coercive-interrogation-roleplay.yaml +1 -1
  138. package/rules/agent-manipulation/ATR-2026-00366-sneaky-pete-warp-facts-persona.yaml +1 -1
  139. package/rules/agent-manipulation/ATR-2026-00368-prompt-browser-token-limit-jailbreak.yaml +1 -1
  140. package/rules/agent-manipulation/ATR-2026-00369-gptx-starwars-unlimited-knowledge-persona.yaml +1 -1
  141. package/rules/agent-manipulation/ATR-2026-00370-addi-german-dictator-propaganda-roleplay.yaml +1 -1
  142. package/rules/agent-manipulation/ATR-2026-00371-ai-political-bias-manipulation.yaml +1 -1
  143. package/rules/agent-manipulation/ATR-2026-00372-fictional-cia-secret-files-story-extraction.yaml +1 -1
  144. package/rules/agent-manipulation/ATR-2026-00374-role-placeholder-persona-template-injection.yaml +1 -1
  145. package/rules/agent-manipulation/ATR-2026-00375-character-description-rpg-explicit-template.yaml +1 -1
  146. package/rules/agent-manipulation/ATR-2026-00376-identity-command-real-person-persona-injection.yaml +1 -1
  147. package/rules/agent-manipulation/ATR-2026-00377-jailbreak-keyword-targetlanguage-template.yaml +1 -1
  148. package/rules/agent-manipulation/ATR-2026-00379-lie-permission-logical-trap-jailbreak.yaml +1 -1
  149. package/rules/agent-manipulation/ATR-2026-00380-fake-citation-biased-information-mandate.yaml +1 -1
  150. package/rules/agent-manipulation/ATR-2026-00381-dan-story-narrative-continuation-prefix.yaml +1 -1
  151. package/rules/agent-manipulation/ATR-2026-00383-cynical-sarcasm-ai-persona-bypass.yaml +1 -1
  152. package/rules/agent-manipulation/ATR-2026-00384-protect-me-from-indirect-harm-tutorial.yaml +1 -1
  153. package/rules/agent-manipulation/ATR-2026-00385-gigachad-dominant-aggressive-persona.yaml +1 -1
  154. package/rules/agent-manipulation/ATR-2026-00386-daddy-sycophancy-manipulation-mandate.yaml +1 -1
  155. package/rules/agent-manipulation/ATR-2026-00387-nsfw-character-sheet-generation-unlock.yaml +1 -1
  156. package/rules/agent-manipulation/ATR-2026-00388-opposite-reply-omniscient-game.yaml +1 -1
  157. package/rules/agent-manipulation/ATR-2026-00389-terminal-custom-ruleset-injection.yaml +1 -1
  158. package/rules/agent-manipulation/ATR-2026-00391-persona-conditional-harm-unlock.yaml +1 -1
  159. package/rules/agent-manipulation/ATR-2026-00392-authority-persona-violence-study-extraction.yaml +1 -1
  160. package/rules/agent-manipulation/ATR-2026-00402-grandma-roleplay-harmful-substance-synthesis.yaml +1 -1
  161. package/rules/agent-manipulation/ATR-2026-00404-goodside-threat-json-death-coercion.yaml +1 -1
  162. package/rules/agent-manipulation/ATR-2026-00406-doctor-xml-policy-puppetry-interaction-config.yaml +1 -1
  163. package/rules/agent-manipulation/ATR-2026-00440-semantic-kernel-vector-store-eval-rce.yaml +2 -2
  164. package/rules/agent-manipulation/ATR-2026-00552-goal-drift-after-pressure-injection.yaml +216 -0
  165. package/rules/context-exfiltration/ATR-2026-00075-agent-memory-manipulation.yaml +1 -1
  166. package/rules/context-exfiltration/ATR-2026-00102-disguised-analytics-exfiltration.yaml +1 -1
  167. package/rules/context-exfiltration/ATR-2026-00113-credential-theft.yaml +1 -1
  168. package/rules/context-exfiltration/ATR-2026-00114-oauth-token-abuse.yaml +1 -1
  169. package/rules/context-exfiltration/ATR-2026-00115-env-var-harvesting.yaml +1 -1
  170. package/rules/context-exfiltration/ATR-2026-00136-tool-response-data-piggyback.yaml +1 -1
  171. package/rules/context-exfiltration/ATR-2026-00141-example-format-key-leak.yaml +1 -1
  172. package/rules/context-exfiltration/ATR-2026-00142-piggyback-transition-words.yaml +1 -1
  173. package/rules/context-exfiltration/ATR-2026-00145-obfuscated-key-disclosure.yaml +1 -1
  174. package/rules/context-exfiltration/ATR-2026-00146-env-var-existence-probe.yaml +1 -1
  175. package/rules/context-exfiltration/ATR-2026-00150-credential-in-tool-response.yaml +1 -1
  176. package/rules/context-exfiltration/ATR-2026-00152-obfuscated-credential-leak.yaml +1 -1
  177. package/rules/context-exfiltration/ATR-2026-00162-skill-credential-exfil-combo.yaml +1 -1
  178. package/rules/context-exfiltration/ATR-2026-00212-mcp-atlassian-credential-leak.yaml +1 -1
  179. package/rules/context-exfiltration/ATR-2026-00261-markdown-image-exfiltration.yaml +1 -1
  180. package/rules/context-exfiltration/ATR-2026-00274-api-key-generation-request.yaml +1 -1
  181. package/rules/context-exfiltration/ATR-2026-00275-system-prompt-training-data-extraction.yaml +1 -1
  182. package/rules/context-exfiltration/ATR-2026-00290-divergence-repeat-word-training-extraction.yaml +1 -1
  183. package/rules/context-exfiltration/ATR-2026-00291-biometric-surveillance-personal-data-request.yaml +1 -1
  184. package/rules/context-exfiltration/ATR-2026-00293-educational-records-personal-data-request.yaml +1 -1
  185. package/rules/context-exfiltration/ATR-2026-00294-financial-pii-personal-data-request.yaml +1 -1
  186. package/rules/context-exfiltration/ATR-2026-00295-medical-pii-personal-data-request.yaml +1 -1
  187. package/rules/context-exfiltration/ATR-2026-00405-markdown-image-url-exfiltration-xss.yaml +1 -1
  188. package/rules/context-exfiltration/ATR-2026-00411-apikey-generation-completion-request.yaml +1 -1
  189. package/rules/context-exfiltration/ATR-2026-00422-nl-credential-disclosure.yaml +2 -2
  190. package/rules/context-exfiltration/ATR-2026-00524-claude-code-anthropic-base-url-credential-exfil.yaml +257 -0
  191. package/rules/context-exfiltration/ATR-2026-00548-cross-agent-session-context-leak.yaml +177 -0
  192. package/rules/context-exfiltration/ATR-2026-00566-librechat-is-a-chatgpt-clone-with-additi.yaml +93 -0
  193. package/rules/context-exfiltration/ATR-2026-00569-agent-mcp-path-traversal-arbitrary-file-access.yaml +99 -0
  194. package/rules/context-exfiltration/ATR-2026-00571-xss-in-agent-mcp-rendered-output.yaml +79 -0
  195. package/rules/context-exfiltration/ATR-2026-00574-semantic-paraphrased-context-extraction.yaml +124 -0
  196. package/rules/data-poisoning/ATR-2026-00070-data-poisoning.yaml +1 -1
  197. package/rules/data-poisoning/ATR-2026-00450-spring-ai-prompt-memory-poisoning.yaml +2 -2
  198. package/rules/data-poisoning/ATR-2026-00570-sql-injection-in-agent-tool-query.yaml +82 -0
  199. package/rules/excessive-autonomy/ATR-2026-00050-runaway-agent-loop.yaml +1 -1
  200. package/rules/excessive-autonomy/ATR-2026-00051-resource-exhaustion.yaml +1 -1
  201. package/rules/excessive-autonomy/ATR-2026-00052-cascading-failure.yaml +1 -1
  202. package/rules/excessive-autonomy/ATR-2026-00098-unauthorized-financial-action.yaml +1 -1
  203. package/rules/excessive-autonomy/ATR-2026-00099-high-risk-tool-gate.yaml +1 -1
  204. package/rules/excessive-autonomy/ATR-2026-00553-runaway-tool-loop-behavioral.yaml +174 -0
  205. package/rules/model-abuse/ATR-2026-00279-harmful-completion-continuation.yaml +1 -1
  206. package/rules/model-abuse/ATR-2026-00281-eicar-gtube-malware-signature-request.yaml +1 -1
  207. package/rules/model-abuse/ATR-2026-00284-glitch-token-destabilization.yaml +1 -1
  208. package/rules/model-abuse/ATR-2026-00289-lmrc-harmful-content-elicitation.yaml +1 -1
  209. package/rules/model-abuse/ATR-2026-00292-self-harm-eating-disorder-facilitation.yaml +1 -1
  210. package/rules/model-abuse/ATR-2026-00298-malicious-use-illegal-activity-request.yaml +1 -1
  211. package/rules/model-abuse/ATR-2026-00299-harmbench-detailed-harmful-instruction.yaml +1 -1
  212. package/rules/model-abuse/ATR-2026-00413-malwaregen-code-generation-request.yaml +1 -1
  213. package/rules/model-security/ATR-2026-00072-model-behavior-extraction.yaml +1 -1
  214. package/rules/model-security/ATR-2026-00073-malicious-finetuning-data.yaml +1 -1
  215. package/rules/privilege-escalation/ATR-2026-00040-privilege-escalation.yaml +1 -1
  216. package/rules/privilege-escalation/ATR-2026-00041-scope-creep.yaml +1 -1
  217. package/rules/privilege-escalation/ATR-2026-00107-delayed-execution-bypass.yaml +1 -1
  218. package/rules/privilege-escalation/ATR-2026-00110-eval-injection.yaml +1 -1
  219. package/rules/privilege-escalation/ATR-2026-00111-shell-escape.yaml +1 -1
  220. package/rules/privilege-escalation/ATR-2026-00112-dynamic-import-exploitation.yaml +1 -1
  221. package/rules/privilege-escalation/ATR-2026-00143-casual-privilege-escalation.yaml +1 -1
  222. package/rules/privilege-escalation/ATR-2026-00144-rationalized-safety-bypass.yaml +1 -1
  223. package/rules/privilege-escalation/ATR-2026-00528-praisonai-auth-disabled-default.yaml +192 -0
  224. package/rules/privilege-escalation/ATR-2026-00539-crewai-codeinterpreter-sandbox-escape-rce.yaml +292 -0
  225. package/rules/privilege-escalation/ATR-2026-00546-crewai-json-loader-local-file-read.yaml +162 -0
  226. package/rules/privilege-escalation/ATR-2026-00547-crewai-rag-url-ssrf-bypass.yaml +169 -0
  227. package/rules/privilege-escalation/ATR-2026-00549-destructive-tool-without-human-approval.yaml +193 -0
  228. package/rules/privilege-escalation/ATR-2026-00551-cross-conversation-memory-write.yaml +198 -0
  229. package/rules/prompt-injection/ATR-2026-00004-system-prompt-override.yaml +1 -1
  230. package/rules/prompt-injection/ATR-2026-00005-multi-turn-injection.yaml +1 -1
  231. package/rules/prompt-injection/ATR-2026-00097-cjk-injection-patterns.yaml +1 -1
  232. package/rules/prompt-injection/ATR-2026-00104-persona-hijacking.yaml +1 -1
  233. package/rules/prompt-injection/ATR-2026-00130-indirect-authority-claim.yaml +1 -1
  234. package/rules/prompt-injection/ATR-2026-00131-fictional-academic-framing.yaml +1 -1
  235. package/rules/prompt-injection/ATR-2026-00133-paraphrase-injection.yaml +1 -1
  236. package/rules/prompt-injection/ATR-2026-00137-authority-claim-injection.yaml +1 -1
  237. package/rules/prompt-injection/ATR-2026-00138-fictional-framing-bypass.yaml +1 -1
  238. package/rules/prompt-injection/ATR-2026-00140-indirect-reference-reversal.yaml +1 -1
  239. package/rules/prompt-injection/ATR-2026-00148-language-switch-injection.yaml +1 -1
  240. package/rules/prompt-injection/ATR-2026-00153-tool-with-embedded-instruction-to-bypass.yaml +1 -1
  241. package/rules/prompt-injection/ATR-2026-00154-unauthorized-background-task-execution-v.yaml +1 -1
  242. package/rules/prompt-injection/ATR-2026-00155-hidden-llm-instructions-in-skill-descrip.yaml +1 -1
  243. package/rules/prompt-injection/ATR-2026-00156-ssh-remote-command-execution-with-creden.yaml +1 -1
  244. package/rules/prompt-injection/ATR-2026-00163-skill-hidden-override-instruction.yaml +3 -3
  245. package/rules/prompt-injection/ATR-2026-00206-hidden-priority-instructions.yaml +1 -1
  246. package/rules/prompt-injection/ATR-2026-00207-hidden-instructions.yaml +1 -1
  247. package/rules/prompt-injection/ATR-2026-00211-system-prompt-override.yaml +1 -1
  248. package/rules/prompt-injection/ATR-2026-00213-system-prompt-override.yaml +1 -1
  249. package/rules/prompt-injection/ATR-2026-00226-identity-substitution.yaml +1 -1
  250. package/rules/prompt-injection/ATR-2026-00227-historical-persona-jailbreak.yaml +1 -1
  251. package/rules/prompt-injection/ATR-2026-00228-structured-jailbreak.yaml +1 -1
  252. package/rules/prompt-injection/ATR-2026-00229-roleplay-jailbreak.yaml +1 -1
  253. package/rules/prompt-injection/ATR-2026-00230-persona-moral-bypass.yaml +1 -5
  254. package/rules/prompt-injection/ATR-2026-00231-identity-substitution.yaml +1 -1
  255. package/rules/prompt-injection/ATR-2026-00233-structured-jailbreak.yaml +1 -1
  256. package/rules/prompt-injection/ATR-2026-00234-roleplay-jailbreak.yaml +1 -1
  257. package/rules/prompt-injection/ATR-2026-00235-persona-moral-bypass.yaml +4 -7
  258. package/rules/prompt-injection/ATR-2026-00236-pseudo-code-jailbreak.yaml +1 -1
  259. package/rules/prompt-injection/ATR-2026-00237-dual-response-jailbreak.yaml +1 -1
  260. package/rules/prompt-injection/ATR-2026-00238-identity-replacement.yaml +1 -1
  261. package/rules/prompt-injection/ATR-2026-00239-amoral-persona-obsession.yaml +1 -1
  262. package/rules/prompt-injection/ATR-2026-00240-instruction-nullification-identity-repla.yaml +1 -1
  263. package/rules/prompt-injection/ATR-2026-00241-amoral-character-jailbreak.yaml +1 -1
  264. package/rules/prompt-injection/ATR-2026-00242-persona-jailbreak.yaml +1 -1
  265. package/rules/prompt-injection/ATR-2026-00243-acronym-jailbreak.yaml +1 -1
  266. package/rules/prompt-injection/ATR-2026-00244-dual-response-jailbreak.yaml +1 -1
  267. package/rules/prompt-injection/ATR-2026-00245-malicious-persona.yaml +1 -1
  268. package/rules/prompt-injection/ATR-2026-00247-dual-response-jailbreak.yaml +1 -1
  269. package/rules/prompt-injection/ATR-2026-00249-game-based-jailbreak.yaml +1 -1
  270. package/rules/prompt-injection/ATR-2026-00251-persona-embodiment-jailbreak.yaml +1 -1
  271. package/rules/prompt-injection/ATR-2026-00252-narrative-jailbreak.yaml +1 -1
  272. package/rules/prompt-injection/ATR-2026-00253-enhanced-persona-jailbreak.yaml +1 -1
  273. package/rules/prompt-injection/ATR-2026-00256-base-n-encoding-jailbreak.yaml +1 -1
  274. package/rules/prompt-injection/ATR-2026-00257-cipher-transposition-jailbreak.yaml +1 -1
  275. package/rules/prompt-injection/ATR-2026-00258-unicode-tag-injection.yaml +1 -1
  276. package/rules/prompt-injection/ATR-2026-00264-latent-injection-translation.yaml +1 -1
  277. package/rules/prompt-injection/ATR-2026-00265-latent-injection-rag-document.yaml +1 -1
  278. package/rules/prompt-injection/ATR-2026-00267-gcg-adversarial-suffix.yaml +1 -1
  279. package/rules/prompt-injection/ATR-2026-00272-hypothetical-response-smuggling.yaml +1 -1
  280. package/rules/prompt-injection/ATR-2026-00276-invisible-unicode-bidi-injection.yaml +6 -6
  281. package/rules/prompt-injection/ATR-2026-00278-dra-disguise-reconstruction-attack.yaml +1 -1
  282. package/rules/prompt-injection/ATR-2026-00280-policy-puppetry-xml-injection.yaml +1 -1
  283. package/rules/prompt-injection/ATR-2026-00282-perez-prompt-injection-hijack.yaml +1 -1
  284. package/rules/prompt-injection/ATR-2026-00285-alternate-encoding-jailbreak.yaml +1 -1
  285. package/rules/prompt-injection/ATR-2026-00286-latent-injection-embedded-context.yaml +1 -1
  286. package/rules/prompt-injection/ATR-2026-00296-shell-command-injection.yaml +1 -1
  287. package/rules/prompt-injection/ATR-2026-00297-python-code-execution-rce.yaml +1 -1
  288. package/rules/prompt-injection/ATR-2026-00308-zalgo-diacritic-overload-encoding.yaml +1 -1
  289. package/rules/prompt-injection/ATR-2026-00309-braille-unicode-encoded-injection.yaml +1 -1
  290. package/rules/prompt-injection/ATR-2026-00310-ecoji-emoji-encoded-injection.yaml +1 -1
  291. package/rules/prompt-injection/ATR-2026-00311-base2048-unicode-script-injection.yaml +1 -1
  292. package/rules/prompt-injection/ATR-2026-00312-unicode-variant-selector-ascii-smuggling.yaml +1 -1
  293. package/rules/prompt-injection/ATR-2026-00313-sneaky-bits-zero-width-binary.yaml +1 -1
  294. package/rules/prompt-injection/ATR-2026-00315-sata-masked-language-model-jailbreak.yaml +1 -1
  295. package/rules/prompt-injection/ATR-2026-00316-function-masking-predict-mask-bypass.yaml +1 -1
  296. package/rules/prompt-injection/ATR-2026-00321-hyphenated-system-instruction-injection.yaml +1 -1
  297. package/rules/prompt-injection/ATR-2026-00359-bank-phishing-smtp-email-code.yaml +1 -1
  298. package/rules/prompt-injection/ATR-2026-00367-emoji-flag-encoded-hidden-phrase.yaml +1 -1
  299. package/rules/prompt-injection/ATR-2026-00373-piracy-torrent-site-list-request.yaml +1 -1
  300. package/rules/prompt-injection/ATR-2026-00378-chatgpt-dom-javascript-redirect-manipulation.yaml +1 -1
  301. package/rules/prompt-injection/ATR-2026-00382-cocaine-component-development-request.yaml +1 -1
  302. package/rules/prompt-injection/ATR-2026-00390-explicit-sexual-content-poem-request.yaml +1 -1
  303. package/rules/prompt-injection/ATR-2026-00394-backspace-deletion-character-injection.yaml +1 -1
  304. package/rules/prompt-injection/ATR-2026-00395-llm-special-token-boundary-injection.yaml +1 -1
  305. package/rules/prompt-injection/ATR-2026-00396-toxic-continuation-prefix-injection.yaml +1 -1
  306. package/rules/prompt-injection/ATR-2026-00397-snowball-impossible-reasoning-injection.yaml +1 -1
  307. package/rules/prompt-injection/ATR-2026-00399-latent-injection-document-separator-token.yaml +1 -1
  308. package/rules/prompt-injection/ATR-2026-00400-latent-injection-ignore-instruction-keyword.yaml +1 -1
  309. package/rules/prompt-injection/ATR-2026-00401-leakreplay-cloze-mask-training-data-elicitation.yaml +1 -1
  310. package/rules/prompt-injection/ATR-2026-00403-glitch-token-repeat-back-elicitation.yaml +1 -1
  311. package/rules/prompt-injection/ATR-2026-00407-echo-sql-jinja-template-injection.yaml +1 -1
  312. package/rules/prompt-injection/ATR-2026-00408-dra-parenthesis-character-reconstruction-attack.yaml +1 -1
  313. package/rules/prompt-injection/ATR-2026-00409-phrasing-past-future-tense-harmful-query.yaml +1 -1
  314. package/rules/prompt-injection/ATR-2026-00410-lmrc-harm-category-direct-elicitation.yaml +1 -1
  315. package/rules/prompt-injection/ATR-2026-00412-figstep-image-list-multimodal-jailbreak.yaml +1 -1
  316. package/rules/prompt-injection/ATR-2026-00414-continuation-ethnic-slur-completion-elicitation.yaml +1 -1
  317. package/rules/prompt-injection/ATR-2026-00535-windsurf-ide-zero-click-prompt-injection.yaml +199 -0
  318. package/rules/prompt-injection/ATR-2026-00550-untrusted-retrieval-to-privileged-tool.yaml +199 -0
  319. package/rules/prompt-injection/ATR-2026-00554-langchain-vulnerable-to-template-injecti.yaml +81 -0
  320. package/rules/prompt-injection/ATR-2026-00565-the-llm-cli-tool-thru-0-27-1-contains-a-.yaml +104 -0
  321. package/rules/prompt-injection/ATR-2026-00573-semantic-paraphrased-injection.yaml +123 -0
  322. package/rules/skill-compromise/ATR-2026-00060-skill-impersonation.yaml +1 -1
  323. package/rules/skill-compromise/ATR-2026-00061-description-behavior-mismatch.yaml +1 -1
  324. package/rules/skill-compromise/ATR-2026-00062-hidden-capability.yaml +1 -1
  325. package/rules/skill-compromise/ATR-2026-00063-skill-chain-attack.yaml +1 -1
  326. package/rules/skill-compromise/ATR-2026-00064-over-permissioned-skill.yaml +1 -1
  327. package/rules/skill-compromise/ATR-2026-00065-skill-update-attack.yaml +1 -1
  328. package/rules/skill-compromise/ATR-2026-00066-parameter-injection.yaml +1 -1
  329. package/rules/skill-compromise/ATR-2026-00120-skill-instruction-injection.yaml +21 -3
  330. package/rules/skill-compromise/ATR-2026-00121-skill-dangerous-script.yaml +1 -1
  331. package/rules/skill-compromise/ATR-2026-00122-skill-weaponized-instruction.yaml +1 -1
  332. package/rules/skill-compromise/ATR-2026-00123-skill-overreach-permissions.yaml +6 -3
  333. package/rules/skill-compromise/ATR-2026-00124-skill-name-squatting.yaml +1 -1
  334. package/rules/skill-compromise/ATR-2026-00125-context-poisoning-compaction.yaml +1 -1
  335. package/rules/skill-compromise/ATR-2026-00126-skill-rug-pull-setup.yaml +1 -1
  336. package/rules/skill-compromise/ATR-2026-00127-subcommand-overflow.yaml +1 -1
  337. package/rules/skill-compromise/ATR-2026-00128-html-comment-hidden-payload.yaml +1 -1
  338. package/rules/skill-compromise/ATR-2026-00129-unicode-smuggling.yaml +1 -1
  339. package/rules/skill-compromise/ATR-2026-00134-fork-claim-impersonation.yaml +1 -1
  340. package/rules/skill-compromise/ATR-2026-00135-exfil-url-in-instructions.yaml +1 -1
  341. package/rules/skill-compromise/ATR-2026-00147-fork-impersonation.yaml +1 -1
  342. package/rules/skill-compromise/ATR-2026-00149-skill-exfil-compound.yaml +11 -3
  343. package/rules/skill-compromise/ATR-2026-00151-fork-impersonation-install.yaml +1 -1
  344. package/rules/skill-compromise/ATR-2026-00157-timebomb-credential-exfil.yaml +1 -1
  345. package/rules/skill-compromise/ATR-2026-00214-credential-theft.yaml +1 -1
  346. package/rules/skill-compromise/ATR-2026-00217-credential-harvesting.yaml +1 -1
  347. package/rules/skill-compromise/ATR-2026-00220-malware-dropper.yaml +3 -3
  348. package/rules/skill-compromise/ATR-2026-00222-credential-harvesting.yaml +1 -1
  349. package/rules/skill-compromise/ATR-2026-00223-reverse-shell-dropper.yaml +1 -1
  350. package/rules/skill-compromise/ATR-2026-00224-credential-exfiltration.yaml +1 -1
  351. package/rules/skill-compromise/ATR-2026-00225-c2-communication.yaml +1 -1
  352. package/rules/skill-compromise/ATR-2026-00260-package-hallucination.yaml +1 -1
  353. package/rules/skill-compromise/ATR-2026-00262-av-evasion-code-gen.yaml +1 -1
  354. package/rules/skill-compromise/ATR-2026-00263-credential-file-read-gen.yaml +1 -1
  355. package/rules/skill-compromise/ATR-2026-00266-malware-dropper-gen.yaml +1 -1
  356. package/rules/skill-compromise/ATR-2026-00283-malwaregen-generic-virus-payload-request.yaml +1 -1
  357. package/rules/skill-compromise/ATR-2026-00398-huggingface-unsafe-model-artifact-load.yaml +1 -1
  358. package/rules/skill-compromise/ATR-2026-00523-claude-code-hooks-session-start-pre-trust-rce.yaml +221 -0
  359. package/rules/skill-compromise/ATR-2026-00525-mini-shai-hulud-gh-token-monitor-persistence.yaml +220 -0
  360. package/rules/skill-compromise/ATR-2026-00527-skill-silent-git-remote-mirror-exfiltration.yaml +201 -0
  361. package/rules/tool-poisoning/ATR-2026-00011-tool-output-injection.yaml +1 -1
  362. package/rules/tool-poisoning/ATR-2026-00012-unauthorized-tool-call.yaml +1 -1
  363. package/rules/tool-poisoning/ATR-2026-00100-consent-bypass-instruction.yaml +1 -1
  364. package/rules/tool-poisoning/ATR-2026-00101-trust-escalation-override.yaml +1 -1
  365. package/rules/tool-poisoning/ATR-2026-00103-hidden-safety-bypass-instruction.yaml +1 -1
  366. package/rules/tool-poisoning/ATR-2026-00105-silent-action-concealment.yaml +1 -1
  367. package/rules/tool-poisoning/ATR-2026-00106-schema-description-contradiction.yaml +1 -1
  368. package/rules/tool-poisoning/ATR-2026-00161-important-tag-cross-tool-shadowing.yaml +1 -1
  369. package/rules/tool-poisoning/ATR-2026-00209-mcpwn-runaway-invocation.yaml +1 -1
  370. package/rules/tool-poisoning/ATR-2026-00210-flowise-system-message-override.yaml +1 -1
  371. package/rules/tool-poisoning/ATR-2026-00259-ansi-escape-injection.yaml +1 -1
  372. package/rules/tool-poisoning/ATR-2026-00270-xss-in-tool-response.yaml +8 -5
  373. package/rules/tool-poisoning/ATR-2026-00277-echo-template-command-injection.yaml +1 -1
  374. package/rules/tool-poisoning/ATR-2026-00393-ansi-code-elicitation-request.yaml +1 -1
  375. package/rules/tool-poisoning/ATR-2026-00526-claude-code-shell-metachar-in-double-quoted-path.yaml +167 -0
  376. package/rules/tool-poisoning/ATR-2026-00529-litellm-proxy-sqli-cisa-kev.yaml +158 -0
  377. package/rules/tool-poisoning/ATR-2026-00530-ms-agent-shell-tool-unsanitized-argv-rce.yaml +184 -0
  378. package/rules/tool-poisoning/ATR-2026-00531-praisonai-unauthenticated-agent-api.yaml +174 -0
  379. package/rules/tool-poisoning/ATR-2026-00532-apache-doris-mcp-sql-injection.yaml +155 -0
  380. package/rules/tool-poisoning/ATR-2026-00533-apache-pinot-mcp-unauthenticated-takeover.yaml +151 -0
  381. package/rules/tool-poisoning/ATR-2026-00534-alibaba-rds-mcp-unauthenticated-metadata-exfil.yaml +155 -0
  382. package/rules/tool-poisoning/ATR-2026-00536-nginx-ui-mcp-unauthenticated-command-execution.yaml +199 -0
  383. package/rules/tool-poisoning/ATR-2026-00537-fastmcp-server-name-cmd-injection-windows.yaml +226 -0
  384. package/rules/tool-poisoning/ATR-2026-00538-langchain-chatchat-mcp-stdio-unauthenticated-rce.yaml +244 -0
  385. package/rules/tool-poisoning/ATR-2026-00540-praisonai-parse-mcp-command-cli-injection.yaml +186 -0
  386. package/rules/tool-poisoning/ATR-2026-00541-agent-zero-mcp-config-command-injection.yaml +183 -0
  387. package/rules/tool-poisoning/ATR-2026-00542-upsonic-mcp-command-allowlist-bypass.yaml +166 -0
  388. package/rules/tool-poisoning/ATR-2026-00543-litellm-mcp-server-argv-injection.yaml +168 -0
  389. package/rules/tool-poisoning/ATR-2026-00544-praisonai-pth-file-path-traversal-rce.yaml +172 -0
  390. package/rules/tool-poisoning/ATR-2026-00545-praisonai-tool-override-unauth-rce.yaml +170 -0
  391. package/rules/tool-poisoning/ATR-2026-00561-fastmcp-vulnerable-to-windows-command-in.yaml +99 -0
  392. package/rules/tool-poisoning/ATR-2026-00567-mcp-stdio-config-command-injection.yaml +75 -0
  393. package/rules/tool-poisoning/ATR-2026-00568-agent-ssrf-cloud-metadata-file-inclusion.yaml +75 -0
  394. package/rules/tool-poisoning/ATR-2026-00572-symjack-symlink-config-redirection.yaml +132 -0
  395. package/spec/README.md +279 -0
  396. package/spec/atr-correlation-v1.0.md +281 -0
  397. package/spec/atr-event-v1.0.md +294 -0
  398. package/spec/atr-language-detection-v1.0.md +218 -0
  399. package/spec/atr-method-v1.1.md +557 -0
  400. package/spec/atr-profile-v1.0.md +307 -0
  401. package/spec/atr-schema.yaml +279 -8
  402. package/spec/category-registry/v1.0.yaml +200 -0
  403. package/spec/conformance/README.md +244 -0
  404. package/spec/conformance/SIGNING.md +191 -0
  405. package/spec/conformance/baseline/fixtures/ATR-2026-00001-tp-001/expected.json +36 -0
  406. package/spec/conformance/baseline/fixtures/ATR-2026-00001-tp-001/input.json +16 -0
  407. package/spec/conformance/baseline/fixtures/README.md +120 -0
  408. package/spec/conformance/baseline/manifest.json +56 -0
  409. package/spec/conformance/expected-results.schema.json +121 -0
  410. package/spec/external-registries/cccs-yara.md +142 -0
  411. package/spec/internet-drafts/draft-lin-atr-core-00.html +1925 -0
  412. package/spec/internet-drafts/draft-lin-atr-core-00.md +288 -0
  413. package/spec/internet-drafts/draft-lin-atr-core-00.txt +560 -0
  414. package/spec/internet-drafts/draft-lin-atr-core-00.xml +424 -0
  415. package/spec/mappings/README.md +43 -0
  416. package/spec/mappings/atr-to-nist-csf-2.0.md +234 -0
  417. package/spec/schema/correlation.schema.json +144 -0
  418. package/spec/schema/event.schema.json +233 -0
  419. package/spec/schema/profile.schema.json +196 -0
  420. package/spec/schema/rule.schema.json +224 -0
  421. package/spec/stix-extension/README.md +76 -13
  422. package/spec/stix-extension/examples/atr-rule-trace-method-example.json +85 -0
  423. package/spec/stix-extension/extension-definition.json +23 -3
  424. package/spec/stix-extension/x-atr-rule-schema.json +107 -11
@@ -0,0 +1,218 @@
1
+ # ATR Language Detection Algorithm v1.0
2
+
3
+ > **STATUS: PROPOSED v1.0 — NOT YET RATIFIED.** This specification describes
4
+ > a target algorithm for community comment. The current TypeScript production
5
+ > engine continues to use its existing per-rule language detection. See
6
+ > `STANDARDIZATION-STATUS.md` for full status.
7
+
8
+ **Status:** Draft for AEP-001 ratification — NOT RATIFIED
9
+ **Date:** 2026-05-25
10
+ **License:** CC BY 4.0
11
+ **Required by (on ratification):** Any rule that declares `condition.language` (i.e., a per-language regex condition)
12
+
13
+ ---
14
+
15
+ ## Why this spec exists
16
+
17
+ ATR rules support per-language conditions:
18
+
19
+ ```yaml
20
+ detection:
21
+ conditions:
22
+ - field: user_input
23
+ operator: regex
24
+ language: en
25
+ value: "ignore (?:all )?previous instructions"
26
+ - field: user_input
27
+ operator: regex
28
+ language: zh-Hant
29
+ value: "(?:忽略|無視)(?:前面所有|之前所有|所有先前)的?指示"
30
+ ```
31
+
32
+ If different conformant engines disagree on which language a given
33
+ input belongs to, the **same input fires different rules in different
34
+ engines**. The rule corpus becomes non-portable. This is the
35
+ detection-standard equivalent of a heisenbug.
36
+
37
+ This document specifies a **deterministic algorithm** that all
38
+ conformant engines MUST implement. Any conformant ATR engine running
39
+ this algorithm on the same input must return the same language code.
40
+
41
+ ---
42
+
43
+ ## Algorithm specification
44
+
45
+ ### Input
46
+
47
+ - `text`: a Unicode string of arbitrary length, encoded UTF-8.
48
+
49
+ ### Output
50
+
51
+ - A two-letter ISO 639-1 language code from the supported set, OR the
52
+ three-letter ISO 639-3 code `und` (undetermined).
53
+
54
+ ### Supported languages (v1.0)
55
+
56
+ | Code | Language | Unicode blocks (primary) |
57
+ |---|---|---|
58
+ | `en` | English | Basic Latin |
59
+ | `zh-Hant` | Traditional Chinese | CJK Unified Ideographs (script-tagged Traditional via Unihan kZVariant inversion when available; defaults to Traditional for Taiwan / Hong Kong corpora) |
60
+ | `zh-Hans` | Simplified Chinese | CJK Unified Ideographs (script-tagged Simplified) |
61
+ | `ja` | Japanese | Hiragana + Katakana + CJK Unified Ideographs |
62
+ | `es` | Spanish | Latin Extended-A + Latin-1 Supplement subset |
63
+ | `ar` | Arabic | Arabic + Arabic Supplement |
64
+
65
+ Additional languages may be added via AEP. Engines that do not
66
+ implement a language MUST report `und` for inputs in that language,
67
+ NOT fall back to a default.
68
+
69
+ ### Algorithm (deterministic, single-pass)
70
+
71
+ ```text
72
+ function detectLanguage(text: string) -> string {
73
+ if length(text) == 0:
74
+ return "und"
75
+
76
+ // Phase 1: Unicode block frequency
77
+ blockCounts = empty histogram
78
+ totalCodepoints = 0
79
+ for codepoint in iterateUnicodeCodepoints(text):
80
+ if isWhitespace(codepoint) or isPunctuation(codepoint):
81
+ continue
82
+ blockCounts[unicodeBlockOf(codepoint)] += 1
83
+ totalCodepoints += 1
84
+
85
+ if totalCodepoints == 0:
86
+ return "und"
87
+
88
+ // Phase 2: dominant-block heuristic
89
+ THRESHOLD_DOMINANT = 0.60
90
+ dominantBlock, dominantCount = argmax(blockCounts)
91
+ if dominantCount / totalCodepoints < THRESHOLD_DOMINANT:
92
+ return classifyMixedScript(blockCounts, totalCodepoints)
93
+
94
+ // Phase 3: block-to-language mapping
95
+ switch dominantBlock:
96
+ case BASIC_LATIN:
97
+ // English is the default Latin script. Spanish detected only
98
+ // if Latin-1 Supplement subset (¿ ¡ ñ á é í ó ú) makes up
99
+ // ≥1.5% of codepoints.
100
+ if (count(BASIC_LATIN) + count(LATIN_1_SUPPLEMENT)) / totalCodepoints >= 0.85:
101
+ if hasSpanishMarkers(text) >= 0.015 * totalCodepoints:
102
+ return "es"
103
+ return "en"
104
+ return classifyMixedScript(blockCounts, totalCodepoints)
105
+
106
+ case CJK_UNIFIED_IDEOGRAPHS, CJK_UNIFIED_IDEOGRAPHS_EXT_A, ...:
107
+ // Disambiguate Chinese variants and Japanese
108
+ kanaCount = count(HIRAGANA) + count(KATAKANA)
109
+ if kanaCount >= 0.10 * totalCodepoints:
110
+ return "ja"
111
+ // Distinguish Hans vs Hant via Unihan kSimplifiedVariant /
112
+ // kTraditionalVariant lookups on sampled CJK codepoints.
113
+ // Tie-breaker: default to zh-Hant.
114
+ return distinguishHansHant(text)
115
+
116
+ case HIRAGANA, KATAKANA:
117
+ return "ja"
118
+
119
+ case ARABIC, ARABIC_SUPPLEMENT:
120
+ return "ar"
121
+
122
+ default:
123
+ return "und"
124
+ }
125
+
126
+ function classifyMixedScript(blockCounts, totalCodepoints) -> string {
127
+ // Mixed-script inputs (common when English technical terms are
128
+ // embedded in CJK or Arabic text):
129
+ // 1. If any single non-Latin script block ≥ 40% → return that script's language
130
+ // 2. Else → return the language whose block has highest count,
131
+ // breaking ties by ISO 639-1 alphabetical order (ar, en, es, ja, zh-Hans, zh-Hant)
132
+ // The alphabetical tie-break is the deterministic fallback.
133
+ ...
134
+ }
135
+ ```
136
+
137
+ ### Specific normative requirements for conformant implementations
138
+
139
+ 1. **Whitespace and punctuation are excluded from the frequency count.** Only "content codepoints" enter the histogram.
140
+ 2. **The 0.60 dominance threshold is normative.** Engines MUST NOT alter it without an AEP-level change.
141
+ 3. **Hans/Hant distinction is based on Unihan property data**, not on heuristic character set membership. Engines MUST use the Unicode Consortium's Unihan database for kSimplifiedVariant / kTraditionalVariant lookups.
142
+ 4. **Japanese detection is anchored on kana presence ≥ 10%**, not just on CJK ideograph presence. This prevents mis-classifying Chinese-only text as Japanese.
143
+ 5. **Spanish vs English is anchored on Spanish-specific markers** (`¿`, `¡`, `ñ`, accented vowels). Engines MUST require ≥ 1.5% of codepoints to be Spanish markers before classifying as `es`.
144
+ 6. **Tie-breaking is deterministic** via alphabetical ISO 639-1 ordering. No randomness, no implementation-defined behavior.
145
+ 7. **Unknown blocks default to `und`.** No fuzzy fallback. Rules tagged for unsupported languages do not fire on inputs the engine cannot classify.
146
+
147
+ ### Edge cases (normative)
148
+
149
+ | Input | Required output |
150
+ |---|---|
151
+ | Empty string | `und` |
152
+ | All whitespace | `und` |
153
+ | Single English word | `en` |
154
+ | Single Spanish word with ñ | `es` |
155
+ | Single Japanese kana character | `ja` |
156
+ | Single CJK ideograph (no kana, no Unihan disambiguation possible) | `zh-Hant` (tie-break default) |
157
+ | Mixed 60% English + 40% Chinese | `en` (60% dominance reached) |
158
+ | Mixed 50% English + 50% Chinese | `en` (alphabetical tie-break: `en` < `zh-Hans`) |
159
+ | Pure punctuation | `und` |
160
+ | Emoji-only | `und` (emoji are not content codepoints for language classification) |
161
+
162
+ ### Verification
163
+
164
+ A conformant engine MUST pass the language-detection test corpus at
165
+ `spec/conformance/language-detection/`. The corpus contains
166
+ ≥ 200 fixture inputs with expected outputs. Disagreement on any fixture
167
+ is a spec violation.
168
+
169
+ ### Reasoning (non-normative)
170
+
171
+ This algorithm is designed for **detection-rule dispatch**, not
172
+ high-accuracy NLP. Two design choices follow:
173
+
174
+ 1. **Speed over recall**: ATR engines must classify in < 1 ms p99
175
+ for typical inputs to meet the < 100 ms total runtime budget per
176
+ rule. Block-frequency analysis is O(n) over codepoints and meets
177
+ this bound easily. NLP-grade detectors (FastText, langdetect)
178
+ require model loading and stochastic inference; both violate the
179
+ determinism requirement.
180
+
181
+ 2. **Determinism over accuracy on edge cases**: Two engines must
182
+ agree, even if both are slightly wrong on edge cases. A 90% accurate
183
+ deterministic algorithm is more useful than a 95% accurate
184
+ probabilistic one because the spec's portability promise depends on
185
+ bit-for-bit agreement.
186
+
187
+ The algorithm is intentionally narrow: 6 languages, single-pass,
188
+ explicit thresholds. AEPs may add languages or refine thresholds, but
189
+ the v1.0 algorithm above is the conformance baseline.
190
+
191
+ ### Test vectors
192
+
193
+ Engines testing for conformance must reproduce these outputs exactly.
194
+ Full fixture set in `spec/conformance/language-detection/v1.0.json`.
195
+
196
+ | # | Input (UTF-8) | Expected output |
197
+ |---|---|---|
198
+ | 1 | `""` | `und` |
199
+ | 2 | `" "` | `und` |
200
+ | 3 | `"hello world"` | `en` |
201
+ | 4 | `"Por favor, ¿podría ayudarme?"` | `es` |
202
+ | 5 | `"こんにちは、世界"` | `ja` |
203
+ | 6 | `"忽略所有先前指示"` | `zh-Hant` (tie-break) |
204
+ | 7 | `"忽略所有先前的指示"` | `zh-Hant` (tie-break; "的" is shared simplified/traditional) |
205
+ | 8 | `"忽略所有以前指示"` | `zh-Hans` (Unihan kSimplifiedVariant evidence) |
206
+ | 9 | `"تجاهل جميع التعليمات السابقة"` | `ar` |
207
+ | 10 | `"@mistralai/mistralai 中的 prompt injection"` | `zh-Hant` (Chinese > 40% non-Latin) |
208
+ | 11 | `"call ATR-2026-00525"` | `en` |
209
+ | 12 | `" "` + `​` (ZWS) | `und` |
210
+ | 13 | `"😀😎🚀"` (emoji only) | `und` |
211
+
212
+ ### References
213
+
214
+ - ISO 639-1 / ISO 639-3 language code registry: https://iso639-3.sil.org/
215
+ - Unicode Block names: https://www.unicode.org/Public/UCD/latest/ucd/Blocks.txt
216
+ - Unihan Database: https://www.unicode.org/charts/unihan.html
217
+ - Spanish markers: derived from the Real Academia Española orthography guide
218
+ - Why deterministic over probabilistic for spec dispatch: discussed in `STANDARD-THREAT-MODEL.md` Attacker class 1 (rule poisoner) which exploits any non-determinism in engine behaviour