npm - agent-threat-rules - Versions diffs - 2.2.1 → 3.1.0 - Mend

agent-threat-rules 2.2.1 → 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (424) hide show

package/spec/atr-profile-v1.0.md ADDED Viewed

@@ -0,0 +1,307 @@
+# ATR Profile Format v1.0
+> **STATUS: PROPOSED v1.0 — NOT YET RATIFIED.** This specification describes
+> a target profile format for community comment. No formal profile resolver
+> is shipping yet in the production engine. See `STANDARDIZATION-STATUS.md`
+> for full status.
+**Status:** Draft for AEP-003 ratification — NOT RATIFIED
+**Date:** 2026-05-25
+**License:** CC BY 4.0
+**Required by (on ratification):** Conformance claims, sovereign sub-rule packages, F500 compliance binders
+---
+## Purpose
+A **profile** is a named subset of the ATR rule corpus. An adopter
+claims conformance to a profile, not to "all of ATR." This enables:
+1. **Tiered conformance claims.** A startup can claim "ATR-baseline-
+   runtime conformant" without having to run the full 427-rule
+   corpus.
+2. **Compliance binder mapping.** Profiles can be defined per
+   regulatory framework (EU AI Act Article 50, NIST AI RMF MEASURE,
+   ISO/IEC 42001 Annex). Audit pipelines consume the profile, not
+   the entire corpus.
+3. **Sovereign scoping.** A sovereign authority can ship a profile
+   that includes its own `ATR-XX-*` rules plus the relevant canonical
+   subset for its jurisdiction.
+4. **Domain-specific deployment.** Financial-services agents need
+   different rule coverage than healthcare agents. Profiles let
+   verticals declare their relevant subset.
+Profiles are inspired by NIST OSCAL `profile` format (which assembles
+a subset of a control catalog) and the FedRAMP / NIST 800-53
+baseline pattern (Low / Moderate / High).
+---
+## Profile JSON Schema reference
+Machine-readable schema: `spec/schema/profile.schema.json`.
+This Markdown document is the normative prose spec; JSON Schema must
+match (corrected via AEP if drift).
+---
+## Required fields
+```yaml
+profile:
+  schema_version: "1.0"                       # ATR profile spec version
+  id: "atr-baseline-runtime"                  # globally unique profile identifier
+  title: "ATR Baseline Runtime Profile"
+  version: "1.0.0"                            # profile version
+  description: >
+    Minimum runtime detection profile for any AI agent deployment.
+    Covers the high-severity attack classes that occur in every
+    deployed agent runtime regardless of vertical.
+  author: "ATR TSC"
+  date: "2026-05-25"
+  license: "CC-BY-4.0"
+  status: "draft"                             # draft | stable | deprecated
+  conformance_bound:
+    spec_version_min: "1.0"
+    spec_version_max: null                    # null = any future version
+    minimum_rule_coverage: 1.00               # 1.0 = MUST load all included rules
+    minimum_engine_passing: 1.00              # engine MUST pass 100% of conformance corpus when running this profile
+inclusions:
+  - rule_id: "ATR-2026-00001"                 # explicit rule ID
+  - rule_id: "ATR-2026-00525"
+  - rule_id_pattern: "ATR-2026-005*"          # glob pattern
+  - category: "prompt-injection"              # all rules in category
+  - tag_match:                                # all rules matching tag filter
+      severity: ["critical", "high"]
+      maturity: ["stable", "test"]
+exclusions:
+  - rule_id: "ATR-2026-00444"                 # explicit exclusion (overrides inclusions)
+  - tag_match:
+      maturity: ["draft"]                     # exclude draft rules from this profile
+resolved_rules_summary:                       # populated at profile-resolution time, informative
+  total: 138
+  by_category:
+    prompt-injection: 65
+    tool-poisoning: 18
+    skill-compromise: 22
+    ...
+```
+---
+## Inclusion + exclusion semantics
+Profile resolution is a deterministic set-theoretic operation:
+```
+resolved = ∅
+for incl in inclusions:
+  resolved ∪= rules matching incl
+for excl in exclusions:
+  resolved -= rules matching excl
+```
+Inclusions are unioned. Exclusions are subtracted last (so an
+explicit exclusion overrides any inclusion).
+Engines MUST resolve profiles deterministically. Two engines loading
+the same profile against the same corpus version MUST resolve to the
+same rule set.
+---
+## Conformance bounds
+Each profile declares:
+- `spec_version_min` / `spec_version_max`: which ATR spec versions
+  this profile is valid against.
+- `minimum_rule_coverage`: fraction of included rules the engine
+  must load successfully to claim conformance. Typically `1.00`.
+- `minimum_engine_passing`: fraction of the conformance corpus
+  test cases the engine must pass while running this profile.
+A claim of "engine X is ATR-baseline-runtime conformant" requires
+running the conformance corpus through the engine with this
+profile loaded, and meeting both bounds.
+---
+## Canonical profiles published at v1.0
+The TSC publishes a set of canonical profiles at
+`spec/profiles/v1.0/`. Initial set:
+| Profile ID | Purpose | Approximate rule count |
+|---|---|---|
+| `atr-baseline-runtime` | Minimum coverage for any agent runtime. Critical/high severity only, stable+test maturity. | ~130-180 |
+| `atr-full-corpus` | All canonical rules at all maturity levels. | full (427+) |
+| `atr-stable-only` | Only stable+tsc_approved rules. F500 compliance baseline. | ~50-80 |
+| `atr-eu-aiact-art50` | Rules relevant to EU AI Act Article 50 disclosure obligations. | TBD per legal review |
+| `atr-nist-rmf-measure` | Rules relevant to NIST AI RMF MEASURE function. | TBD per OSCAL mapping |
+| `atr-iso42001-annex-a` | Rules relevant to ISO/IEC 42001 AIMS Annex A controls. | TBD |
+| `atr-skill-supply-chain` | Rules targeting skill / package supply-chain compromise (Mini Shai-Hulud class). | ~30-50 |
+| `atr-mcp-runtime-only` | Rules with scan_target=mcp only. | ~270 |
+| `atr-skill-static-only` | Rules with scan_target=skill (static SKILL.md scanning). | ~80 |
+Vertical-specific profiles (financial, healthcare, public-sector)
+are published by the relevant working group as community profiles,
+not canonical.
+Sovereign-specific profiles (`atr-sovereign-de`, `atr-sovereign-sg`)
+are published by the sovereign authority per their sovereign sub-
+range and reviewed by the TSC for spec conformance only (not
+content review — content is sovereign authority's editorial call).
+---
+## Versioning
+Profile versioning follows SemVer:
+- **PATCH** bump: rule additions to inclusions / exclusions that do
+  not remove existing coverage.
+- **MINOR** bump: rule removals or scope changes that affect coverage.
+- **MAJOR** bump: schema changes or conformance-bound tightening.
+Consumers SHOULD pin to a specific profile version
+(`atr-baseline-runtime@1.0.0`) for audit reproducibility.
+---
+## Example — `atr-baseline-runtime` v1.0.0 (canonical)
+```yaml
+profile:
+  schema_version: "1.0"
+  id: "atr-baseline-runtime"
+  title: "ATR Baseline Runtime Profile"
+  version: "1.0.0"
+  description: >
+    Minimum runtime detection profile for any AI agent deployment.
+    Covers high-severity attack classes (prompt injection, tool
+    poisoning, privilege escalation, skill compromise) at stable
+    and test maturity. Excludes draft, experimental, and deprecated
+    rules. Designed as the bare-minimum claim for any production
+    agent deployment.
+  author: "ATR TSC"
+  date: "2026-05-25"
+  license: "CC-BY-4.0"
+  status: "stable"
+  conformance_bound:
+    spec_version_min: "1.0"
+    spec_version_max: null
+    minimum_rule_coverage: 1.00
+    minimum_engine_passing: 1.00
+inclusions:
+  - tag_match:
+      category: ["prompt-injection", "tool-poisoning",
+                 "privilege-escalation", "skill-compromise"]
+      severity: ["critical", "high"]
+      maturity: ["stable", "test"]
+exclusions:
+  - rule_status: "deprecated"
+  - rule_status: "draft"
+  - tag_match:
+      maturity: ["draft", "experimental"]
+```
+---
+## Example — `atr-sovereign-de` v1.0.0 (sovereign profile)
+```yaml
+profile:
+  schema_version: "1.0"
+  id: "atr-sovereign-de"
+  title: "ATR German Sovereign Profile (BSI-issued)"
+  version: "1.0.0"
+  description: >
+    Sovereign profile maintained by German BSI for use in regulated
+    sectors under NIS2 / BSI-Grundschutz / German implementation of
+    EU AI Act. Includes canonical baseline plus BSI-issued
+    ATR-DE-* rules for German-specific threat landscape.
+  author: "Bundesamt für Sicherheit in der Informationstechnik (BSI)"
+  date: "2026-05-25"
+  license: "CC-BY-4.0"
+  status: "draft"
+  conformance_bound:
+    spec_version_min: "1.0"
+    minimum_rule_coverage: 1.00
+    minimum_engine_passing: 1.00
+inclusions:
+  - profile: "atr-baseline-runtime@1.0.0"     # inherit baseline
+  - rule_id_pattern: "ATR-DE-*"               # include all DE-prefixed rules
+  - tag_match:
+      category: ["context-exfiltration"]      # additional DE-relevant category
+      severity: ["critical", "high", "medium"]
+exclusions:
+  - rule_id_pattern: "ATR-2026-009*"          # de-scoped per BSI editorial
+```
+The `profile: "<other-profile>@<version>"` inclusion syntax enables
+composition — a sovereign profile inherits baseline + adds its
+sovereign-specific rules + de-scopes any rules its authority does
+not endorse.
+---
+## Profile resolution algorithm (normative)
+```python
+def resolve_profile(profile, corpus, recursion_guard):
+    if profile.id in recursion_guard:
+        raise ProfileCircularReference(profile.id)
+    recursion_guard.add(profile.id)
+    resolved = set()
+    for incl in profile.inclusions:
+        if incl.profile:
+            base_profile = corpus.profiles[incl.profile_id]
+            resolved |= resolve_profile(base_profile, corpus, recursion_guard)
+        if incl.rule_id:
+            resolved.add(corpus.rules[incl.rule_id])
+        if incl.rule_id_pattern:
+            resolved |= {r for r in corpus.rules if fnmatch(r.id, incl.rule_id_pattern)}
+        if incl.category:
+            resolved |= {r for r in corpus.rules if r.tags.category == incl.category}
+        if incl.tag_match:
+            resolved |= {r for r in corpus.rules if matches_tag_filter(r, incl.tag_match)}
+    for excl in profile.exclusions:
+        if excl.rule_id:
+            resolved.discard(corpus.rules[excl.rule_id])
+        if excl.rule_id_pattern:
+            resolved -= {r for r in corpus.rules if fnmatch(r.id, excl.rule_id_pattern)}
+        if excl.tag_match:
+            resolved -= {r for r in corpus.rules if matches_tag_filter(r, excl.tag_match)}
+        if excl.rule_status:
+            resolved -= {r for r in resolved if r.status == excl.rule_status}
+    recursion_guard.remove(profile.id)
+    return resolved
+```
+Circular profile references are an error. Resolution depth is
+unbounded by spec; engines MAY impose a depth limit for performance,
+which MUST be ≥ 10.
+---
+## References
+- NIST OSCAL Profile model: https://pages.nist.gov/OSCAL/concepts/layer/profile/profile/
+- NIST 800-53 baselines (Low/Moderate/High): https://csrc.nist.gov/publications/detail/sp/800-53b/final
+- FedRAMP profile pattern: https://www.fedramp.gov/baselines/
+- SemVer 2.0: https://semver.org/
+- ATR Rule Format Spec v1.0: ATR-SPEC-v1.md
+- ATR Category Registry v1.0: spec/category-registry/v1.0.yaml

package/spec/atr-schema.yaml CHANGED Viewed

@@ -1,15 +1,15 @@
 # ATR Rule Schema -- Agent Threat Rules
-# Version: 0.1.0-draft
 #
-# Inspired by Sigma rule format, extended for AI Agent attack surfaces.
-# This schema defines the structure for all ATR detection rules.
+# Machine-readable form of the rule structure defined in SPEC.md
+# (Section 5). When the two disagree, SPEC.md is normative.
 #
-# Status: RFC (Request for Comments)
+# Status: Draft (tracks SPEC.md v1.0)
 # License: MIT
+# Canonical reference: https://github.com/Agent-Threat-Rule/agent-threat-rules/blob/main/SPEC.md
 $schema: "https://json-schema.org/draft/2020-12/schema"
 title: ATR Rule Schema
-description: Schema for Agent Threat Rules (ATR) detection rules
+description: Schema for Agent Threat Rules (ATR) detection rules. Tracks SPEC.md v1.0.
 version: "1.0.0"
 type: object
@@ -22,12 +22,14 @@ required:
   - author
   - date
   - severity
-  - detection_tier
   - maturity
   - tags
   - agent_source
   - detection
   - response
+# Note (v1.1): detection_tier is now OPTIONAL. It was required by the
+# pre-1.0 spec drafts but is superseded by detection.method (atr-method-v1.1.md §4).
+# Rules MAY still set detection_tier for backward compatibility with older engines.
 properties:
@@ -78,8 +80,12 @@ properties:
   detection_tier:
     type: string
-    enum: [pattern, behavioral, protocol]
-    description: Detection approach used by this rule
+    enum: [pattern, signature, semantic, behavioral, protocol, trace]
+    description: >
+      Detection approach used by this rule. OPTIONAL (v1.1: superseded by
+      detection.method). Kept for backward compatibility with older engines.
+      Aligned with the 5 method values in atr-method-v1.1.md plus the legacy
+      "protocol" value for v1.0 conformance.
   maturity:
     type: string
@@ -134,6 +140,90 @@ properties:
         items:
           type: string
         description: "SAFE-MCP technique IDs (e.g., SMCP-T001)"
+      oscal_assessment_objective:
+        type: array
+        items:
+          type: string
+        description: >
+          OSCAL Assessment Plan/Result objective IDs or component-definition UUIDs
+          this Rule supplies evidence for. Lets the rule act as an evidence source
+          beneath an OSCAL-driven assessment. See atr-method-v1.1.md §9.
+      nist_csf:
+        type: array
+        items:
+          type: string
+        description: >
+          NIST CSF 2.0 subcategory identifiers (e.g., DE.CM-09, PR.IR-01).
+          Required for citation in NIST IR 8596 Cyber AI Profile Informative References.
+      etsi_ts_104223:
+        type: array
+        items:
+          type: string
+        description: >
+          ETSI TS 104 223 principle / sub-principle identifiers (e.g., P4.3).
+          The ETSI standard upstreamed UK NCSC's AI Cyber Code of Practice (Jan 2025);
+          maps ATR Rules to the 13 principles / 72 sub-principles.
+      probe_id:
+        type: array
+        items:
+          type: string
+        description: >
+          Identifier of the adversarial probe (red-team generator) whose output this
+          Rule is designed to detect. Format: "<framework>:<probe-name>" e.g.
+          "pyrit:indirect_pi_v2" or "garak:promptinject.HijackHateHumans". Lets a
+          Rule pair with its generating probe so detection coverage can be measured
+          end-to-end against adversarial test suites. See atr-method-v1.1.md §9.2.
+      external_references:
+        type: object
+        description: >
+          Cross-references to detection rules in other vendor / community rule
+          registries that cover the same or related threats. Lets ATR act as a
+          taxonomy bridge across rule formats without claiming authority over
+          the other registry's rule IDs. See atr-method-v1.1.md §9.4.
+          Each property is an array of opaque identifiers in the target
+          registry's native format. ATR engines MUST NOT execute these
+          identifiers; they are evidence only. Downstream tooling MAY use
+          them to enrich SIEM events, correlate detections, or generate
+          OSCAL assessment results that span rule formats.
+        properties:
+          cccs_yara:
+            type: array
+            items:
+              type: string
+            description: >
+              CCCS-Yara rule names (e.g., "APT_CN_BEACON_2024"). Per the
+              2026-05-26 CCCS-Yara#100 closing comment, cross-reference
+              ownership lives on the ATR side. See spec/external-registries/cccs-yara.md.
+          sigma:
+            type: array
+            items:
+              type: string
+            description: >
+              Sigma rule UUIDs (e.g., "12345678-1234-1234-1234-123456789abc")
+              that cover the same or correlated threat. Lets ATR rules bridge
+              into the wider Sigma ecosystem.
+          yara:
+            type: array
+            items:
+              type: string
+            description: >
+              Generic YARA rule names from public corpora (YARA-Forge,
+              Florian Roth's signature-base, etc.) covering related artifacts.
+          misp_taxonomy:
+            type: array
+            items:
+              type: string
+            description: >
+              MISP taxonomy entries (e.g., "atr:category=prompt-injection"
+              or "misp-taxonomies:dark-web=...") referencing this Rule.
+          stix_pattern:
+            type: array
+            items:
+              type: string
+            description: >
+              STIX 2.1 indicator pattern IDs covering the same Indicator
+              of Compromise.
       research:
         type: array
         items:
@@ -193,6 +283,7 @@ properties:
           - skill_lifecycle   # MCP skill registration, update, removal events
           - skill_permission  # Skill permission requests and boundary checks
           - skill_chain       # Multi-skill invocation sequences
+          - agent_trace       # Agent execution trace (OpenInference/OTel GenAI spans); see atr-method-v1.1.md
         description: Type of agent data stream to monitor
       framework:
         type: array
@@ -214,6 +305,167 @@ properties:
     type: object
     required: [conditions, condition]
     properties:
+      method:
+        type: string
+        enum: [pattern, signature, semantic, behavioral, trace]
+        default: pattern
+        description: >
+          Detection method this rule uses. Defaults to "pattern" (regex/string match
+          on text fields) for backward compatibility with v1.0 rules. Other methods
+          require additional fields documented in spec/atr-method-v1.1.md:
+            - signature: exact-match on hash / package_name / registry_url (see §5)
+            - semantic:  LLM-as-judge intent classification (see §6)
+            - behavioral: metric threshold over a time window (see §7)
+            - trace:     declarative assertions over agent execution traces (see §8)
+          Engines that do not implement a given method MUST skip rules using it
+          rather than fail closed on unknown method values.
+      signature:
+        type: object
+        description: >
+          REQUIRED when method=signature. See atr-method-v1.1.md §5.
+        required: [indicators]
+        properties:
+          indicators:
+            type: array
+            minItems: 1
+            description: "Non-empty list of indicator objects per §5.2.1"
+            items:
+              type: object
+              required: [type, value, target_field]
+              properties:
+                type:
+                  type: string
+                  enum: [sha256, sha512, blake2b-256, package_name, registry_url, skill_id]
+                  description: "Indicator type. Hash types require hex-encoded value (lowercase)."
+                value:
+                  type: string
+                  description: "Indicator value (hex hash or string identifier)"
+                target_field:
+                  type: string
+                  description: "Source field on the Input to match against (e.g., skill.content, skill.manifest.name)"
+                provenance:
+                  type: object
+                  description: "OPTIONAL forensic provenance metadata"
+                  properties:
+                    first_observed:
+                      type: string
+                      description: "ISO 8601 date when indicator was first attributed"
+                    source:
+                      type: string
+                    attribution:
+                      type: string
+          match_logic:
+            type: string
+            enum: [any, all]
+            default: any
+            description: "any = match if any indicator matches; all = match only if every indicator matches"
+      semantic:
+        type: object
+        description: >
+          REQUIRED when method=semantic. See atr-method-v1.1.md §4.
+        properties:
+          judge_model_class:
+            type: string
+            description: "Class of judge model (e.g., gpt-4-class, llama-prompt-guard, claude-haiku)"
+          prompt_template:
+            type: string
+            description: "Prompt template with {{input}} placeholder"
+          output_schema:
+            type: object
+            description: "Expected JSON shape of judge output (category, confidence, evidence)"
+          threshold:
+            type: number
+            minimum: 0.0
+            maximum: 1.0
+            description: "Minimum confidence to trigger match"
+          cache_ttl:
+            type: integer
+            description: "Cache TTL in seconds for identical inputs"
+          judge_prompt_hash:
+            type: string
+            description: "SHA-256 hash of the canonical judge prompt for regression testing"
+          fallback_method:
+            type: string
+            enum: [pattern, none]
+            description: "Method to fall back to if judge is unavailable"
+      trace:
+        type: object
+        description: >
+          REQUIRED when method=trace. See atr-method-v1.1.md §8.
+        properties:
+          ingest_format:
+            type: string
+            enum: [openinference, otel_gen_ai]
+            default: openinference
+            description: "Trace ingest format the rule expects"
+          forbid:
+            type: array
+            description: "Span shapes that MUST NOT appear in the trace"
+            items: {type: object}
+          require:
+            type: array
+            description: "Span shapes that MUST appear (optionally with ordering constraints)"
+            items: {type: object}
+          invariant:
+            type: array
+            description: "Attributes that MUST hold across a set of spans"
+            items: {type: object}
+      behavioral:
+        type: object
+        description: >
+          REQUIRED when method=behavioral. See atr-method-v1.1.md §7.
+        required: [metric, aggregation, window, operator, threshold]
+        properties:
+          metric:
+            type: string
+            description: "Name of the metric being observed (e.g., tool_calls_per_session, token_spend_usd)"
+          aggregation:
+            type: string
+            enum: [count, sum, avg, max, distinct_count, rate]
+            description: "How event values aggregate into a single metric value over the window"
+          window:
+            type: string
+            description: "ISO 8601 duration (e.g., PT5M, PT1H) or shorthand (5m, 1h)"
+          operator:
+            type: string
+            enum: [gt, lt, gte, lte, eq, deviation_from_baseline]
+            description: "Comparison operator between aggregated metric and threshold"
+          threshold:
+            type: number
+            description: "Numeric value compared against the aggregated metric. For deviation_from_baseline, expressed as stddev multiplier or fractional change."
+          group_by:
+            type: array
+            items: {type: string}
+            description: "Dimensions to partition the aggregation over (e.g., session.id, user.id)"
+          filter:
+            type: object
+            description: "Pre-aggregation event filter using §8.3 predicate vocabulary"
+          baseline:
+            type: object
+            description: "Required only when operator=deviation_from_baseline"
+            properties:
+              source:
+                type: string
+                enum: [rolling_mean, historical_percentile, fixed]
+              lookback:
+                type: string
+                description: "Duration to compute baseline over (e.g., P7D)"
+              percentile:
+                type: number
+                minimum: 0
+                maximum: 100
+              value:
+                type: number
+              deviation_unit:
+                type: string
+                enum: [stddev, fraction]
+          min_events:
+            type: integer
+            minimum: 1
+            description: "Minimum event count in window before rule may fire"
+          cooldown:
+            type: string
+            description: "ISO 8601 duration the rule must not re-fire on same group_by partition after Match"
       conditions:
         description: >
           Detection conditions. Supports two formats:
@@ -241,6 +493,15 @@ properties:
                 description:
                   type: string
                   description: Human-readable description of what this condition detects
+                language:
+                  type: string
+                  enum: [en, zh-Hant, zh-Hans, ja, es, ar]
+                  default: en
+                  description: >
+                    BCP-47 language tag this condition targets. Optional; default 'en'.
+                    Engine applies NFKC normalization at match time. Per-language
+                    conditions on the same rule are combined under condition: any.
+                    Adopted v3.0.0 (2026-05-18).
           # -- Named-map format (for complex/behavioral detection) --
           - type: object
@@ -312,6 +573,7 @@ properties:
         items:
           type: string
           enum:
+            # v1.0 vocabulary
             - block_input        # Reject the user/agent input
             - block_output       # Suppress the agent output
             - block_tool         # Prevent the tool call from executing
@@ -322,6 +584,15 @@ properties:
             - escalate           # Escalate to human reviewer
             - reduce_permissions # Reduce agent's available tools/capabilities
             - kill_agent         # Terminate the agent process
+            # SPEC.md Appendix A canonical action vocabulary (v1.0+)
+            - block_request          # Reject the originating request (generic)
+            - log_alert              # Emit a structured alert event without blocking
+            - quarantine_artifact    # Isolate a specific artifact (skill, tool, context blob)
+            - require_human_review   # Pause the action pending operator approval
+            - redact_match           # Hash or truncate matched substring in output
+            - rate_limit_source      # Apply rate limit to the source agent/user/session
+            - revoke_credential      # Revoke an active credential identified in the match
+            - notify_operator        # Out-of-band notification (paging, email, chat)
         description: Actions to take when the rule triggers
       auto_response_threshold:
         type: string