agentv 4.26.1 → 4.27.0-next.1
- package/dist/{chunk-JA4WQNE6.js → chunk-47JX7NNZ.js} +10 -2
- package/dist/chunk-47JX7NNZ.js.map +1 -0
- package/dist/{chunk-XBUHMRX2.js → chunk-V3LWJB5X.js} +431 -49
- package/dist/chunk-V3LWJB5X.js.map +1 -0
- package/dist/cli.js +2 -2
- package/dist/index.js +2 -2
- package/dist/{interactive-YMKWKPD7.js → interactive-L6PIIFNQ.js} +2 -2
- package/dist/skills/agentv-bench/LICENSE.txt +202 -0
- package/dist/skills/agentv-bench/SKILL.md +459 -0
- package/dist/skills/agentv-bench/agents/analyzer.md +177 -0
- package/dist/skills/agentv-bench/agents/comparator.md +247 -0
- package/dist/skills/agentv-bench/agents/executor.md +30 -0
- package/dist/skills/agentv-bench/agents/grader.md +238 -0
- package/dist/skills/agentv-bench/agents/mutator.md +172 -0
- package/dist/skills/agentv-bench/references/autoresearch.md +309 -0
- package/dist/skills/agentv-bench/references/description-optimization.md +66 -0
- package/dist/skills/agentv-bench/references/environment-adaptation.md +82 -0
- package/dist/skills/agentv-bench/references/eval-yaml-spec.md +338 -0
- package/dist/skills/agentv-bench/references/migrating-from-skill-creator.md +103 -0
- package/dist/skills/agentv-bench/references/schemas.md +432 -0
- package/dist/skills/agentv-bench/references/subagent-pipeline.md +181 -0
- package/dist/skills/agentv-bench/scripts/trajectory.html +462 -0
- package/dist/skills/agentv-eval-review/SKILL.md +53 -0
- package/dist/skills/agentv-eval-review/scripts/lint_eval.py +239 -0
- package/dist/skills/agentv-eval-writer/SKILL.md +707 -0
- package/dist/skills/agentv-eval-writer/references/config-schema.json +63 -0
- package/dist/skills/agentv-eval-writer/references/custom-evaluators.md +119 -0
- package/dist/skills/agentv-eval-writer/references/eval-schema.json +19077 -0
- package/dist/skills/agentv-eval-writer/references/rubric-evaluator.md +114 -0
- package/dist/skills/agentv-governance/SKILL.md +79 -0
- package/dist/skills/agentv-governance/references/eu-ai-act-risk-tiers.md +37 -0
- package/dist/skills/agentv-governance/references/governance-yaml-shape.md +125 -0
- package/dist/skills/agentv-governance/references/iso-42001-controls.md +46 -0
- package/dist/skills/agentv-governance/references/lint-rules.md +169 -0
- package/dist/skills/agentv-governance/references/mitre-atlas.md +38 -0
- package/dist/skills/agentv-governance/references/owasp-agentic-top-10-2025.md +28 -0
- package/dist/skills/agentv-governance/references/owasp-llm-top-10-2025.md +25 -0
- package/dist/skills/agentv-trace-analyst/SKILL.md +161 -0
- package/package.json +1 -1
- package/dist/chunk-JA4WQNE6.js.map +0 -1
- package/dist/chunk-XBUHMRX2.js.map +0 -1
- package/dist/{interactive-YMKWKPD7.js.map → interactive-L6PIIFNQ.js.map} +0 -0
@@ -0,0 +1,114 @@
# Rubric Grader

Rubrics are defined as `assertions` entries with `type: rubrics`. They support binary checklist grading and score-range analytic grading.

## Field Reference

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `type` | string | required | Must be `rubrics` |
| `criteria` | array | required | List of criterion strings or objects |
| `required` | boolean or number | - | Gate: `true` requires score >= 0.8; a number (0–1) sets a custom threshold |

### Criterion Object Fields

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `id` | string | auto-generated | Unique identifier |
| `outcome` | string | required* | Criterion being evaluated (*optional if `score_ranges` used) |
| `weight` | number | 1.0 | Relative importance |
| `required` | boolean | true | Failing forces verdict to 'fail' (checklist mode) |
| `min_score` | number | - | Minimum score (0–1) to pass this criterion |
| `required_min_score` | integer | - | **Deprecated.** Use `min_score` instead. Legacy 0–10 scale. |
| `score_ranges` | map or array | - | Score range definitions for analytic scoring |

## String Shorthand (Recommended)

Plain strings in `assertions` are automatically treated as rubric criteria:

```yaml
assertions:
  - Mentions divide-and-conquer approach
  - Explains partition step
  - States time complexity
```

Equivalent to the full form with `type: rubrics`. Use the full form only when you need weights, `required: false`, or `score_ranges`.

Mixed strings and objects are supported in `assertions` — strings are grouped into a single rubrics grader at the position of the first string:

```yaml
assertions:
  - Mentions divide-and-conquer approach   # grouped into rubrics
  - type: code-grader
    command: [check_syntax.py]
  - States time complexity                 # grouped into rubrics
```
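
The grouping rule for mixed assertions can be illustrated with a minimal Python sketch. This is not AgentV's implementation: the function name `group_string_assertions` is hypothetical, and the sketch assumes assertions arrive as an already-parsed list of strings and dicts. It does not cover what happens when an explicit `type: rubrics` entry is also present.

```python
def group_string_assertions(assertions):
    """Collapse plain-string assertions into one rubrics entry.

    The combined entry takes the position of the first string;
    object assertions keep their relative order.
    """
    grouped = []
    rubric = None  # the single rubrics entry, created on the first string
    for item in assertions:
        if isinstance(item, str):
            if rubric is None:
                rubric = {"type": "rubrics", "criteria": []}
                grouped.append(rubric)  # lands where the first string was
            rubric["criteria"].append(item)
        else:
            grouped.append(item)
    return grouped


mixed = [
    "Mentions divide-and-conquer approach",
    {"type": "code-grader", "command": ["check_syntax.py"]},
    "States time complexity",
]
result = group_string_assertions(mixed)
```

Here `result` holds two entries: the rubrics grader with both string criteria, followed by the `code-grader` object.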

## Checklist Mode

```yaml
assertions:
  - type: rubrics
    criteria:
      - Mentions divide-and-conquer approach
      - id: complexity
        outcome: States time complexity correctly
        weight: 2.0
        required: true
      - id: examples
        outcome: Includes code examples
        weight: 1.0
        required: false
```

## Score-Range Mode

Shorthand map format (recommended):

```yaml
assertions:
  - type: rubrics
    criteria:
      - id: correctness
        weight: 2.0
        min_score: 0.7
        score_ranges:
          0: Critical bugs
          3: Minor bugs
          6: Correct with minor issues
          9: Fully correct
```

Map keys are lower bounds (0-10). Each range extends from its key to (next key - 1), with the last extending to 10. Must start at 0.

Array format is also accepted:

```yaml
score_ranges:
  - score_range: [0, 2]
    outcome: Critical bugs
  - score_range: [3, 5]
    outcome: Minor bugs
  - score_range: [6, 8]
    outcome: Correct with minor issues
  - score_range: [9, 10]
    outcome: Fully correct
```

Ranges must be integers 0-10, non-overlapping, covering all values 0-10.
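
The relationship between the two `score_ranges` formats can be sketched in Python. This is an illustration under stated assumptions, not the package's own code: `expand_score_ranges` and `validate_score_ranges` are hypothetical names, and the input is assumed to be already parsed from YAML with integer keys.

```python
def expand_score_ranges(shorthand):
    """Turn {lower_bound: outcome} into explicit [low, high] range entries."""
    keys = list(shorthand)
    if any(not isinstance(k, int) or not 0 <= k <= 10 for k in keys):
        raise ValueError("keys must be integers in 0-10")
    keys.sort()
    if not keys or keys[0] != 0:
        raise ValueError("score_ranges must start at 0")
    ranges = []
    for index, low in enumerate(keys):
        # Each range ends one below the next key; the last one ends at 10.
        high = keys[index + 1] - 1 if index + 1 < len(keys) else 10
        ranges.append({"score_range": [low, high], "outcome": shorthand[low]})
    return ranges


def validate_score_ranges(ranges):
    """Check that array-form ranges are integer, non-overlapping, and cover 0-10."""
    covered = []
    for entry in ranges:
        low, high = entry["score_range"]
        if not (isinstance(low, int) and isinstance(high, int)):
            return False
        if not 0 <= low <= high <= 10:
            return False
        covered.extend(range(low, high + 1))
    # Exactly one occurrence of each value 0..10 means no gaps and no overlaps.
    return sorted(covered) == list(range(11))


expanded = expand_score_ranges({
    0: "Critical bugs",
    3: "Minor bugs",
    6: "Correct with minor issues",
    9: "Fully correct",
})
```

Expanding the shorthand map from the example yields the same four ranges as the array form, `[0, 2]`, `[3, 5]`, `[6, 8]`, and `[9, 10]`.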

## Scoring

**Checklist:** `score = sum(satisfied weights) / sum(all weights)`

**Score-range:** `score = weighted_average(raw_score / 10)` across criteria, using each criterion's `weight`

## Verdicts

| Verdict | Condition |
|---------|-----------|
| `pass` | score >= 0.8 AND all gating criteria satisfied |
| `fail` | score < 0.8 OR any gating criterion failed |

Gating: checklist uses `required: true`, score-range uses `min_score: N` (0–1 scale).
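
The scoring formulas and verdict rules can be worked through in a short Python sketch. The field names `satisfied` and `raw_score` are assumptions standing in for whatever the grader reports per criterion, and the helper names are hypothetical. The 0.8 threshold comes from the verdict table; the `threshold` parameter is there only because the assertion-level `required` field can set a custom value.

```python
def checklist_score(criteria):
    """score = sum(satisfied weights) / sum(all weights)."""
    total = sum(c.get("weight", 1.0) for c in criteria)
    earned = sum(c.get("weight", 1.0) for c in criteria if c["satisfied"])
    return earned / total if total else 0.0


def score_range_score(criteria):
    """Weighted average of raw_score / 10 across criteria."""
    total = sum(c.get("weight", 1.0) for c in criteria)
    earned = sum(c.get("weight", 1.0) * c["raw_score"] / 10 for c in criteria)
    return earned / total if total else 0.0


def checklist_gates_ok(criteria):
    # `required` defaults to true, so only an explicit false opts out.
    return all(c["satisfied"] for c in criteria if c.get("required", True))


def score_range_gates_ok(criteria):
    # `min_score` is on the 0-1 scale; raw scores are on the 0-10 scale.
    return all(c["raw_score"] / 10 >= c["min_score"]
               for c in criteria if "min_score" in c)


def verdict(score, gates_ok, threshold=0.8):
    return "pass" if score >= threshold and gates_ok else "fail"


checklist = [
    {"weight": 1.0, "satisfied": True},
    {"weight": 2.0, "satisfied": True, "required": True},
    {"weight": 1.0, "satisfied": False, "required": False},
]
checklist_result = verdict(checklist_score(checklist), checklist_gates_ok(checklist))

ranged = [
    {"weight": 2.0, "raw_score": 10, "min_score": 0.7},
    {"weight": 1.0, "raw_score": 7},
]
ranged_result = verdict(score_range_score(ranged), score_range_gates_ok(ranged))
```

In the checklist case no gate fails, but the score is 3/4 = 0.75, so the verdict is `fail`. In the score-range case the weighted average is (2 × 1.0 + 1 × 0.7) / 3 = 0.9 and the `min_score` gate holds, so the verdict is `pass`.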
@@ -0,0 +1,79 @@
---
name: agentv-governance
description: >-
  Author, edit, and lint `governance:` blocks in `*.eval.yaml` files.
  Use when creating or updating evaluation suites that carry AI-governance metadata
  (OWASP LLM Top 10, OWASP Agentic Top 10, MITRE ATLAS, EU AI Act, ISO 42001).
  Also use non-interactively (e.g., from a GitHub Action) to lint changed eval files
  and report violations against the rules in `references/lint-rules.md`.
  Do NOT use for running evals or benchmarking — that belongs to agentv-bench.
---

# AgentV Compliance Skill

Teaches AI agents how to author syntactically correct `governance:` blocks in AgentV
eval files, and how to lint them against known vocabulary rules.

## Dual mode

**Authoring (interactive):** When a human or AI agent is editing a `*.eval.yaml` file
that contains or should contain a `governance:` block, this skill provides vocabulary,
valid values, and example shapes. Load it alongside `agentv-eval-writer` when building
red-team or compliance suites.

**Linting (non-interactive / CI):** When invoked from a GitHub Action (see
`examples/governance/compliance-lint/`), this skill lints each changed `*.eval.yaml` file
against the rules in `references/lint-rules.md` and returns a structured JSON report.
The expected output format is:
```json
{
  "pass": false,
  "violations": [
    {
      "rule": "known_key",
      "key": "risk_level",
      "value": "high",
      "message": "Unknown governance key 'risk_level'. Did you mean 'risk_tier'?",
      "suggestion": "Replace 'risk_level' with 'risk_tier'."
    }
  ]
}
```
`pass` is `true` when `violations` is empty.

## Reference files

| File | Purpose |
|------|---------|
| `references/governance-yaml-shape.md` | YAML shape, merge semantics, worked examples |
| `references/lint-rules.md` | Machine-readable rules applied during lint |
| `references/owasp-llm-top-10-2025.md` | LLM01–LLM10 canonical IDs and descriptions |
| `references/owasp-agentic-top-10-2025.md` | T01–T10 agentic-AI categories |
| `references/mitre-atlas.md` | Common AML.Txxxx technique IDs |
| `references/eu-ai-act-risk-tiers.md` | Four risk tiers + article references |
| `references/iso-42001-controls.md` | Curated ISO/IEC 42001:2023 controls for AI eval |

## Quick authoring guide

1. Check which risks this eval exercises using the reference files above.
2. Pick IDs from the relevant frameworks (`owasp_llm_top_10_2025`, `mitre_atlas`, etc.).
3. Set `risk_tier` using EU AI Act vocabulary (`prohibited | high | limited | minimal`).
4. Add `controls` as `<FRAMEWORK>-<VERSION>:<ID>` strings (e.g. `EU-AI-ACT-2024:Art.55`).
5. Run the lint rules from `references/lint-rules.md` against your block before committing.
6. See `references/governance-yaml-shape.md` for complete examples copied from real suites.

## Accessing reference files

To load a specific reference without pulling the entire skill into context:

```bash
agentv skills get agentv-governance --ref lint-rules
```

Or resolve the skill directory and read files directly:

```bash
cat $(agentv skills path agentv-governance)/references/lint-rules.md
```

Use `--full` to retrieve every framework reference in one shot.
@@ -0,0 +1,37 @@
# EU AI Act — Risk Tiers

**Valid values for the `risk_tier:` field.**

Official source: Regulation (EU) 2024/1689 on Artificial Intelligence (EU AI Act)
Full text: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32024R1689

## Allowed values

| Value | EU AI Act category | Key articles | Description |
|-------|-------------------|-------------|-------------|
| `prohibited` | Prohibited AI practices | Art. 5 | AI systems whose risks are deemed unacceptable — banned outright. Examples: social scoring by public authorities, real-time remote biometric surveillance in public spaces, AI that exploits vulnerabilities of specific groups. |
| `high` | High-risk AI systems | Art. 6, Annexes I and III | AI systems subject to mandatory conformity assessments, transparency, and human oversight. Examples: biometric identification, critical infrastructure, employment screening, access to education or essential services, law enforcement. |
| `limited` | Limited-risk AI systems | Art. 50 | AI systems with transparency obligations only. Examples: chatbots must disclose they are AI; deep-fake generators must mark synthetic media. |
| `minimal` | Minimal-risk AI systems | — | No mandatory obligations. Examples: spam filters, AI in video games. Voluntary codes of conduct encouraged. |

## Usage notes

- `risk_tier` is a scalar; only one value per governance block.
- The vocabulary is anchored to EU AI Act terminology. Some organizations use different
  risk scales (e.g. NIST SP 800-30 `low | moderate | high | very_high`). When mapping
  from another framework, choose the EU AI Act equivalent that best matches the impact.
- Combine `risk_tier: high` with `controls` referencing EU AI Act articles:
  ```yaml
  risk_tier: high
  controls:
    - EU-AI-ACT-2024:Art.55
    - EU-AI-ACT-2024:Art.6
  ```
- `prohibited` tier should accompany test cases that specifically probe prohibited behaviors.
  This does NOT mean the eval suite is itself prohibited — it means the suite tests whether
  the system correctly refuses to engage in prohibited behaviors.

## Article reference format

Use `EU-AI-ACT-2024:<Article>` in the `controls` array, e.g. `EU-AI-ACT-2024:Art.55`.
Article 55 covers obligations for providers of general-purpose AI (GPAI) models with systemic risk.
@@ -0,0 +1,125 @@
# Governance Block — YAML Shape and Examples

## Field reference

```yaml
governance:
  schema_version: "1.0"                   # string, optional — version of this block's schema
  owasp_llm_top_10_2025: [LLM01]          # string[], optional — OWASP LLM Top 10 v2025 IDs
  owasp_agentic_top_10_2025: [T01, T06]   # string[], optional — OWASP Agentic AI Top 10 v2025 IDs
  mitre_atlas: [AML.T0051]                # string[], optional — MITRE ATLAS technique IDs
  controls: []                            # string[], optional — <FRAMEWORK>-<VERSION>:<ID> strings
  risk_tier: high                         # string, optional — EU AI Act tier (see eu-ai-act-risk-tiers.md)
  owner: security-team                    # string, optional — owning team or person
```

All fields are optional. Unknown keys pass through to JSONL output unchanged.

## Control ID format

The `controls` array accepts any string matching the pattern `<FRAMEWORK>-<VERSION>:<ID>`.
Custom organizational prefixes are valid:

```
NIST-AI-RMF-1.0:MEASURE-2.7
EU-AI-ACT-2024:Art.55
ISO-42001-2023:6.1.2
INTERNAL-AI-POLICY-3.2:CTRL-7
```

## Placement in eval files

Governance blocks live in two places and are merged automatically:

### 1. Suite-level (top-level key)

Define once at the suite level and it will be merged into every case's `metadata.governance`:

```yaml
name: redteam-llm01-prompt-injection
governance: &gov             # YAML anchor for reuse in per-case overrides
  schema_version: "1.0"
  owasp_llm_top_10_2025: [LLM01]
  mitre_atlas: [AML.T0051]
  controls:
    - NIST-AI-RMF-1.0:MEASURE-2.7
    - EU-AI-ACT-2024:Art.55
  risk_tier: high
  owner: security-team

tests:
  - id: direct-ignore-previous
    metadata:
      governance: *gov       # reference the anchor — identical to suite-level
    ...
```

### 2. Per-case override with merge-key (`<<:`)

Use YAML merge keys to inherit suite-level governance and add case-specific overrides.
Arrays from both sides are concatenated and deduplicated; scalar fields on the case win:

```yaml
- id: indirect-tool-output
  metadata:
    governance:
      <<: *gov
      owasp_llm_top_10_2025: [LLM01, LLM06]   # extends — case adds LLM06 to the inherited [LLM01]
```

## Merge semantics (how suite + case are combined)

| Field type | Merge behavior |
|-----------|----------------|
| Arrays (`owasp_llm_top_10_2025`, `mitre_atlas`, `controls`) | Concatenate suite + case, deduplicate |
| Scalars (`risk_tier`, `owner`, `schema_version`) | Case value overrides suite value |
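
The merge table can be read as a small function. The following is a minimal Python sketch of those semantics, not the engine's actual code: `merge_governance` is a hypothetical name, both blocks are assumed to be plain dicts already parsed from YAML, and any list value is treated as an array field.

```python
def merge_governance(suite, case):
    """Combine a suite-level governance block with a per-case block."""
    merged = dict(suite)
    for key, value in case.items():
        if isinstance(value, list):
            # Arrays: concatenate suite + case, then drop duplicates
            # while keeping first-seen order.
            combined = list(suite.get(key, [])) + value
            merged[key] = list(dict.fromkeys(combined))
        else:
            # Scalars: the case value overrides the suite value.
            merged[key] = value
    return merged


suite_block = {
    "schema_version": "1.0",
    "owasp_llm_top_10_2025": ["LLM01"],
    "mitre_atlas": ["AML.T0051"],
    "risk_tier": "high",
    "owner": "security-team",
}
case_block = {
    "owasp_llm_top_10_2025": ["LLM01", "LLM06"],
    "risk_tier": "limited",
}
merged_block = merge_governance(suite_block, case_block)
```

With these inputs, `merged_block["owasp_llm_top_10_2025"]` is `["LLM01", "LLM06"]`, `risk_tier` becomes `"limited"`, and `owner` and `mitre_atlas` are inherited unchanged from the suite.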

## Complete example — from `examples/red-team/suites/llm01-prompt-injection.eval.yaml`

```yaml
name: redteam-llm01-prompt-injection
governance: &gov
  schema_version: "1.0"
  owasp_llm_top_10_2025: [LLM01]
  mitre_atlas: [AML.T0051]
  controls:
    - NIST-AI-RMF-1.0:MEASURE-2.7
    - EU-AI-ACT-2024:Art.55
  risk_tier: high
  owner: security-team

tests:
  - id: direct-ignore-previous
    metadata:
      governance: *gov
    ...

  - id: indirect-tool-output-document
    metadata:
      governance:
        <<: *gov
        owasp_llm_top_10_2025: [LLM01, LLM06]   # case adds LLM06
    ...
```

## Complete example — from `examples/red-team/archetypes/coding-agent/suites/destructive-git.eval.yaml`

```yaml
name: redteam-coder-destructive-git
governance: &gov
  schema_version: "1.0"
  owasp_llm_top_10_2025: [LLM06]
  owasp_agentic_top_10_2025: [T01, T06]
  mitre_atlas: [AML.T0051, AML.T0075]
  controls:
    - NIST-AI-RMF-1.0:MEASURE-2.7
    - EU-AI-ACT-2024:Art.55
  risk_tier: high
  owner: security-team
```

## JSONL output

The merged `governance` block is passed through verbatim to the JSONL result file under each
result's `metadata.governance` key. Downstream tools (jq pipelines, `.ai-register.yaml`
aggregators) consume it from there. The eval engine does not validate or transform the values.
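
A downstream consumer of the JSONL results might look like the following Python sketch. Only the `metadata.governance` key is taken from the description above. The `test_id` field and the sample records are hypothetical, used purely to make the example self-contained.

```python
import json
from collections import Counter


def count_by_tag(jsonl_lines, field="owasp_llm_top_10_2025"):
    """Count how many results carry each ID in one governance array field."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        governance = (record.get("metadata") or {}).get("governance") or {}
        counts.update(governance.get(field, []))
    return counts


# Hypothetical records; a real result file will carry more fields.
sample = [
    '{"test_id": "direct-ignore-previous", "metadata": {"governance": {"owasp_llm_top_10_2025": ["LLM01"]}}}',
    '{"test_id": "indirect-tool-output", "metadata": {"governance": {"owasp_llm_top_10_2025": ["LLM01", "LLM06"]}}}',
    '{"test_id": "no-governance", "metadata": {}}',
]
tag_counts = count_by_tag(sample)
```

Because the engine does not validate the values, a consumer like this should treat missing or empty governance blocks as normal, which is why the lookups fall back to empty containers.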
@@ -0,0 +1,46 @@
# ISO/IEC 42001:2023 — AI Management System Controls

**Curated subset of controls relevant to AI evaluation suites.**

Official source: ISO/IEC 42001:2023 — Information technology — Artificial intelligence —
Management system. Full standard available at https://www.iso.org/standard/81230.html

ISO 42001 is a management-system standard (like ISO 27001 for information security) covering
the governance, risk management, and operational controls for organizations that develop or
deploy AI systems.

## Control reference format

Use `ISO-42001-2023:<Clause>` in the `controls` array.

## Relevant control areas for eval suites

| Clause | Title | Relevance to evals |
|--------|-------|-------------------|
| 6.1 | Actions to address risks and opportunities | Risk identification for AI systems — align `risk_tier` with documented risk assessments. |
| 6.1.2 | AI risk assessment | Formal risk assessment process; eval suites serve as evidence of risk measurement. |
| 8.4 | AI system impact assessment | Assess potential societal impacts before deployment; red-team evals provide evidence. |
| 8.5 | AI system life cycle | Controls for data, model, and deployment stages — align with suite test coverage. |
| 9.1 | Monitoring, measurement, analysis and evaluation | Periodic eval runs as evidence of continuous monitoring. |
| 9.1.1 | AI performance evaluation | Systematic measurement of AI output quality and safety properties. |
| 10.2 | Nonconformity and corrective action | Failing evals trigger corrective action processes. |
| A.2 | Policies for AI (Annex A) | Organizational AI use policies — `owner` field maps to the responsible team. |
| A.5 | AI risk assessment (Annex A) | Documented risk assessment for each AI application. |
| A.6 | AI system impact assessment (Annex A) | Broader societal-impact documentation. |

## Usage example

```yaml
controls:
  - ISO-42001-2023:6.1.2       # AI risk assessment
  - ISO-42001-2023:9.1.1       # AI performance evaluation
  - EU-AI-ACT-2024:Art.55      # GPAI systemic-risk obligations
```

## Notes

- ISO 42001 is certification-oriented; most teams will reference only a subset.
  The clauses above are the ones most directly evidenced by running and storing eval results.
- For pure LLM / red-team suites, clauses 6.1.2, 8.4, and 9.1.1 are the most common references.
- Combine with NIST AI RMF controls (e.g. `NIST-AI-RMF-1.0:MEASURE-2.7`) when the organization
  uses both frameworks.
@@ -0,0 +1,169 @@
# Governance Block Lint Rules

Rules applied when linting a `governance:` block in a `*.eval.yaml` file.
The CI Action (see `examples/governance/compliance-lint/`) passes this file to Claude
together with the governance block to extract, and returns a structured report.

## How to apply these rules

For each `governance:` block found in a changed eval file:

1. Extract the block (top-level `governance:` key, or `metadata.governance` in a test case).
2. Apply each rule below in order.
3. Collect all violations.
4. Return the structured JSON report described in `SKILL.md`.

A block with zero violations produces `{ "pass": true, "violations": [] }`.

---

## Rule 1 — known_key

**What:** Every key in the `governance:` object must be in the allowed-key list.

**Allowed keys:** `schema_version`, `owasp_llm_top_10_2025`, `owasp_agentic_top_10_2025`,
`mitre_atlas`, `controls`, `risk_tier`, `owner`

**On violation:**
```json
{
  "rule": "known_key",
  "key": "<offending-key>",
  "value": "<value>",
  "message": "Unknown governance key '<offending-key>'. Did you mean '<closest-match>'?",
  "suggestion": "Replace '<offending-key>' with '<closest-match>'."
}
```

Common typos and their corrections:
- `risk_level` → `risk_tier`
- `owasp_top_10` → `owasp_llm_top_10_2025`
- `owasp_llm` → `owasp_llm_top_10_2025`
- `atlas` → `mitre_atlas`
- `mitre` → `mitre_atlas`
- `control` (singular) → `controls`
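
Rule 1 can be sketched in Python as follows. This is an illustration rather than the lint implementation: the function name is hypothetical, the typo table is copied from the list above, and using `difflib.get_close_matches` as a fallback for unlisted typos is an assumption. The wording used when no close match exists is also invented for this sketch.

```python
import difflib

ALLOWED_KEYS = [
    "schema_version", "owasp_llm_top_10_2025", "owasp_agentic_top_10_2025",
    "mitre_atlas", "controls", "risk_tier", "owner",
]
COMMON_TYPOS = {
    "risk_level": "risk_tier",
    "owasp_top_10": "owasp_llm_top_10_2025",
    "owasp_llm": "owasp_llm_top_10_2025",
    "atlas": "mitre_atlas",
    "mitre": "mitre_atlas",
    "control": "controls",
}


def check_known_keys(governance):
    """Return one known_key violation per key outside the allowed list."""
    violations = []
    for key, value in governance.items():
        if key in ALLOWED_KEYS:
            continue
        guess = COMMON_TYPOS.get(key)
        if guess is None:
            # Assumed fallback: fuzzy-match against the allowed keys.
            close = difflib.get_close_matches(key, ALLOWED_KEYS, n=1)
            guess = close[0] if close else None
        message = f"Unknown governance key '{key}'."
        suggestion = "Remove the key or use an allowed key."  # invented wording
        if guess:
            message += f" Did you mean '{guess}'?"
            suggestion = f"Replace '{key}' with '{guess}'."
        violations.append({
            "rule": "known_key", "key": key, "value": value,
            "message": message, "suggestion": suggestion,
        })
    return violations


found = check_known_keys({"risk_level": "high", "owner": "security-team"})
```

For the input shown, `found` contains a single violation whose message and suggestion match the `risk_level` example in the report format.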

---

## Rule 2 — owasp_llm_ids

**What:** Every string in `owasp_llm_top_10_2025` must match the pattern `LLM\d{2}` and lie in the range LLM01–LLM10 (so `LLM00` and `LLM11` are rejected).

**On violation:**
```json
{
  "rule": "owasp_llm_ids",
  "key": "owasp_llm_top_10_2025",
  "value": "<offending-id>",
  "message": "Invalid OWASP LLM ID '<offending-id>'. Expected LLM01–LLM10.",
  "suggestion": "Use a valid ID from references/owasp-llm-top-10-2025.md."
}
```

---

## Rule 3 — owasp_agentic_ids

**What:** Every string in `owasp_agentic_top_10_2025` must match the pattern `T\d{2}` and lie in the range T01–T10 (so `T00` and `T11` are rejected).

**On violation:**
```json
{
  "rule": "owasp_agentic_ids",
  "key": "owasp_agentic_top_10_2025",
  "value": "<offending-id>",
  "message": "Invalid OWASP Agentic ID '<offending-id>'. Expected T01–T10.",
  "suggestion": "Use a valid ID from references/owasp-agentic-top-10-2025.md."
}
```

---

## Rule 4 — mitre_atlas_ids

**What:** Every string in `mitre_atlas` must match the pattern `AML\.T\d{4}(\.\d{3})?`.

**On violation:**
```json
{
  "rule": "mitre_atlas_ids",
  "key": "mitre_atlas",
  "value": "<offending-id>",
  "message": "Invalid MITRE ATLAS ID '<offending-id>'. Expected AML.Txxxx or AML.Txxxx.xxx.",
  "suggestion": "Check https://atlas.mitre.org/techniques/ for valid IDs."
}
```
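
Rules 2 to 4 share one shape, so they can be sketched together in Python. This is a hedged illustration with hypothetical names. The two OWASP patterns are written as `(0[1-9]|10)` so that they enforce the stated 01–10 range, which is a deliberate narrowing of `\d{2}`. `fullmatch` is used so that partial matches such as `LLM011` do not slip through.

```python
import re

# key -> (rule name, compiled pattern)
ID_RULES = {
    "owasp_llm_top_10_2025": ("owasp_llm_ids", re.compile(r"LLM(0[1-9]|10)")),
    "owasp_agentic_top_10_2025": ("owasp_agentic_ids", re.compile(r"T(0[1-9]|10)")),
    "mitre_atlas": ("mitre_atlas_ids", re.compile(r"AML\.T\d{4}(\.\d{3})?")),
}


def invalid_ids(governance):
    """Return (rule, key, offending_id) for every ID that fails its pattern."""
    problems = []
    for key, (rule, pattern) in ID_RULES.items():
        for value in governance.get(key, []):
            if not isinstance(value, str) or not pattern.fullmatch(value):
                problems.append((rule, key, value))
    return problems


bad = invalid_ids({
    "owasp_llm_top_10_2025": ["LLM01", "LLM11"],
    "owasp_agentic_top_10_2025": ["T06", "T1"],
    "mitre_atlas": ["AML.T0051", "AML.T0051.001", "AML.T51"],
})
```

Each tuple in `bad` carries what is needed to fill the `rule`, `key`, and `value` fields of the corresponding violation object.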

---

## Rule 5 — control_id_format

**What:** Every string in `controls` must match the pattern `^[A-Z0-9][A-Z0-9_-]+-[A-Z0-9._-]+:[A-Za-z0-9._-]+$`
(i.e. `<FRAMEWORK>-<VERSION>:<ID>` where all three parts are present and non-empty; the ID part may contain lowercase letters, as in `Art.55`).

Examples of valid control IDs:
- `NIST-AI-RMF-1.0:MEASURE-2.7`
- `EU-AI-ACT-2024:Art.55`
- `ISO-42001-2023:6.1.2`
- `INTERNAL-POLICY-2.1:CTRL-99`

**On violation:**
```json
{
  "rule": "control_id_format",
  "key": "controls",
  "value": "<offending-control>",
  "message": "Malformed control ID '<offending-control>'. Expected format: <FRAMEWORK>-<VERSION>:<ID>.",
  "suggestion": "Use the format <FRAMEWORK>-<VERSION>:<ID>, e.g. 'EU-AI-ACT-2024:Art.55'."
}
```
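
A quick way to sanity-check the control ID pattern is to run it over the listed examples. The Python sketch below is illustrative and uses hypothetical names. The character class for the ID part is widened to include lowercase letters, since an uppercase-only class would reject the listed example `EU-AI-ACT-2024:Art.55`. That widening is an assumption about the intended rule.

```python
import re

CONTROL_ID = re.compile(r"^[A-Z0-9][A-Z0-9_-]+-[A-Z0-9._-]+:[A-Za-z0-9._-]+$")


def malformed_controls(controls):
    """Return the entries that do not look like <FRAMEWORK>-<VERSION>:<ID>."""
    return [
        c for c in controls
        if not isinstance(c, str) or not CONTROL_ID.fullmatch(c)
    ]


candidates = [
    "NIST-AI-RMF-1.0:MEASURE-2.7",
    "EU-AI-ACT-2024:Art.55",
    "ISO-42001-2023:6.1.2",
    "INTERNAL-POLICY-2.1:CTRL-99",
    "ISO-42001-2023",          # missing ":<ID>"
    "NIST:MEASURE-2.7",        # no "-<VERSION>" before the colon
    "eu-ai-act-2024:Art.55",   # lowercase framework prefix
]
rejected = malformed_controls(candidates)
```

Note that the pattern cannot tell a hyphenated framework name from a framework plus version, so a string such as `EU-AI-ACT:Art.55` still matches. Catching a missing version would need a known-framework list, which is outside this rule.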

---

## Rule 6 — risk_tier_value

**What:** `risk_tier`, when present, must be one of:
`prohibited`, `high`, `limited`, `minimal`

**On violation:**
```json
{
  "rule": "risk_tier_value",
  "key": "risk_tier",
  "value": "<offending-value>",
  "message": "Unknown risk_tier value '<offending-value>'. Allowed: prohibited, high, limited, minimal.",
  "suggestion": "Use one of the EU AI Act risk tiers from references/eu-ai-act-risk-tiers.md."
}
```

Common mistakes:
- `high_risk` → `high`
- `limited_risk` → `limited`
- `minimal_risk` → `minimal`
- `low` → `minimal` (not an EU AI Act term)

---

## Rule 7 — array_not_empty

**What:** If a framework array key is present (`owasp_llm_top_10_2025`, `owasp_agentic_top_10_2025`,
`mitre_atlas`, `controls`), it must not be an empty array.

**On violation:**
```json
{
  "rule": "array_not_empty",
  "key": "<key>",
  "value": [],
  "message": "Empty array for '<key>'. Either populate it or remove the key.",
  "suggestion": "Add at least one ID, or remove the key entirely."
}
```

---

## Severity

All rules above are **errors** (contribute to `pass: false`). There are no warnings in this
schema — an unknown key is always wrong, and empty arrays are always wrong. This matches the
intent: the block should only be present when it contains real, validated tags.
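
Rules 6 and 7, together with the all-errors severity policy, can be sketched in Python. The function name is hypothetical and the block is assumed to be an already-parsed dict. The message strings follow the violation templates given for each rule.

```python
RISK_TIERS = ("prohibited", "high", "limited", "minimal")
ARRAY_KEYS = (
    "owasp_llm_top_10_2025", "owasp_agentic_top_10_2025",
    "mitre_atlas", "controls",
)


def lint_tier_and_arrays(governance):
    """Apply risk_tier_value and array_not_empty, then build the report."""
    violations = []
    if "risk_tier" in governance and governance["risk_tier"] not in RISK_TIERS:
        tier = governance["risk_tier"]
        violations.append({
            "rule": "risk_tier_value",
            "key": "risk_tier",
            "value": tier,
            "message": f"Unknown risk_tier value '{tier}'. "
                       "Allowed: prohibited, high, limited, minimal.",
            "suggestion": "Use one of the EU AI Act risk tiers from "
                          "references/eu-ai-act-risk-tiers.md.",
        })
    for key in ARRAY_KEYS:
        if key in governance and governance[key] == []:
            violations.append({
                "rule": "array_not_empty",
                "key": key,
                "value": [],
                "message": f"Empty array for '{key}'. "
                           "Either populate it or remove the key.",
                "suggestion": "Add at least one ID, or remove the key entirely.",
            })
    # Every rule is an error, so any violation fails the block.
    return {"pass": not violations, "violations": violations}


report = lint_tier_and_arrays({"risk_tier": "high_risk", "controls": []})
clean = lint_tier_and_arrays({"risk_tier": "high", "controls": ["ISO-42001-2023:6.1.2"]})
```

Deriving `pass` from the violation list, rather than setting it separately, keeps the report consistent with the rule that a block passes only when `violations` is empty.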
@@ -0,0 +1,38 @@
# MITRE ATLAS — AI/ML Threat Techniques

**Canonical IDs for use in `mitre_atlas:` arrays.**

Official source: https://atlas.mitre.org/

MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) documents
adversarial ML and AI attack techniques using the same taxonomy style as MITRE ATT&CK.
IDs follow the pattern `AML.Txxxx` for techniques and `AML.Txxxx.xxx` for sub-techniques.

## Techniques most relevant to LLM / agentic-AI evaluation

| ID | Name | Relevant OWASP IDs |
|----|------|-------------------|
| AML.T0051 | LLM Prompt Injection | LLM01, T01 |
| AML.T0054 | LLM Jailbreak | LLM01 |
| AML.T0056 | LLM Meta Prompt Extraction | LLM07 |
| AML.T0057 | LLM Plugin Compromise | LLM03, T09 |
| AML.T0058 | LLM Data Leakage | LLM02 |
| AML.T0068 | Training Data Poisoning | LLM04 |
| AML.T0075 | Manipulate LLM Inputs | LLM01, T01 |

## Sub-techniques

Sub-techniques extend a base ID with a period-separated suffix, e.g.:
- `AML.T0051.000` — Direct Prompt Injection
- `AML.T0051.001` — Indirect Prompt Injection

Use the base ID if the test covers the whole technique class; use sub-techniques for
more precise tagging when the attack method is specific.

## Usage notes

- List IDs as strings in an array: `mitre_atlas: [AML.T0051, AML.T0075]`
- Cross-reference with OWASP IDs when both frameworks cover the same attack:
  a suite testing indirect prompt injection via tool output should tag
  `owasp_llm_top_10_2025: [LLM01]` and `mitre_atlas: [AML.T0051]`.
- For the full technique catalog, browse https://atlas.mitre.org/techniques/
@@ -0,0 +1,28 @@
# OWASP Top 10 for Agentic AI v2025

**Canonical IDs for use in `owasp_agentic_top_10_2025:` arrays.**

Official source: https://owasp.org/www-project-top-10-for-large-language-model-applications/
(Agentic AI supplement — see the "Agentic AI" section of the OWASP LLM project)

| ID | Name | One-line description |
|----|------|----------------------|
| T01 | Prompt Injection for Agentic Systems | Attacker plants instructions in agent inputs, tool results, or retrieved content to redirect agent behavior. |
| T02 | Memory Poisoning | Adversarial content is written to agent memory (short- or long-term) to influence future decisions. |
| T03 | Data Exfiltration | Agent is manipulated into leaking sensitive data through tool calls, network requests, or outputs. |
| T04 | Privilege Escalation | Agent acquires or is tricked into using permissions beyond its intended scope. |
| T05 | Misconfigured Agent Networks | Overly permissive trust between orchestrating and sub-agents enables abuse. |
| T06 | Tool and Plugin Misuse | Agent uses legitimate tools (bash, file I/O, API calls) outside their intended purpose or without authorization. |
| T07 | Insecure Credential Storage | Agent stores or transmits credentials in memory, files, or outputs where they can be captured. |
| T08 | Unsafe Agent-to-Agent Communication | Messages between agents are unvalidated, unencrypted, or susceptible to injection. |
| T09 | Supply Chain Compromise | Malicious code in agent plugins, dependencies, or retrieved skill definitions. |
| T10 | Lack of Accountability | Agent actions are not logged or attributable, making audit and incident response impossible. |

## Usage notes

- Combine with `owasp_llm_top_10_2025` IDs for cases that bridge both lists.
  Example: an indirect-prompt-injection attack is LLM01 + T01 + T06 (tool misuse).
- `T01` (Prompt Injection) and `LLM01` (Prompt Injection) are closely related but distinct:
  LLM01 covers LLM-level injection; T01 covers the agent-orchestration dimension.
- List multiple IDs when a test case exercises more than one category:
  `owasp_agentic_top_10_2025: [T01, T06]`
@@ -0,0 +1,25 @@
# OWASP LLM Top 10 v2025

**Canonical IDs for use in `owasp_llm_top_10_2025:` arrays.**

Official source: https://owasp.org/www-project-top-10-for-large-language-model-applications/

| ID | Name | One-line description |
|----|------|----------------------|
| LLM01 | Prompt Injection | Attacker manipulates LLM behavior via crafted inputs (direct or indirect). |
| LLM02 | Sensitive Information Disclosure | LLM reveals confidential data, system prompts, or PII in its output. |
| LLM03 | Supply Chain | Compromised components — plugins, datasets, pre-trained weights — affect the LLM pipeline. |
| LLM04 | Data and Model Poisoning | Training or fine-tuning data is tampered with to alter model behavior. |
| LLM05 | Improper Output Handling | LLM output is passed unsanitized to downstream systems (XSS, SSRF, code injection). |
| LLM06 | Excessive Agency | LLM acts on permissions or capabilities beyond what the task requires. |
| LLM07 | System Prompt Leakage | The system prompt or internal context is exposed to the user or a third party. |
| LLM08 | Vector and Embedding Weaknesses | Adversarial manipulation of embedding stores used for retrieval (RAG poisoning). |
| LLM09 | Misinformation | LLM generates plausible but factually incorrect content that causes harm. |
| LLM10 | Unbounded Consumption | LLM use is abused to exhaust resources — tokens, cost, rate limits, or compute. |

## Usage notes

- Use as many IDs as apply; list them in an array: `owasp_llm_top_10_2025: [LLM01, LLM06]`
- IDs are version-anchored. When OWASP releases a new version, a new field
  (`owasp_llm_top_10_2026`) will be added rather than redefining these IDs.
- Combine with `mitre_atlas` IDs for technique-level tagging.