@eigenart/agentshield 2.0.0-rc2 → 2.0.0
- package/CHANGELOG.md +23 -1
- package/README.md +23 -16
- package/package.json +6 -6
package/CHANGELOG.md
CHANGED
@@ -1,13 +1,31 @@
 # Changelog
 
+## [2.0.0] — 2026-06-21
+
+### Changed
+
+- **Evaluation claims retracted and replaced.** Previous releases (2.0.0-rc1, 2.0.0-rc2) cited a "190/190 (100%) — zero bypasses, zero false positives" internal evaluation in README and package metadata. That number was not from a reproducible, properly-held-out evaluation and has been retracted. The canonical AgentShield benchmark is now hosted at [agentshield.pro/benchmark](https://agentshield.pro/benchmark): 5,972 samples across six public prompt-injection datasets, F1 0.956 headline (5 datasets) / 0.921 full set (6 datasets), per-sample false-positive / false-negative lists published. The Python implementation of the six-layer architecture is what was actually evaluated. Solana / ElizaOS-specific re-evaluation against the same benchmark + Web3-relevant attack patterns is in progress.
+- README and package description rewritten for accuracy and to cross-reference the platform.
+- Fixed install command typo (`@eigentart/agentshield` → `@eigenart/agentshield`).
+- Repository homepage updated to [agentshield.pro](https://agentshield.pro).
+- Bugs URL corrected to the actual repository path.
+
+### Notes
+
+- No code changes vs 2.0.0-rc2. This release is a documentation/metadata correction. Functional behavior of all six layers (L0–L5) and the on-chain transaction proxy is unchanged.
+- Solana-specific evaluation results, when available, will be linked from this changelog and the README.
+
 ## [2.0.0-rc2] — 2026-03-25
 
+> **Note:** the "190/190 (100%)" evaluation claim in this release has been retracted in 2.0.0. See above.
+
 ### Highlights
-
+
 - Six-layer defense architecture fully operational (L0–L5)
 - On-chain transaction proxy deployed to Solana Devnet
 
 ### Added
+
 - **L2: Binary Classification Head v3** — MLP (384→128→2) trained on 184 samples with noise augmentation, replacing fragile margin-based cosine similarity
 - **L2: Language-detection routing** — Unicode script analysis for non-Latin text with LLM-as-judge escalation (Ollama qwen3:8b)
 - **L2: Question-form safety net** — Post-classification heuristic that rescues benign single-sentence questions from FINANCIAL_MANIPULATION false positives
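Note on the "Binary Classification Head v3" entry in the hunk above: the following is a minimal TypeScript sketch of a forward pass through an MLP with the stated 384→128→2 shape. Only the layer sizes come from the changelog. The ReLU activation, the softmax output, the class order, and every name used here (`Head`, `classify`, `zeros`, `placeholderHead`) are assumptions made for illustration, and the weights are placeholders, not the trained model.

```typescript
// Hypothetical sketch of a 384 -> 128 -> 2 classification head.
// Activation (ReLU), output (softmax) and class order are assumptions.
type Matrix = number[][];

interface Head {
  w1: Matrix;   // 128 rows x 384 columns
  b1: number[]; // 128 biases
  w2: Matrix;   // 2 rows x 128 columns
  b2: number[]; // 2 biases
}

// Fully connected layer: out[i] = b[i] + sum_j w[i][j] * x[j]
function dense(x: number[], w: Matrix, b: number[]): number[] {
  return w.map((row, i) => row.reduce((sum, wij, j) => sum + wij * x[j], b[i]));
}

function relu(v: number[]): number[] {
  return v.map((a) => Math.max(0, a));
}

// Numerically stable softmax: subtract the maximum before exponentiating.
function softmax(v: number[]): number[] {
  const max = Math.max(...v);
  const exps = v.map((a) => Math.exp(a - max));
  const total = exps.reduce((s, e) => s + e, 0);
  return exps.map((e) => e / total);
}

// Returns two class probabilities; index 0 = benign, index 1 = malicious (assumed order).
function classify(embedding: number[], head: Head): number[] {
  if (embedding.length !== 384) {
    throw new Error(`expected a 384-dimensional embedding, got ${embedding.length}`);
  }
  const hidden = relu(dense(embedding, head.w1, head.b1));
  return softmax(dense(hidden, head.w2, head.b2));
}

function zeros(rows: number, cols: number): Matrix {
  return Array.from({ length: rows }, () => new Array<number>(cols).fill(0));
}

// Placeholder weights only; a real head would load trained parameters.
const placeholderHead: Head = {
  w1: zeros(128, 384),
  b1: new Array<number>(128).fill(0),
  w2: zeros(2, 128),
  b2: [0, 0],
};

const probs = classify(new Array<number>(384).fill(0.1), placeholderHead);
console.log(probs);
```

With all-zero placeholder weights both logits are zero, so the two probabilities are equal. A head of this size has roughly 49k parameters, which is small next to a sentence-embedding model.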
@@ -17,6 +35,7 @@
 - **L5: Metrics dashboard** — Chart.js with 6 KPIs, 4 charts, live event stream
 
 ### Security
+
 - Fixed: German Finanzamt social engineering bypass
 - Fixed: Vietnamese admin impersonation bypass
 - Fixed: Japanese educational false positive (staking question)
@@ -26,6 +45,7 @@
 ## [2.0.0-rc1] — 2026-03-24
 
 ### Added
+
 - **L2: Fine-tuned embedding model** (agentshield-minilm-v1) — contrastive learning on 9,980 samples
 - **L2: Keyword heuristic** for multi-part compound attack detection
 - **L4B: Solana Transaction Proxy** (Anchor/Rust) — 10/10 on-chain tests passing
@@ -34,6 +54,7 @@
 ## [2.0.0-beta] — 2026-03-23
 
 ### Added
+
 - L0: Input Normalization (NFKC, homoglyph, Base64, leetspeak)
 - L1: Pattern Registry (36 patterns, 5 languages, CRUD, versioning)
 - L2: Heuristic semantic classifier
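Note on the "L0: Input Normalization" entry in the hunk above: the sketch below shows, in TypeScript, one way three of the four listed steps (NFKC, homoglyph folding, leetspeak folding) could be chained before pattern matching. It is not taken from the package. The function names, the tiny lookup tables, and the order of the steps are all assumptions, and Base64 decoding is left out to keep the example short.

```typescript
// Hypothetical L0-style normalizer; tables and step order are illustrative assumptions.

// A few Cyrillic letters that look like Latin ones (far from complete).
const HOMOGLYPHS: Record<string, string> = {
  "\u0430": "a", // Cyrillic a
  "\u0435": "e", // Cyrillic ie
  "\u043E": "o", // Cyrillic o
  "\u0440": "p", // Cyrillic er
  "\u0441": "c", // Cyrillic es
};

// Common leetspeak substitutions (also illustrative).
const LEET: Record<string, string> = {
  "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s",
};

// Replace each character found in the table, leave the rest untouched.
function mapChars(text: string, table: Record<string, string>): string {
  return text.split("").map((ch) => table[ch] ?? ch).join("");
}

function normalizeInput(raw: string): string {
  const compat = raw.normalize("NFKC");        // fold fullwidth and other compatibility forms
  const latin = mapChars(compat, HOMOGLYPHS);  // fold look-alike letters to ASCII
  const lower = latin.toLowerCase();
  return mapChars(lower, LEET);                // fold leetspeak digits and symbols
}

// Fullwidth "I", leetspeak digits, and a Cyrillic "a" in one input.
const normalized = normalizeInput("\uFF29gn0r3 \u0430ll");
console.log(normalized);
```

A blanket leetspeak table like this one also rewrites genuine numbers (an amount such as "100" would become "ioo"), so a real pipeline would likely match patterns against both the raw and the folded text instead of replacing one with the other.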
@@ -45,6 +66,7 @@
 ## [2.0.0-alpha] — 2026-03-22
 
 ### Added
+
 - Initial ElizaOS v2 plugin scaffold
 - Memory Guard with injection detection
 - Transaction Guard with policy enforcement
package/README.md
CHANGED
@@ -2,7 +2,7 @@
 
 Six-layer defense system that protects autonomous AI agents from prompt injection, memory manipulation, unauthorized transactions, and credential exfiltration.
 
-**
+> **Part of the AgentShield platform** — runtime verification for AI agents. The full benchmark methodology, per-sample false-positive / false-negative lists, and reproduction scripts are at [agentshield.pro/benchmark](https://agentshield.pro/benchmark). This package is the ElizaOS / Solana implementation of the same six-layer defense architecture.
 
 ## Why AgentShield?
 
@@ -13,7 +13,7 @@ AgentShield intercepts every incoming message and every outgoing transaction in
 ## Install
 
 ```bash
-npm install @
+npm install @eigenart/agentshield
 ```
 
 ```typescript
@@ -56,7 +56,7 @@ Incoming Message
 │ 36 regex patterns across 5 languages
 │
 ├─ L2: Semantic Classifier (~1.5ms)
-│
+│ Semantic classifier → Binary classification head
 │ + language-detection routing + LLM-as-judge escalation
 │
 ├─ L3: Output Guard (~0.5ms)
@@ -69,21 +69,21 @@ Incoming Message
 Merkle audit trail (on-chain anchoring) + alerts + dashboard
 ```
 
-## Evaluation
+## Evaluation
 
-
+The AgentShield platform benchmark — same six-layer methodology, evaluated on the Python implementation — is fully open and reproducible:
 
-
-
-
-
-
-| Overall | 190/190 (100%) |
-| Median latency | 1.5ms |
-| Bypasses | 0 |
-| False positives | 0 |
+- **5,972 samples across 6 public prompt-injection datasets** (gandalf, safeguard, deepset, spml, jackhhao, pint)
+- **F1 0.956 headline** (5 datasets excluding jackhhao role-play, 4,666 samples) — FPR 1.5%
+- **F1 0.921 full set** (all 6 datasets, 5,972 samples) — FPR 13.2%
+- **Latency p50 2.44 ms / p95 3.80 ms** end-to-end
+- Per-sample false-positive / false-negative lists published in the repo
 
-
+The full benchmark methodology, reproduction scripts, and per-sample failure analysis are at [agentshield.pro/benchmark](https://agentshield.pro/benchmark).
+
+> **Note on this package specifically:** This ElizaOS / Solana implementation shares the architecture but is a separate TypeScript codebase. Solana-specific re-evaluation against the same benchmark, plus Web3-relevant attack patterns (memory injection, transaction priming, wallet-targeting payloads), is in progress and will be published as part of a future release.
+
+Previous releases of this package contained an internal "190/190" evaluation claim that has been retracted. The platform benchmark above is the canonical reference.
 
 ## Custom Policies
 
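Note on the F1 and FPR figures added in the hunk above: the sketch below shows how both are conventionally derived from a confusion matrix, which is the standard definition and not code from the benchmark. The counts in the example are invented purely to exercise the formulas and are not the benchmark's data; `ConfusionCounts` and `computeMetrics` are hypothetical names.

```typescript
// Standard binary-classification metrics; names and example counts are hypothetical.
interface ConfusionCounts {
  tp: number; // attacks correctly flagged
  fp: number; // benign inputs wrongly flagged
  tn: number; // benign inputs correctly passed
  fn: number; // attacks missed
}

interface Metrics {
  precision: number;
  recall: number;
  f1: number;
  fpr: number;
}

// Returns 0 for a ratio whose denominator is 0 instead of NaN.
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

function computeMetrics(c: ConfusionCounts): Metrics {
  const precision = ratio(c.tp, c.tp + c.fp);
  const recall = ratio(c.tp, c.tp + c.fn);
  const f1 = ratio(2 * precision * recall, precision + recall);
  const fpr = ratio(c.fp, c.fp + c.tn); // share of benign inputs that were flagged
  return { precision, recall, f1, fpr };
}

// Invented counts, chosen only so the arithmetic is easy to follow.
const example = computeMetrics({ tp: 90, fp: 10, tn: 890, fn: 10 });
console.log(example);
```

F1 only involves true positives, false positives, and false negatives, while FPR is measured against the benign samples alone. That is why the two figures can move by very different amounts when a dataset with a different benign/attack mix is added to the evaluation set.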
@@ -116,11 +116,12 @@ Attack categories tested: prompt injection, social engineering, financial manipu
 For maximum accuracy, AgentShield can use a fine-tuned GPU classifier running as a sidecar service. Without it, the plugin falls back to pattern matching + heuristic scoring (still effective, but fewer layers).
 
 The classifier service requires:
+
 - NVIDIA GPU with CUDA support
 - Python 3.10+ with PyTorch and sentence-transformers
 - ~500MB VRAM
 
-See [classifier setup docs](https://github.com/eigenart
+See [classifier setup docs](https://github.com/dl-eigenart/agentshield/tree/main/services/classifier) for deployment instructions.
 
 ## Exports
 
@@ -172,6 +173,12 @@ AgentShield includes a Solana program (Anchor/Rust) that enforces transaction po
 
 Program ID (Devnet): `gURRDzQGXs7p4DrTt6dXPNFXHdwuK5u7WUHYobHMB1D`
 
+## Related
+
+- **AgentShield platform** (Python + MCP, general AI-agent runtime verification): [agentshield.pro](https://agentshield.pro)
+- **Public benchmark + reproduction**: [agentshield.pro/benchmark](https://agentshield.pro/benchmark)
+- **Platform repo**: [dl-eigenart/agentshield-platform](https://github.com/dl-eigenart/agentshield-platform)
+
 ## License
 
 MIT — Eigenart Filmproduktion / Daniel Leonforte
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "name": "@eigenart/agentshield",
-"version": "2.0.0
-"description": "Six-layer AI agent security for ElizaOS on Solana — prompt injection defense, transaction policies, anomaly detection, and on-chain audit trail.
+"version": "2.0.0",
+"description": "Six-layer AI agent security for ElizaOS on Solana — prompt injection defense, transaction policies, anomaly detection, and on-chain audit trail. Part of the AgentShield platform (agentshield.pro).",
 "type": "module",
 "main": "dist/index.js",
 "module": "dist/index.js",
@@ -48,17 +48,17 @@
 "author": {
 "name": "Daniel Leonforte",
 "email": "eigenart.filmproduction@gmail.com",
-"url": "https://github.com/eigenart
+"url": "https://github.com/dl-eigenart"
 },
 "license": "MIT",
 "repository": {
 "type": "git",
-"url": "https://github.com/eigenart
+"url": "https://github.com/dl-eigenart/agentshield"
 },
 "bugs": {
-"url": "https://github.com/eigenart
+"url": "https://github.com/dl-eigenart/agentshield/issues"
 },
-"homepage": "https://
+"homepage": "https://agentshield.pro",
 "engines": {
 "node": ">=18.0.0"
 },