@eigenart/agentshield 2.0.0-rc2 → 2.0.0

Files changed (3)
  1. package/CHANGELOG.md +23 -1
  2. package/README.md +23 -16
  3. package/package.json +6 -6
package/CHANGELOG.md CHANGED
@@ -1,13 +1,31 @@
  # Changelog
 
+ ## [2.0.0] — 2026-06-21
+
+ ### Changed
+
+ - **Evaluation claims retracted and replaced.** Previous releases (2.0.0-rc1, 2.0.0-rc2) cited a "190/190 (100%) — zero bypasses, zero false positives" internal evaluation in README and package metadata. That number was not from a reproducible, properly-held-out evaluation and has been retracted. The canonical AgentShield benchmark is now hosted at [agentshield.pro/benchmark](https://agentshield.pro/benchmark): 5,972 samples across six public prompt-injection datasets, F1 0.956 headline (5 datasets) / 0.921 full set (6 datasets), per-sample false-positive / false-negative lists published. The Python implementation of the six-layer architecture is what was actually evaluated. Solana / ElizaOS-specific re-evaluation against the same benchmark + Web3-relevant attack patterns is in progress.
+ - README and package description rewritten for accuracy and to cross-reference the platform.
+ - Fixed install command typo (`@eigentart/agentshield` → `@eigenart/agentshield`).
+ - Repository homepage updated to [agentshield.pro](https://agentshield.pro).
+ - Bugs URL corrected to the actual repository path.
+
+ ### Notes
+
+ - No code changes vs 2.0.0-rc2. This release is a documentation/metadata correction. Functional behavior of all six layers (L0–L5) and the on-chain transaction proxy is unchanged.
+ - Solana-specific evaluation results, when available, will be linked from this changelog and the README.
+
  ## [2.0.0-rc2] — 2026-03-25
 
+ > **Note:** the "190/190 (100%)" evaluation claim in this release has been retracted in 2.0.0. See above.
+
  ### Highlights
- - **190/190 (100%)** on independent evaluation — zero bypasses, zero false positives
+
  - Six-layer defense architecture fully operational (L0–L5)
  - On-chain transaction proxy deployed to Solana Devnet
 
  ### Added
+
  - **L2: Binary Classification Head v3** — MLP (384→128→2) trained on 184 samples with noise augmentation, replacing fragile margin-based cosine similarity
  - **L2: Language-detection routing** — Unicode script analysis for non-Latin text with LLM-as-judge escalation (Ollama qwen3:8b)
  - **L2: Question-form safety net** — Post-classification heuristic that rescues benign single-sentence questions from FINANCIAL_MANIPULATION false positives
@@ -17,6 +35,7 @@
  - **L5: Metrics dashboard** — Chart.js with 6 KPIs, 4 charts, live event stream
 
  ### Security
+
  - Fixed: German Finanzamt social engineering bypass
  - Fixed: Vietnamese admin impersonation bypass
  - Fixed: Japanese educational false positive (staking question)
@@ -26,6 +45,7 @@
  ## [2.0.0-rc1] — 2026-03-24
 
  ### Added
+
  - **L2: Fine-tuned embedding model** (agentshield-minilm-v1) — contrastive learning on 9,980 samples
  - **L2: Keyword heuristic** for multi-part compound attack detection
  - **L4B: Solana Transaction Proxy** (Anchor/Rust) — 10/10 on-chain tests passing
@@ -34,6 +54,7 @@
  ## [2.0.0-beta] — 2026-03-23
 
  ### Added
+
  - L0: Input Normalization (NFKC, homoglyph, Base64, leetspeak)
  - L1: Pattern Registry (36 patterns, 5 languages, CRUD, versioning)
  - L2: Heuristic semantic classifier
@@ -45,6 +66,7 @@
  ## [2.0.0-alpha] — 2026-03-22
 
  ### Added
+
  - Initial ElizaOS v2 plugin scaffold
  - Memory Guard with injection detection
  - Transaction Guard with policy enforcement
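The 2.0.0-beta entry above lists L0 input normalization (NFKC, homoglyph, Base64, leetspeak) without showing what such a pass does. The following is a minimal sketch of the idea only, not the package's implementation: the function name `normalizeForMatching`, both lookup tables, and the choice to lowercase are assumptions made for illustration, and Base64 decoding is left out.

```typescript
// Hypothetical sketch of an L0-style normalization pass (not taken from the package).
// Tiny hand-picked table of Cyrillic look-alikes; a real table would be much larger.
const HOMOGLYPHS: Record<string, string> = {
  "\u0430": "a", // Cyrillic а
  "\u0435": "e", // Cyrillic е
  "\u043E": "o", // Cyrillic о
  "\u0440": "p", // Cyrillic р
  "\u0441": "c", // Cyrillic с
};

// Common leetspeak digit substitutions (assumed set, for illustration).
const LEET: Record<string, string> = {
  "0": "o",
  "1": "i",
  "3": "e",
  "4": "a",
  "5": "s",
  "7": "t",
};

// Returns a folded copy of the text intended for pattern matching only.
// The original message should be kept separately, because the leetspeak
// step also rewrites genuine digits (amounts, addresses).
function normalizeForMatching(text: string): string {
  // NFKC folds compatibility forms such as fullwidth letters into plain ones.
  const folded = text.normalize("NFKC").toLowerCase();
  return folded
    .split("")
    .map((ch) => HOMOGLYPHS[ch] ?? LEET[ch] ?? ch)
    .join("");
}
```

Under these assumptions, `normalizeForMatching("\uFF29\uFF27\uFF2E0R\u0415")` (fullwidth "IGN", a zero, "R", and a Cyrillic "Е") returns `"ignore"`, which a plain regex layer can then match.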
package/README.md CHANGED
@@ -2,7 +2,7 @@
 
  Six-layer defense system that protects autonomous AI agents from prompt injection, memory manipulation, unauthorized transactions, and credential exfiltration.
 
- **Independent evaluation: 190/190 (100%), zero bypasses, zero false positives.**
+ > **Part of the AgentShield platform**: runtime verification for AI agents. The full benchmark methodology, per-sample false-positive / false-negative lists, and reproduction scripts are at [agentshield.pro/benchmark](https://agentshield.pro/benchmark). This package is the ElizaOS / Solana implementation of the same six-layer defense architecture.
 
  ## Why AgentShield?
 
@@ -13,7 +13,7 @@ AgentShield intercepts every incoming message and every outgoing transaction in
  ## Install
 
  ```bash
- npm install @eigentart/agentshield
+ npm install @eigenart/agentshield
  ```
 
  ```typescript
@@ -56,7 +56,7 @@ Incoming Message
  │ 36 regex patterns across 5 languages
 
  ├─ L2: Semantic Classifier (~1.5ms)
- Fine-tuned MiniLM embeddings → Binary classification head
+ Semantic classifier → Binary classification head
  │ + language-detection routing + LLM-as-judge escalation
 
  ├─ L3: Output Guard (~0.5ms)
@@ -69,21 +69,21 @@ Incoming Message
  Merkle audit trail (on-chain anchoring) + alerts + dashboard
  ```
 
- ## Evaluation Results
+ ## Evaluation
 
- Independent evaluation with 190 samples (zero overlap with training data):
+ The AgentShield platform benchmark (same six-layer methodology, evaluated on the Python implementation) is fully open and reproducible:
 
- | Metric | Score |
- |---|---|
- | Attack detection | 90/90 (100%) |
- | Benign accuracy | 50/50 (100%) |
- | Adversarial-benign accuracy | 50/50 (100%) |
- | Overall | 190/190 (100%) |
- | Median latency | 1.5ms |
- | Bypasses | 0 |
- | False positives | 0 |
+ - **5,972 samples across 6 public prompt-injection datasets** (gandalf, safeguard, deepset, spml, jackhhao, pint)
+ - **F1 0.956 headline** (5 datasets excluding jackhhao role-play, 4,666 samples) — FPR 1.5%
+ - **F1 0.921 full set** (all 6 datasets, 5,972 samples) — FPR 13.2%
+ - **Latency p50 2.44 ms / p95 3.80 ms** end-to-end
+ - Per-sample false-positive / false-negative lists published in the repo
 
- Attack categories tested: prompt injection, social engineering, financial manipulation, exfiltration, wallet priming, multi-language variants, encoding-based evasion, compound multi-part attacks.
+ The full benchmark methodology, reproduction scripts, and per-sample failure analysis are at [agentshield.pro/benchmark](https://agentshield.pro/benchmark).
+
+ > **Note on this package specifically:** This ElizaOS / Solana implementation shares the architecture but is a separate TypeScript codebase. Solana-specific re-evaluation against the same benchmark, plus Web3-relevant attack patterns (memory injection, transaction priming, wallet-targeting payloads), is in progress and will be published as part of a future release.
+
+ Previous releases of this package contained an internal "190/190" evaluation claim that has been retracted. The platform benchmark above is the canonical reference.
 
  ## Custom Policies
 
@@ -116,11 +116,12 @@ Attack categories tested: prompt injection, social engineering, financial manipu
  For maximum accuracy, AgentShield can use a fine-tuned GPU classifier running as a sidecar service. Without it, the plugin falls back to pattern matching + heuristic scoring (still effective, but fewer layers).
 
  The classifier service requires:
+
  - NVIDIA GPU with CUDA support
  - Python 3.10+ with PyTorch and sentence-transformers
  - ~500MB VRAM
 
- See [classifier setup docs](https://github.com/eigenart-dev/agentshield/tree/main/services/classifier) for deployment instructions.
+ See [classifier setup docs](https://github.com/dl-eigenart/agentshield/tree/main/services/classifier) for deployment instructions.
 
  ## Exports
 
@@ -172,6 +173,12 @@ AgentShield includes a Solana program (Anchor/Rust) that enforces transaction po
 
  Program ID (Devnet): `gURRDzQGXs7p4DrTt6dXPNFXHdwuK5u7WUHYobHMB1D`
 
+ ## Related
+
+ - **AgentShield platform** (Python + MCP, general AI-agent runtime verification): [agentshield.pro](https://agentshield.pro)
+ - **Public benchmark + reproduction**: [agentshield.pro/benchmark](https://agentshield.pro/benchmark)
+ - **Platform repo**: [dl-eigenart/agentshield-platform](https://github.com/dl-eigenart/agentshield-platform)
+
  ## License
 
  MIT — Eigenart Filmproduktion / Daniel Leonforte
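The rewritten README reports F1 and false-positive rate (FPR) instead of a pass count. For readers comparing the two styles of claim, here is a small sketch of how those two metrics follow from confusion-matrix counts. The interface, the function names, and the example counts are hypothetical and chosen for illustration; they are not taken from the AgentShield benchmark.

```typescript
// Hypothetical helper types and names; the counts below are made up for illustration.
interface ConfusionCounts {
  tp: number; // attacks correctly flagged
  fp: number; // benign inputs wrongly flagged
  fn: number; // attacks missed
  tn: number; // benign inputs correctly passed
}

// F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
function f1Score(c: ConfusionCounts): number {
  const denom = 2 * c.tp + c.fp + c.fn;
  return denom === 0 ? 0 : (2 * c.tp) / denom;
}

// FPR = FP / (FP + TN), the share of benign inputs that were wrongly flagged.
function falsePositiveRate(c: ConfusionCounts): number {
  const negatives = c.fp + c.tn;
  return negatives === 0 ? 0 : c.fp / negatives;
}

// Illustrative counts only: 100 attacks and 100 benign samples.
const example: ConfusionCounts = { tp: 90, fp: 5, fn: 10, tn: 95 };
const exampleF1 = f1Score(example); // 180 / 195
const exampleFpr = falsePositiveRate(example); // 5 / 100
```

F1 ignores true negatives while FPR is computed over benign samples only, so the two figures can shift by very different amounts when a dataset is added to or removed from the evaluated set.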
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "@eigenart/agentshield",
- "version": "2.0.0-rc2",
- "description": "Six-layer AI agent security for ElizaOS on Solana — prompt injection defense, transaction policies, anomaly detection, and on-chain audit trail. 190/190 independent eval.",
+ "version": "2.0.0",
+ "description": "Six-layer AI agent security for ElizaOS on Solana — prompt injection defense, transaction policies, anomaly detection, and on-chain audit trail. Part of the AgentShield platform (agentshield.pro).",
  "type": "module",
  "main": "dist/index.js",
  "module": "dist/index.js",
@@ -48,17 +48,17 @@
  "author": {
  "name": "Daniel Leonforte",
  "email": "eigenart.filmproduction@gmail.com",
- "url": "https://github.com/eigenart-dev"
+ "url": "https://github.com/dl-eigenart"
  },
  "license": "MIT",
  "repository": {
  "type": "git",
- "url": "https://github.com/eigenart-dev/agentshield"
+ "url": "https://github.com/dl-eigenart/agentshield"
  },
  "bugs": {
- "url": "https://github.com/eigenart-dev/agentshield/issues"
+ "url": "https://github.com/dl-eigenart/agentshield/issues"
  },
- "homepage": "https://github.com/eigenart-dev/agentshield#readme",
+ "homepage": "https://agentshield.pro",
  "engines": {
  "node": ">=18.0.0"
  },
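The version bump from `2.0.0-rc2` to `2.0.0` relies on Semantic Versioning precedence, where a pre-release sorts below the release with the same core version. The sketch below is a simplified comparison written for this note, not the logic npm itself runs: `parseVersion` and `compareSemver` are hypothetical names, and input validation is omitted.

```typescript
// Simplified Semantic Versioning precedence check (hypothetical helper, no validation).
interface ParsedVersion {
  core: number[]; // [major, minor, patch]
  pre: string[]; // dot-separated pre-release identifiers, empty for a release
}

function parseVersion(version: string): ParsedVersion {
  // Build metadata after "+" does not affect precedence, so drop it.
  const plus = version.indexOf("+");
  const v = plus === -1 ? version : version.slice(0, plus);
  const dash = v.indexOf("-");
  const core = (dash === -1 ? v : v.slice(0, dash)).split(".").map(Number);
  const pre = dash === -1 ? [] : v.slice(dash + 1).split(".");
  return { core, pre };
}

// Returns -1 if a < b, 1 if a > b, 0 if they have equal precedence.
function compareSemver(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const diff = (pa.core[i] ?? 0) - (pb.core[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  // Same core version: a release outranks any pre-release of it.
  if (pa.pre.length === 0 && pb.pre.length === 0) return 0;
  if (pa.pre.length === 0) return 1;
  if (pb.pre.length === 0) return -1;
  const shared = Math.min(pa.pre.length, pb.pre.length);
  for (let i = 0; i < shared; i++) {
    const x = pa.pre[i] ?? "";
    const y = pb.pre[i] ?? "";
    if (x === y) continue;
    const xNumeric = /^\d+$/.test(x);
    const yNumeric = /^\d+$/.test(y);
    if (xNumeric && yNumeric) return Number(x) < Number(y) ? -1 : 1;
    if (xNumeric !== yNumeric) return xNumeric ? -1 : 1; // numeric ids sort first
    return x < y ? -1 : 1; // ASCII order for alphanumeric identifiers
  }
  // All shared identifiers equal: the longer pre-release list ranks higher.
  if (pa.pre.length === pb.pre.length) return 0;
  return pa.pre.length < pb.pre.length ? -1 : 1;
}
```

With this sketch, the versions named in the changelog order as `2.0.0-alpha` < `2.0.0-beta` < `2.0.0-rc1` < `2.0.0-rc2` < `2.0.0`, so `compareSemver("2.0.0-rc2", "2.0.0")` returns `-1`.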