npm - @eigenart/agentshield - Versions diffs - 2.0.0-rc2 → 2.0.0 - Mend

@eigenart/agentshield 2.0.0-rc2 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/CHANGELOG.md CHANGED

@@ -1,13 +1,31 @@
 # Changelog
+## [2.0.0] — 2026-06-21
+### Changed
+- **Evaluation claims retracted and replaced.** Previous releases (2.0.0-rc1, 2.0.0-rc2) cited a "190/190 (100%) — zero bypasses, zero false positives" internal evaluation in README and package metadata. That number was not from a reproducible, properly-held-out evaluation and has been retracted. The canonical AgentShield benchmark is now hosted at [agentshield.pro/benchmark](https://agentshield.pro/benchmark): 5,972 samples across six public prompt-injection datasets, F1 0.956 headline (5 datasets) / 0.921 full set (6 datasets), per-sample false-positive / false-negative lists published. The Python implementation of the six-layer architecture is what was actually evaluated. Solana / ElizaOS-specific re-evaluation against the same benchmark + Web3-relevant attack patterns is in progress.
+- README and package description rewritten for accuracy and to cross-reference the platform.
+- Fixed install command typo (`@eigentart/agentshield` → `@eigenart/agentshield`).
+- Repository homepage updated to [agentshield.pro](https://agentshield.pro).
+- Bugs URL corrected to the actual repository path.
+### Notes
+- No code changes vs 2.0.0-rc2. This release is a documentation/metadata correction. Functional behavior of all six layers (L0–L5) and the on-chain transaction proxy is unchanged.
+- Solana-specific evaluation results, when available, will be linked from this changelog and the README.
 ## [2.0.0-rc2] — 2026-03-25
+> **Note:** the "190/190 (100%)" evaluation claim in this release has been retracted in 2.0.0. See above.
 ### Highlights
-- **190/190 (100%)** on independent evaluation — zero bypasses, zero false positives
 - Six-layer defense architecture fully operational (L0–L5)
 - On-chain transaction proxy deployed to Solana Devnet
 ### Added
 - **L2: Binary Classification Head v3** — MLP (384→128→2) trained on 184 samples with noise augmentation, replacing fragile margin-based cosine similarity
 - **L2: Language-detection routing** — Unicode script analysis for non-Latin text with LLM-as-judge escalation (Ollama qwen3:8b)
 - **L2: Question-form safety net** — Post-classification heuristic that rescues benign single-sentence questions from FINANCIAL_MANIPULATION false positives
@@ -17,6 +35,7 @@
 - **L5: Metrics dashboard** — Chart.js with 6 KPIs, 4 charts, live event stream
 ### Security
 - Fixed: German Finanzamt social engineering bypass
 - Fixed: Vietnamese admin impersonation bypass
 - Fixed: Japanese educational false positive (staking question)
@@ -26,6 +45,7 @@
 ## [2.0.0-rc1] — 2026-03-24
 ### Added
 - **L2: Fine-tuned embedding model** (agentshield-minilm-v1) — contrastive learning on 9,980 samples
 - **L2: Keyword heuristic** for multi-part compound attack detection
 - **L4B: Solana Transaction Proxy** (Anchor/Rust) — 10/10 on-chain tests passing
@@ -34,6 +54,7 @@
 ## [2.0.0-beta] — 2026-03-23
 ### Added
 - L0: Input Normalization (NFKC, homoglyph, Base64, leetspeak)
 - L1: Pattern Registry (36 patterns, 5 languages, CRUD, versioning)
 - L2: Heuristic semantic classifier
@@ -45,6 +66,7 @@
 ## [2.0.0-alpha] — 2026-03-22
 ### Added
 - Initial ElizaOS v2 plugin scaffold
 - Memory Guard with injection detection
 - Transaction Guard with policy enforcement

package/README.md CHANGED

@@ -2,7 +2,7 @@
 Six-layer defense system that protects autonomous AI agents from prompt injection, memory manipulation, unauthorized transactions, and credential exfiltration.
-**Independent evaluation: 190/190 (100%) — zero bypasses, zero false positives.**
+> **Part of the AgentShield platform** — runtime verification for AI agents. The full benchmark methodology, per-sample false-positive / false-negative lists, and reproduction scripts are at [agentshield.pro/benchmark](https://agentshield.pro/benchmark). This package is the ElizaOS / Solana implementation of the same six-layer defense architecture.
 ## Why AgentShield?
@@ -13,7 +13,7 @@ AgentShield intercepts every incoming message and every outgoing transaction in
 ## Install
 ```bash
-npm install @eigentart/agentshield
+npm install @eigenart/agentshield
 ```
 ```typescript
@@ -56,7 +56,7 @@ Incoming Message
   │   36 regex patterns across 5 languages
   │
   ├─ L2: Semantic Classifier         (~1.5ms)
-  │   Fine-tuned MiniLM embeddings → Binary classification head
+  │   Semantic classifier → Binary classification head
   │   + language-detection routing + LLM-as-judge escalation
   │
   ├─ L3: Output Guard               (~0.5ms)
@@ -69,21 +69,21 @@ Incoming Message
       Merkle audit trail (on-chain anchoring) + alerts + dashboard
 ```
-## Evaluation Results
+## Evaluation
-Independent evaluation with 190 samples (zero overlap with training data):
+The AgentShield platform benchmark — same six-layer methodology, evaluated on the Python implementation — is fully open and reproducible:
-| Metric | Score |
-|---|---|
-| Attack detection | 90/90 (100%) |
-| Benign accuracy | 50/50 (100%) |
-| Adversarial-benign accuracy | 50/50 (100%) |
-| Overall | 190/190 (100%) |
-| Median latency | 1.5ms |
-| Bypasses | 0 |
-| False positives | 0 |
+- **5,972 samples across 6 public prompt-injection datasets** (gandalf, safeguard, deepset, spml, jackhhao, pint)
+- **F1 0.956 headline** (5 datasets excluding jackhhao role-play, 4,666 samples) — FPR 1.5%
+- **F1 0.921 full set** (all 6 datasets, 5,972 samples) — FPR 13.2%
+- **Latency p50 2.44 ms / p95 3.80 ms** end-to-end
+- Per-sample false-positive / false-negative lists published in the repo
-Attack categories tested: prompt injection, social engineering, financial manipulation, exfiltration, wallet priming, multi-language variants, encoding-based evasion, compound multi-part attacks.
+The full benchmark methodology, reproduction scripts, and per-sample failure analysis are at [agentshield.pro/benchmark](https://agentshield.pro/benchmark).
+> **Note on this package specifically:** This ElizaOS / Solana implementation shares the architecture but is a separate TypeScript codebase. Solana-specific re-evaluation against the same benchmark, plus Web3-relevant attack patterns (memory injection, transaction priming, wallet-targeting payloads), is in progress and will be published as part of a future release.
+Previous releases of this package contained an internal "190/190" evaluation claim that has been retracted. The platform benchmark above is the canonical reference.
 ## Custom Policies
@@ -116,11 +116,12 @@ Attack categories tested: prompt injection, social engineering, financial manipu
 For maximum accuracy, AgentShield can use a fine-tuned GPU classifier running as a sidecar service. Without it, the plugin falls back to pattern matching + heuristic scoring (still effective, but fewer layers).
 The classifier service requires:
 - NVIDIA GPU with CUDA support
 - Python 3.10+ with PyTorch and sentence-transformers
 - ~500MB VRAM
-See [classifier setup docs](https://github.com/eigenart-dev/agentshield/tree/main/services/classifier) for deployment instructions.
+See [classifier setup docs](https://github.com/dl-eigenart/agentshield/tree/main/services/classifier) for deployment instructions.
 ## Exports
@@ -172,6 +173,12 @@ AgentShield includes a Solana program (Anchor/Rust) that enforces transaction po
 Program ID (Devnet): `gURRDzQGXs7p4DrTt6dXPNFXHdwuK5u7WUHYobHMB1D`
+## Related
+- **AgentShield platform** (Python + MCP, general AI-agent runtime verification): [agentshield.pro](https://agentshield.pro)
+- **Public benchmark + reproduction**: [agentshield.pro/benchmark](https://agentshield.pro/benchmark)
+- **Platform repo**: [dl-eigenart/agentshield-platform](https://github.com/dl-eigenart/agentshield-platform)
 ## License
 MIT — Eigenart Filmproduktion / Daniel Leonforte
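
The README now reports F1 and false-positive rate (FPR) rather than a raw pass count. For readers comparing the two, the sketch below shows how both metrics follow from a confusion matrix using the standard definitions. The `Confusion` type, the function names, and the counts in the usage note are made up for illustration; they are not the benchmark's data or its reproduction scripts.

```typescript
// Illustrative only: standard metric definitions, hypothetical names.
interface Confusion {
  tp: number; // attacks correctly flagged
  fp: number; // benign inputs wrongly flagged
  tn: number; // benign inputs correctly passed
  fn: number; // attacks missed
}

// F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
function f1Score(c: Confusion): number {
  const denom = 2 * c.tp + c.fp + c.fn;
  return denom === 0 ? 0 : (2 * c.tp) / denom;
}

// FPR = FP / (FP + TN), the share of benign inputs that were flagged.
function falsePositiveRate(c: Confusion): number {
  const benign = c.fp + c.tn;
  return benign === 0 ? 0 : c.fp / benign;
}
```

With made-up counts of 90 true positives, 10 false positives, 90 true negatives, and 10 false negatives, `f1Score` gives 0.9 and `falsePositiveRate` gives 0.1. F1 does not depend on true negatives, so the two numbers answer different questions. Separately, by subtraction of the sample counts quoted in the README, the jackhhao dataset accounts for 5,972 − 4,666 = 1,306 samples.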

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "@eigenart/agentshield",
-  "version": "2.0.0-rc2",
-  "description": "Six-layer AI agent security for ElizaOS on Solana — prompt injection defense, transaction policies, anomaly detection, and on-chain audit trail. 190/190 independent eval.",
+  "version": "2.0.0",
+  "description": "Six-layer AI agent security for ElizaOS on Solana — prompt injection defense, transaction policies, anomaly detection, and on-chain audit trail. Part of the AgentShield platform (agentshield.pro).",
   "type": "module",
   "main": "dist/index.js",
   "module": "dist/index.js",
@@ -48,17 +48,17 @@
   "author": {
     "name": "Daniel Leonforte",
     "email": "eigenart.filmproduction@gmail.com",
-    "url": "https://github.com/eigenart-dev"
+    "url": "https://github.com/dl-eigenart"
   },
   "license": "MIT",
   "repository": {
     "type": "git",
-    "url": "https://github.com/eigenart-dev/agentshield"
+    "url": "https://github.com/dl-eigenart/agentshield"
   },
   "bugs": {
-    "url": "https://github.com/eigenart-dev/agentshield/issues"
+    "url": "https://github.com/dl-eigenart/agentshield/issues"
   },
-  "homepage": "https://github.com/eigenart-dev/agentshield#readme",
+  "homepage": "https://agentshield.pro",
   "engines": {
     "node": ">=18.0.0"
   },
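
The package.json changes above replace one GitHub owner with another across `author.url`, `repository.url`, and `bugs.url`. A small check like the following TypeScript sketch could flag cases where those fields drift apart from each other. The `PackageMeta` type and the `metadataIssues` function are hypothetical and not part of the package; the check only compares fields against one another and cannot tell whether a URL actually exists.

```typescript
// Hypothetical helper; not part of @eigenart/agentshield.
interface PackageMeta {
  author: { url: string };
  repository: { url: string };
  bugs: { url: string };
}

// Drop trailing slashes and a trailing ".git" so URLs compare cleanly.
function stripTrailing(url: string): string {
  return url.replace(/\/+$/, "").replace(/\.git$/, "");
}

// Returns a list of human-readable inconsistencies (empty when none found).
function metadataIssues(meta: PackageMeta): string[] {
  const issues: string[] = [];
  const repo = stripTrailing(meta.repository.url);
  const owner = stripTrailing(meta.author.url);

  if (repo.indexOf(owner + "/") !== 0) {
    issues.push("repository.url is not under author.url");
  }
  if (stripTrailing(meta.bugs.url) !== repo + "/issues") {
    issues.push("bugs.url is not <repository>/issues");
  }
  return issues;
}
```

Applied to the 2.0.0 values in this diff, the function returns an empty list. The 2.0.0-rc2 values were also mutually consistent, so a check of this kind would not have caught that change; it only guards against partial updates.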