npm - llm-trust-guard - Versions diffs - 4.17.1 → 4.18.0 - Mend

llm-trust-guard 4.17.1 → 4.18.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2) hide show

package/CHANGELOG.md +20 -0
package/package.json +1 -1

package/CHANGELOG.md CHANGED Viewed

@@ -5,6 +5,26 @@ All notable changes to `llm-trust-guard` will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [4.18.0] - 2026-04-10
+### Removed — TF-IDF Built-in Classifier
+Removed the experimental TF-IDF classifier after rigorous testing showed it is **not viable** for prompt injection detection:
+- Trained on 3 datasets (CCS'24 2023, JailbreakDB Oct 2025, hlyn Apr 2026)
+- All showed bimodal behavior or inadequate recall on modern attacks
+- Root cause: bag-of-words (TF-IDF) cannot distinguish intent from vocabulary — attack prompts and creative prompts use identical language
+- Research confirms: TF-IDF F1 ceiling for prompt injection is fundamentally limited (Trend Micro 2024)
+**For users who need ML-level prompt injection detection:** Use the `DetectionClassifier` interface to plug in a real model like Meta Prompt Guard 2 (22M params, 88.7% recall at 1% FPR) or protectai/DeBERTa-v3.
+### Added
+- `CLAUDE.md` with project rules for data freshness validation and honest benchmarking
+### Stats
+- 34 guards, 695 tests, <5ms latency, zero dependencies
+- Package size reduced ~300KB (model JSON removed)
 ## [4.17.1] - 2026-04-05
 ### Fixed — Pattern Weight and Regex Corrections

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "llm-trust-guard",
-  "version": "4.17.1",
+  "version": "4.18.0",
   "description": "Comprehensive security guards for LLM-powered and agentic AI applications - 22 protection layers covering OWASP Top 10 for LLMs 2025, Agentic Applications 2026, and MCP Security. All guards now accessible via unified TrustGuard facade. Features prompt injection (PAP/persuasion), multi-modal attacks, RAG poisoning with embedding attack detection, memory persistence attacks, code execution sandboxing, multi-agent security, MCP tool shadowing prevention, system prompt leakage protection, human-agent trust exploitation (ASI09), autonomy escalation (ASI10), state persistence (ASI08), tool chain validation v2 (ASI07/ASI04), circuit breaker, drift detection, and more",
   "main": "dist/index.js",
   "module": "dist/index.mjs",