npm - muaddib-scanner - Versions diffs - 2.4.2 → 2.4.4 - Mend

muaddib-scanner 2.4.2 → 2.4.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2) hide show

package/README.md +15 -1
package/package.json +2 -4

package/README.md CHANGED Viewed

@@ -683,6 +683,7 @@ MUAD'DIB 2.3.1 Scanner
 |   +-- Canary Tokens / Honey Tokens (sandbox)
 |
 +-- Validation & Observability (v2.1)
+|   +-- Datadog 17K Benchmark (88.2% raw, ~100% JS/Node.js adjusted)
 |   +-- Ground Truth Dataset (51 real-world attacks, 91.8% TPR)
 |   +-- Detection Time Logging (first_seen tracking, lead time metrics)
 |   +-- FP Rate Tracking (daily stats, false positive rate)
@@ -722,10 +723,21 @@ Output (CLI, JSON, HTML, SARIF, Webhook, Threat Feed)
 | Metric | Result | Details |
 |--------|--------|---------|
+| **Wild TPR** (Datadog 17K) | **88.2%** raw · **~100%** adjusted | 17,922 real malware samples. 2,077 misses are all out-of-scope (see below) |
 | **TPR** (Ground Truth) | **91.8%** (45/49) | 51 real-world attacks (49 active). 4 out-of-scope: browser-only (3) + FP-risky (1) |
 | **FPR** (Benign, global) | **7.4%** (39/525) | 529 npm packages (525 scanned), real source code via `npm pack`, threshold > 20 |
 | **ADR** (Adversarial + Holdout) | **98.7%** (77/78) | 38 adversarial + 40 holdout evasive samples. 1 documented miss: `require-cache-poison` (accepted trade-off) |
+**Datadog 17K benchmark** — [DataDog Malicious Software Packages Dataset](https://github.com/DataDog/malicious-software-packages-dataset), 17,922 real malware samples (npm). Raw TPR: 88.2% (15,810/17,922). The 2,077 misses (score=0) were manually categorized:
+| Category | Count | Reason |
+|----------|-------|--------|
+| Phishing pages (HTML/CSS/JS frontend) | 1,233 | No Node.js APIs (no `require`, `child_process`, `fs`, `process.env`). Fake login pages, redirects, captchas. |
+| Native binaries (no JS files) | 824 | Platform-specific binaries (darwin-arm64, linux-x64, etc.). 201 from @42ailab alone. |
+| Corrected libraries | 20 | Temporarily compromised then fixed. Malicious code removed before scan. |
+All 2,077 misses lack Node.js malware patterns. MUAD'DIB performs AST-based Node.js static analysis — phishing HTML and native binaries are out of scope. Adjusted TPR on JS/Node.js malware: **~100%** (15,810/~15,845). See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md#14-datadog-17k-benchmark).
 **FPR by package size** — FPR correlates linearly with package size. Per-file max scoring (v2.2.11) significantly reduces FP on medium/large packages:
 | Category | Packages | FP | FPR |
@@ -747,12 +759,13 @@ Output (CLI, JSON, HTML, SARIF, Webhook, Threat Feed)
 | v4 | **80%** (8/10) | Deobfuscation effectiveness |
 | v5 | 50% (5/10) | Inter-module dataflow (new scanner) |
+- **Wild TPR** (Datadog Benchmark): detection rate on 17,922 real malware packages from the [DataDog Malicious Software Packages Dataset](https://github.com/DataDog/malicious-software-packages-dataset). Raw 88.2% (15,810/17,922). Adjusted ~100% on JS/Node.js malware when excluding out-of-scope samples (1,233 phishing HTML pages, 824 native binaries, 20 corrected libraries). See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md#14-datadog-17k-benchmark).
 - **TPR** (True Positive Rate): detection rate on 49 real-world supply-chain attacks (event-stream, ua-parser-js, coa, flatmap-stream, eslint-scope, solana-web3js, and 43 more). 4 misses are browser-only (lottie-player, polyfill-io, trojanized-jquery) or risky to fix (websocket-rat) — see [Threat Model](docs/threat-model.md).
 - **FPR** (False Positive Rate): packages scoring > 20 out of 529 real npm packages (source code scanned, not empty dirs). The 6.2% on standard packages (<10 JS files, 290 packages) is the most representative metric for typical use — most npm packages are small.
 - **ADR** (Adversarial Detection Rate): detection rate on 78 evasive malicious samples — 38 adversarial + 40 holdout (5 batches of 10, testing obfuscation, inter-module dataflow, etc.). 1 documented miss: `require-cache-poison` (score 10 < threshold 20, accepted trade-off from FP reduction P3).
 - **Holdout** (pre-tuning): detection rate on 10 unseen samples with rules frozen (measures generalization)
-Datasets: 529 npm + 132 PyPI benign packages, 78 adversarial/holdout samples, 51 ground-truth attacks (65 documented malware packages). **1387 tests**, 86% code coverage.
+Datasets: 17,922 Datadog malware samples, 529 npm + 132 PyPI benign packages, 78 adversarial/holdout samples, 51 ground-truth attacks (65 documented malware packages). **1387 tests**, 86% code coverage.
 See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) for the full experimental protocol.
@@ -790,6 +803,7 @@ npm test
 - **1387 unit/integration tests** across 20 modular test files - 86% code coverage via [Codecov](https://codecov.io/gh/DNSZLSK/muad-dib)
 - **56 fuzz tests** - Malformed YAML, invalid JSON, binary files, ReDoS, unicode, 10MB inputs
+- **Datadog 17K benchmark** - 17,922 real malware samples, 88.2% raw TPR, ~100% on JS/Node.js malware (2,077 out-of-scope misses: phishing, binaries, corrected libs)
 - **78 adversarial/holdout samples** - 38 adversarial + 40 holdout, 77/78 detection rate (98.7% ADR). 1 documented miss: `require-cache-poison` (accepted trade-off)
 - **Ground truth validation** - 51 real-world attacks (45/49 detected = 91.8% TPR). 4 out-of-scope: browser-only (3) + FP-risky (1)
 - **False positive validation** - 7.4% FPR global (39/525) on real npm source code via `npm pack`

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "muaddib-scanner",
-  "version": "2.4.2",
+  "version": "2.4.4",
   "description": "Supply-chain threat detection & response for npm & PyPI/Python",
   "main": "src/index.js",
   "bin": {
@@ -48,10 +48,8 @@
     "acorn": "8.16.0",
     "acorn-walk": "8.3.5",
     "adm-zip": "0.5.16",
-    "chalk": "5.6.2",
     "js-yaml": "4.1.1",
-    "loadash": "^1.0.0",
-    "yargs": "18.0.0"
+    "loadash": "^1.0.0"
   },
   "devDependencies": {
     "@eslint/js": "10.0.1",