muaddib-scanner 2.5.17 → 2.6.0
This diff shows the published contents of the two package versions as they appear in their public registries, and is provided for informational purposes only.
- package/README.md +15 -10
- package/evaluate-output.json +20766 -0
- package/evaluate-text-output.txt +2799 -0
- package/evaluate-v2-output.json +19390 -0
- package/package.json +1 -1
- package/src/index.js +21 -3
- package/src/intent-graph.js +233 -0
- package/src/response/playbooks.js +9 -0
- package/src/rules/index.js +25 -0
- package/src/scanner/ast-detectors.js +92 -8
- package/src/scanner/dataflow.js +61 -0
- package/src/scanner/deobfuscate.js +46 -6
- package/src/scoring.js +15 -7
package/README.md (CHANGED)
@@ -664,7 +664,12 @@ MUAD'DIB 2.5.17 Scanner
 | +-- 3-hop re-export chains, class method analysis
 | +-- Cross-file credential read -> network sink detection
 
-+--
++-- Intent Coherence Analysis (v2.6.0)
+| +-- Intra-file source-sink pairing (credential read + eval/network in same file)
+| +-- Cross-file detection delegated to module-graph (proven taint paths only)
+| +-- LOW severity threats excluded (respects FP reductions)
+
++-- 14 Parallel Scanners (129 rules)
 | +-- AST Parse (acorn) — eval/Function, credential CLI theft, binary droppers, prototype hooks
 | +-- Pattern Matching (shell, scripts)
 | +-- Obfuscation Detection (skip .min.js, ignore hex/unicode alone)
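The "Intra-file source-sink pairing" rule added in v2.6.0 (hunk above) fires only when a credential read and an eval/network sink occur in the same file. Below is a minimal, dependency-free sketch of that pairing logic; the real scanner matches on acorn ASTs rather than regexes, and the pattern lists and function name here are illustrative, not the package's API.

```javascript
// Illustrative sketch only: the actual scanner works on an acorn AST,
// not regexes. The source/sink pattern lists here are invented.
const SOURCES = [/process\.env/, /\.npmrc/, /\.aws\/credentials/];
const SINKS = [/\beval\s*\(/, /\bfetch\s*\(/, /\bhttps?\.request\s*\(/];

// A file is flagged only when a credential source AND an exec/network
// sink both appear in the SAME file (intra-file pairing).
function intraFilePair(code) {
  const hasSource = SOURCES.some((re) => re.test(code));
  const hasSink = SINKS.some((re) => re.test(code));
  return hasSource && hasSink;
}

console.log(intraFilePair('const t = process.env.TOKEN;'));       // false: source only
console.log(intraFilePair('fetch("https://example.com");'));      // false: sink only
console.log(intraFilePair('fetch("x?t=" + process.env.TOKEN);')); // true: paired
```

Keeping the pairing intra-file, and delegating cross-file cases to the module graph's proven taint paths, is what the README credits for the rule adding zero false positives.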
@@ -746,8 +751,8 @@ Output (CLI, JSON, HTML, SARIF, Webhook, Threat Feed)
 |--------|--------|---------|
 | **Wild TPR** (Datadog 17K) | **88.2%** raw · **~100%** adjusted | 17,922 real malware samples. 2,077 misses are all out-of-scope (see below) |
 | **TPR** (Ground Truth) | **93.9%** (46/49) | 51 real-world attacks (49 active). 3 out-of-scope: browser-only (3) |
-| **FPR** (Benign, global) | **12.3%** (65/
-| **ADR** (Adversarial + Holdout) | **
+| **FPR** (Benign, global) | **12.3%** (65/532) | 532 npm packages, real source code via `npm pack`, threshold > 20 |
+| **ADR** (Adversarial + Holdout) | **97.3%** (73/75) | 53 adversarial + 40 holdout evasive samples (75 available on disk). 2 misses: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil` |
 
 **Datadog 17K benchmark** — [DataDog Malicious Software Packages Dataset](https://github.com/DataDog/malicious-software-packages-dataset), 17,922 real malware samples (npm). Raw TPR: 88.2% (15,810/17,922). The 2,077 misses (score=0) were manually categorized:
 
@@ -768,9 +773,9 @@ All 2,077 misses lack Node.js malware patterns. MUAD'DIB performs AST-based Node
 | Large (50-100 JS files) | 40 | 10 | 25.0% |
 | Very large (100+ JS files) | 62 | 25 | 40.3% |
 
-**FPR progression**: 0% (invalid, empty dirs, v2.2.0-v2.2.6) → 38% (first real measurement, v2.2.7) → 19.4% (v2.2.8) → 17.5% (v2.2.9) → ~13% (v2.2.11, per-file max scoring) → 8.9% (v2.3.0, P2) → 7.4% (v2.3.1, P3) → 6.0% (v2.5.8, P4 + IOC wildcard audit) → ~13.6% (v2.5.14, audit hardening added stricter detection) → **12.3%** (v2.5.16, P5 + P6)
+**FPR progression**: 0% (invalid, empty dirs, v2.2.0-v2.2.6) → 38% (first real measurement, v2.2.7) → 19.4% (v2.2.8) → 17.5% (v2.2.9) → ~13% (v2.2.11, per-file max scoring) → 8.9% (v2.3.0, P2) → 7.4% (v2.3.1, P3) → 6.0% (v2.5.8, P4 + IOC wildcard audit) → ~13.6% (v2.5.14, audit hardening added stricter detection) → **12.3%** (v2.5.16, P5 + P6) → **12.3%** (v2.6.0, intent graph v2 — zero FP added)
 
-> **Note on FPR evolution:** The historic 6.0% FPR (v2.5.8) relied on a `BENIGN_PACKAGE_WHITELIST` that excluded certain known packages from scoring — a data leakage bias removed in v2.5.10. The current 12.3% FPR is an honest measurement without whitelisting, against
+> **Note on FPR evolution:** The historic 6.0% FPR (v2.5.8) relied on a `BENIGN_PACKAGE_WHITELIST` that excluded certain known packages from scoring — a data leakage bias removed in v2.5.10. The current 12.3% FPR is an honest measurement without whitelisting, against 532 real benign packages. The intent graph (v2.6.0) adds zero false positives by using intra-file pairing only and excluding LOW-severity threats.
 
 **Holdout progression** (pre-tuning scores, rules frozen):
 
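The FPR progression above credits "per-file max scoring" (v2.2.11) with a sizable drop. My reading, stated as an assumption, is that the package score became the maximum per-file score rather than a sum, so weak signals spread across many files stop accumulating past the > 20 alert threshold. A sketch of that aggregation choice, with invented numbers:

```javascript
// Assumed reading of "per-file max scoring": package score = MAX of
// per-file scores, not their sum. The threshold (> 20) is from the
// README; file scores below are invented for illustration.
const THRESHOLD = 20;

const sumScore = (fileScores) => fileScores.reduce((a, b) => a + b, 0);
const maxScore = (fileScores) => Math.max(0, ...fileScores);

// Hypothetical benign package: three files, each with weak signals.
const fileScores = [8, 7, 9];

console.log(sumScore(fileScores) > THRESHOLD); // true: sum (24) would flag it
console.log(maxScore(fileScores) > THRESHOLD); // false: max (9) stays benign
```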
@@ -785,10 +790,10 @@ All 2,077 misses lack Node.js malware patterns. MUAD'DIB performs AST-based Node
 - **Wild TPR** (Datadog Benchmark): detection rate on 17,922 real malware packages from the [DataDog Malicious Software Packages Dataset](https://github.com/DataDog/malicious-software-packages-dataset). Raw 88.2% (15,810/17,922). Adjusted ~100% on JS/Node.js malware when excluding out-of-scope samples (1,233 phishing HTML pages, 824 native binaries, 20 corrected libraries). See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md#14-datadog-17k-benchmark).
 - **TPR** (True Positive Rate): detection rate on 49 real-world supply-chain attacks (event-stream, ua-parser-js, coa, flatmap-stream, eslint-scope, solana-web3js, and 43 more). 3 misses are browser-only (lottie-player, polyfill-io, trojanized-jquery) — see [Threat Model](docs/threat-model.md).
 - **FPR** (False Positive Rate): packages scoring > 20 out of 529 real npm packages (source code scanned, not empty dirs).
-- **ADR** (Adversarial Detection Rate): detection rate on
+- **ADR** (Adversarial Detection Rate): detection rate on 120 evasive malicious samples — 53 adversarial + 40 holdout (6 adversarial waves + 4 holdout batches). 75 available on disk. 2 misses on available samples: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`.
 - **Holdout** (pre-tuning): detection rate on 10 unseen samples with rules frozen (measures generalization)
 
-Datasets: 17,922 Datadog malware samples,
+Datasets: 17,922 Datadog malware samples, 532 npm + 132 PyPI benign packages, 120 adversarial/holdout samples (75 available on disk), 51 ground-truth attacks (65 documented malware packages). **1905 tests**, 86% code coverage.
 
 See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) for the full experimental protocol.
 
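The headline rates defined in the hunk above are simple ratios. As a sanity check, the counts quoted in this README reproduce them; the one-line `rate` helper is mine, and note that the FPR definition above uses 529 benign packages as the denominator:

```javascript
// Reproducing the headline rates from counts quoted in the README.
const rate = (hits, total) => ((hits / total) * 100).toFixed(1) + '%';

console.log(rate(15810, 17922)); // raw Wild TPR -> "88.2%"
console.log(rate(46, 49));       // ground-truth TPR -> "93.9%"
console.log(rate(65, 529));      // FPR, 529 benign packages -> "12.3%"
console.log(rate(73, 75));       // ADR on available samples -> "97.3%"
```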
@@ -824,12 +829,12 @@ npm test
 
 ### Testing
 
-- **
+- **1905 unit/integration tests** across 44 modular test files - 86% code coverage via [Codecov](https://codecov.io/gh/DNSZLSK/muad-dib)
 - **56 fuzz tests** - Malformed YAML, invalid JSON, binary files, ReDoS, unicode, 10MB inputs
 - **Datadog 17K benchmark** - 17,922 real malware samples, 88.2% raw TPR, ~100% on JS/Node.js malware (2,077 out-of-scope misses: phishing, binaries, corrected libs)
-- **
+- **120 adversarial/holdout samples** - 53 adversarial + 40 holdout (75 available on disk), 73/75 detection rate (97.3% ADR). 2 misses: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`
 - **Ground truth validation** - 51 real-world attacks (46/49 detected = 93.9% TPR). 3 out-of-scope: browser-only (lottie-player, polyfill-io, trojanized-jquery)
-- **False positive validation** - 12.3% FPR global (65/
+- **False positive validation** - 12.3% FPR global (65/532) on real npm source code via `npm pack`
 - **ESLint security audit** - `eslint-plugin-security` with 14 rules enabled
 
 ---