muaddib-scanner 2.5.17 → 2.6.1
- package/README.md +16 -11
- package/package.json +1 -1
- package/src/index.js +36 -7
- package/src/intent-graph.js +233 -0
- package/src/response/playbooks.js +9 -0
- package/src/rules/index.js +25 -0
- package/src/scanner/ast-detectors.js +92 -8
- package/src/scanner/dataflow.js +61 -0
- package/src/scanner/deobfuscate.js +46 -6
- package/src/scanner/module-graph.js +726 -16
- package/src/scoring.js +15 -7
package/README.md
CHANGED
@@ -642,7 +642,7 @@ Alerts appear in Security > Code scanning alerts.
  ## Architecture

  ```
- MUAD'DIB 2.
+ MUAD'DIB 2.6.1 Scanner
  |
  +-- IOC Match (225,000+ packages, JSON DB)
  |   +-- OSV.dev npm dump (200K+ MAL-* entries)
@@ -664,7 +664,12 @@ MUAD'DIB 2.5.17 Scanner
  |   +-- 3-hop re-export chains, class method analysis
  |   +-- Cross-file credential read -> network sink detection
  |
- +--
+ +-- Intent Coherence Analysis (v2.6.0)
+ |   +-- Intra-file source-sink pairing (credential read + eval/network in same file)
+ |   +-- Cross-file detection delegated to module-graph (proven taint paths only)
+ |   +-- LOW severity threats excluded (respects FP reductions)
+ |
+ +-- 14 Parallel Scanners (129 rules)
  |   +-- AST Parse (acorn) — eval/Function, credential CLI theft, binary droppers, prototype hooks
  |   +-- Pattern Matching (shell, scripts)
  |   +-- Obfuscation Detection (skip .min.js, ignore hex/unicode alone)
@@ -746,8 +751,8 @@ Output (CLI, JSON, HTML, SARIF, Webhook, Threat Feed)
  |--------|--------|---------|
  | **Wild TPR** (Datadog 17K) | **88.2%** raw · **~100%** adjusted | 17,922 real malware samples. 2,077 misses are all out-of-scope (see below) |
  | **TPR** (Ground Truth) | **93.9%** (46/49) | 51 real-world attacks (49 active). 3 out-of-scope: browser-only (3) |
- | **FPR** (Benign, global) | **12.3%** (65/
- | **ADR** (Adversarial + Holdout) | **
+ | **FPR** (Benign, global) | **12.3%** (65/532) | 532 npm packages, real source code via `npm pack`, threshold > 20 |
+ | **ADR** (Adversarial + Holdout) | **97.3%** (73/75) | 53 adversarial + 40 holdout evasive samples (75 available on disk). 2 misses: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil` |

  **Datadog 17K benchmark** — [DataDog Malicious Software Packages Dataset](https://github.com/DataDog/malicious-software-packages-dataset), 17,922 real malware samples (npm). Raw TPR: 88.2% (15,810/17,922). The 2,077 misses (score=0) were manually categorized:

@@ -768,9 +773,9 @@ All 2,077 misses lack Node.js malware patterns. MUAD'DIB performs AST-based Node
  | Large (50-100 JS files) | 40 | 10 | 25.0% |
  | Very large (100+ JS files) | 62 | 25 | 40.3% |

- **FPR progression**: 0% (invalid, empty dirs, v2.2.0-v2.2.6) → 38% (first real measurement, v2.2.7) → 19.4% (v2.2.8) → 17.5% (v2.2.9) → ~13% (v2.2.11, per-file max scoring) → 8.9% (v2.3.0, P2) → 7.4% (v2.3.1, P3) → 6.0% (v2.5.8, P4 + IOC wildcard audit) → ~13.6% (v2.5.14, audit hardening added stricter detection) → **12.3%** (v2.5.16, P5 + P6)
+ **FPR progression**: 0% (invalid, empty dirs, v2.2.0-v2.2.6) → 38% (first real measurement, v2.2.7) → 19.4% (v2.2.8) → 17.5% (v2.2.9) → ~13% (v2.2.11, per-file max scoring) → 8.9% (v2.3.0, P2) → 7.4% (v2.3.1, P3) → 6.0% (v2.5.8, P4 + IOC wildcard audit) → ~13.6% (v2.5.14, audit hardening added stricter detection) → **12.3%** (v2.5.16, P5 + P6) → **12.3%** (v2.6.0, intent graph v2 — zero FP added) → **12.3%** (v2.6.1, module-graph bounded path — zero FP added)

- > **Note on FPR evolution:** The historic 6.0% FPR (v2.5.8) relied on a `BENIGN_PACKAGE_WHITELIST` that excluded certain known packages from scoring — a data leakage bias removed in v2.5.10. The current 12.3% FPR is an honest measurement without whitelisting, against
+ > **Note on FPR evolution:** The historic 6.0% FPR (v2.5.8) relied on a `BENIGN_PACKAGE_WHITELIST` that excluded certain known packages from scoring — a data leakage bias removed in v2.5.10. The current 12.3% FPR is an honest measurement without whitelisting, against 532 real benign packages. The intent graph (v2.6.0) adds zero false positives by using intra-file pairing only and excluding LOW-severity threats.

  **Holdout progression** (pre-tuning scores, rules frozen):

@@ -785,10 +790,10 @@ All 2,077 misses lack Node.js malware patterns. MUAD'DIB performs AST-based Node
  - **Wild TPR** (Datadog Benchmark): detection rate on 17,922 real malware packages from the [DataDog Malicious Software Packages Dataset](https://github.com/DataDog/malicious-software-packages-dataset). Raw 88.2% (15,810/17,922). Adjusted ~100% on JS/Node.js malware when excluding out-of-scope samples (1,233 phishing HTML pages, 824 native binaries, 20 corrected libraries). See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md#14-datadog-17k-benchmark).
  - **TPR** (True Positive Rate): detection rate on 49 real-world supply-chain attacks (event-stream, ua-parser-js, coa, flatmap-stream, eslint-scope, solana-web3js, and 43 more). 3 misses are browser-only (lottie-player, polyfill-io, trojanized-jquery) — see [Threat Model](docs/threat-model.md).
  - **FPR** (False Positive Rate): packages scoring > 20 out of 529 real npm packages (source code scanned, not empty dirs).
- - **ADR** (Adversarial Detection Rate): detection rate on
+ - **ADR** (Adversarial Detection Rate): detection rate on 120 evasive malicious samples — 53 adversarial + 40 holdout (6 adversarial waves + 4 holdout batches). 75 available on disk. 2 misses on available samples: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`.
  - **Holdout** (pre-tuning): detection rate on 10 unseen samples with rules frozen (measures generalization)

- Datasets: 17,922 Datadog malware samples,
+ Datasets: 17,922 Datadog malware samples, 532 npm + 132 PyPI benign packages, 120 adversarial/holdout samples (75 available on disk), 51 ground-truth attacks (65 documented malware packages). **1932 tests**, 86% code coverage.

  See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) for the full experimental protocol.

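The headline rates in the README hunks above are plain ratios, so they can be re-derived from the counts quoted next to them. The sketch below does that; the `rate` helper and the one-decimal rounding are assumptions for illustration, not part of the package. Under that rounding, 65/532 comes out at 12.2%, while 65/529 (the denominator still quoted in the FPR definition) gives the published 12.3%.

```javascript
// Hypothetical helper: percentage rounded to one decimal place.
function rate(hits, total) {
  return Math.round((hits / total) * 1000) / 10;
}

const metrics = {
  wildTpr: rate(15810, 17922), // Datadog benchmark, raw detections / samples
  tpr: rate(46, 49),           // ground truth, detected / active attacks
  adr: rate(73, 75),           // adversarial + holdout samples available on disk
  fpr532: rate(65, 532),       // denominator from the metrics table
  fpr529: rate(65, 529),       // denominator from the FPR definition bullet
};

console.log(metrics);
```

Running the block prints all five values side by side, which makes the two FPR denominators easy to compare.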
@@ -824,12 +829,12 @@ npm test

  ### Testing

- - **
+ - **1932 unit/integration tests** across 44 modular test files - 86% code coverage via [Codecov](https://codecov.io/gh/DNSZLSK/muad-dib)
  - **56 fuzz tests** - Malformed YAML, invalid JSON, binary files, ReDoS, unicode, 10MB inputs
  - **Datadog 17K benchmark** - 17,922 real malware samples, 88.2% raw TPR, ~100% on JS/Node.js malware (2,077 out-of-scope misses: phishing, binaries, corrected libs)
- - **
+ - **120 adversarial/holdout samples** - 53 adversarial + 40 holdout (75 available on disk), 73/75 detection rate (97.3% ADR). 2 misses: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`
  - **Ground truth validation** - 51 real-world attacks (46/49 detected = 93.9% TPR). 3 out-of-scope: browser-only (lottie-player, polyfill-io, trojanized-jquery)
- - **False positive validation** - 12.3% FPR global (65/
+ - **False positive validation** - 12.3% FPR global (65/532) on real npm source code via `npm pack`
  - **ESLint security audit** - `eslint-plugin-security` with 14 rules enabled

  ---
package/package.json
CHANGED
package/src/index.js
CHANGED
@@ -23,12 +23,13 @@ const { ensureIOCs } = require('./ioc/bootstrap.js');
  const { scanEntropy } = require('./scanner/entropy.js');
  const { scanAIConfig } = require('./scanner/ai-config.js');
  const { deobfuscate } = require('./scanner/deobfuscate.js');
- const { buildModuleGraph, annotateTaintedExports, detectCrossFileFlows, annotateSinkExports, detectCallbackCrossFileFlows } = require('./scanner/module-graph.js');
+ const { buildModuleGraph, annotateTaintedExports, detectCrossFileFlows, annotateSinkExports, detectCallbackCrossFileFlows, detectEventEmitterFlows } = require('./scanner/module-graph.js');
  const { computeReachableFiles } = require('./scanner/reachability.js');
  const { runTemporalAnalyses } = require('./temporal-runner.js');
  const { formatOutput } = require('./output-formatter.js');
  const { setExtraExcludes, getExtraExcludes, Spinner, listInstalledPackages, clearFileListCache, debugLog } = require('./utils.js');
  const { SEVERITY_WEIGHTS, RISK_THRESHOLDS, MAX_RISK_SCORE, isPackageLevelThreat, computeGroupScore, applyFPReductions, calculateRiskScore } = require('./scoring.js');
+ const { buildIntentPairs } = require('./intent-graph.js');

  const { MAX_FILE_SIZE, safeParse } = require('./shared/constants.js');
  const walk = require('acorn-walk');
@@ -356,16 +357,27 @@ async function run(targetPath, options = {}) {

    // Cross-file module graph analysis (before individual scanners)
    // Wrapped in yieldThen to unblock spinner animation
+   // Bounded: 5s timeout to prevent DoS on large/adversarial packages
+   const MODULE_GRAPH_TIMEOUT_MS = 5000;
    let crossFileFlows = [];
    if (!options.noModuleGraph) {
-
+     const moduleGraphWork = async () => {
        const graph = await yieldThen(() => buildModuleGraph(targetPath));
        const tainted = await yieldThen(() => annotateTaintedExports(graph, targetPath));
-       crossFileFlows = await yieldThen(() => detectCrossFileFlows(graph, tainted, targetPath));
-       // Callback-based cross-file flow detection
        const sinkAnnotations = await yieldThen(() => annotateSinkExports(graph, targetPath));
+       crossFileFlows = await yieldThen(() => detectCrossFileFlows(graph, tainted, sinkAnnotations, targetPath));
+       // Callback-based cross-file flow detection
        const callbackFlows = await yieldThen(() => detectCallbackCrossFileFlows(graph, tainted, sinkAnnotations, targetPath));
        crossFileFlows = crossFileFlows.concat(callbackFlows);
+       // EventEmitter cross-module flow detection
+       const emitterFlows = await yieldThen(() => detectEventEmitterFlows(graph, tainted, sinkAnnotations, targetPath));
+       crossFileFlows = crossFileFlows.concat(emitterFlows);
+     };
+     const timeout = new Promise((_, reject) =>
+       setTimeout(() => reject(new Error('Module graph timeout')), MODULE_GRAPH_TIMEOUT_MS)
+     );
+     try {
+       await Promise.race([moduleGraphWork(), timeout]);
      } catch (e) {
        // Graceful fallback — module graph is best-effort
        debugLog('[MODULE-GRAPH] Error:', e && e.message);
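The hunk above bounds the module-graph pass by racing it against a timer. A minimal sketch of the same pattern follows, assuming a hypothetical `withTimeout` helper that is not part of the package. Unlike the hunk, the sketch clears the timer once the race settles, so a pass that finishes early does not leave a pending 5-second timer behind. Note that `Promise.race` only stops waiting: it does not cancel the losing work, and a timer cannot interrupt synchronous CPU-bound code between yields.

```javascript
// Hypothetical helper: races work() against a timer and always clears the timer.
function withTimeout(work, ms, label = 'timeout') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(label)), ms);
  });
  return Promise.race([work(), timeout]).finally(() => clearTimeout(timer));
}

// Best-effort usage: fall back to an empty result on failure or timeout.
async function demo() {
  let flows = [];
  try {
    flows = await withTimeout(async () => ['flow-a', 'flow-b'], 5000, 'Module graph timeout');
  } catch (e) {
    // Keep the empty default; a real caller would log e.message here.
  }
  return flows;
}

demo().then(flows => console.log(flows));
```

Returning the result from the raced function, rather than assigning to an outer variable inside it, also avoids a late write from work that completes after the timeout has already fired.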
@@ -529,6 +541,23 @@ async function run(targetPath, options = {}) {
    // A malware package typically has 1-3 occurrences, not dozens.
    applyFPReductions(deduped, reachableFiles, packageName);

+   // Intent coherence analysis: detect source→sink pairs across files
+   const intentResult = buildIntentPairs(deduped);
+   // Add intent threats to deduped before enrichment so they get rules/playbooks
+   if (intentResult.intentThreats) {
+     for (const it of intentResult.intentThreats) {
+       // Respect reachability: downgrade intent threats in unreachable files
+       if (reachableFiles && reachableFiles.size > 0 && it.file) {
+         const normalizedFile = it.file.replace(/\\/g, '/');
+         if (!reachableFiles.has(normalizedFile)) {
+           it.severity = 'LOW';
+           it.unreachable = true;
+         }
+       }
+       deduped.push(it);
+     }
+   }
+
    // Enrich each threat with rules
    const enrichedThreats = deduped.map(t => {
      const rule = getRule(t.type);
@@ -550,12 +579,12 @@ async function run(targetPath, options = {}) {
      .map(t => ({ rule: t.rule_id, type: t.type, points: t.points, reason: t.message }))
      .sort((a, b) => b.points - a.points);

-   // Per-file max scoring (v2.2.11)
+   // Per-file max scoring (v2.2.11) with intent graph bonus
    const {
      riskScore, riskLevel, globalRiskScore,
-     maxFileScore, packageScore, mostSuspiciousFile, fileScores,
+     maxFileScore, packageScore, intentBonus, mostSuspiciousFile, fileScores,
      criticalCount, highCount, mediumCount, lowCount
-   } = calculateRiskScore(deduped);
+   } = calculateRiskScore(deduped, intentResult);

    // Python scan metadata
    const pythonInfo = pythonDeps.length > 0 ? {
package/src/intent-graph.js
@@ -0,0 +1,233 @@
+ 'use strict';
+
+ // ============================================
+ // INTENT GRAPH — Intra-File Coherence Analysis
+ // ============================================
+ // Boosts score when a SINGLE file contains both a high-confidence credential
+ // source AND a dangerous sink (eval, exec, network). This is genuinely suspicious
+ // because legitimate code rarely reads .npmrc and evals in the same file.
+ //
+ // DESIGN PRINCIPLES (informed by SpiderScan, Cerebro, taint-slicing research):
+ // 1. INTRA-FILE ONLY — cross-file pairing without proven data flow = FP explosion
+ //    (aws-sdk has process.env in config.js + https.request in http.js = not malicious)
+ // 2. Cross-file detection is handled by module-graph.js → cross_file_dataflow threats
+ // 3. Sources = ONLY high-confidence credential access (NOT env_access, NOT suspicious_dataflow)
+ // 4. Sinks = ONLY threats already identified by scanners (NO content-based scanning)
+ // 5. No double-counting — suspicious_dataflow is already a compound detection
+
+ // ============================================
+ // SOURCE CLASSIFICATION
+ // ============================================
+ const SOURCE_TYPES = {
+   sensitive_string: 'credential_read',         // .npmrc, .ssh, .env file references
+   env_harvesting_dynamic: 'credential_read',   // Object.keys(process.env), rest destructuring
+   credential_regex_harvest: 'credential_read', // regex patterns for tokens/passwords
+   llm_api_key_harvest: 'credential_read',      // OPENAI_API_KEY, ANTHROPIC_API_KEY
+   credential_cli_steal: 'credential_read',     // gh auth token, gcloud auth
+   // env_access EXCLUDED — standard config (process.env.PORT, AWS_REGION, NODE_ENV)
+   // suspicious_dataflow EXCLUDED — already compound detection
+   // cross_file_dataflow EXCLUDED — already scored CRITICAL by module-graph
+ };
+
+ // ============================================
+ // SINK CLASSIFICATION (from existing threats only)
+ // ============================================
+ const THREAT_SINK_TYPES = {
+   dangerous_call_eval: 'exec_sink',
+   dangerous_call_function: 'exec_sink',
+   staged_eval_decode: 'exec_sink',
+   vm_code_execution: 'exec_sink',
+   module_compile: 'exec_sink',
+   module_compile_dynamic: 'exec_sink',
+   credential_tampering: 'file_tamper',
+   git_hook_injection: 'file_tamper',
+   workflow_write: 'file_tamper',
+   mcp_config_injection: 'file_tamper',
+   ide_persistence: 'file_tamper',
+ };
+
+ // Message-based sink detection for threats not in THREAT_SINK_TYPES
+ const SINK_MESSAGE_PATTERNS = [
+   { pattern: /https?\.request|dns\.resolve|net\.connect/, type: 'network_external' },
+   { pattern: /webhook/i, type: 'network_external' },
+ ];
+
+ // ============================================
+ // COHERENCE MATRIX
+ // ============================================
+ // Only applied to intra-file pairs. Cross-file coherence is handled by module-graph.
+ const COHERENCE_MATRIX = {
+   credential_read: {
+     network_external: { modifier: 30, severity: 'CRITICAL' },
+     network_internal: { modifier: 10, severity: 'HIGH' },
+     exec_sink: { modifier: 25, severity: 'CRITICAL' },
+     file_local: { modifier: 5, severity: 'MEDIUM' },
+     file_tamper: { modifier: 20, severity: 'HIGH' },
+   },
+   fingerprint_read: {
+     network_external: { modifier: 0, severity: 'LOW' },
+     network_internal: { modifier: 0, severity: 'LOW' },
+     exec_sink: { modifier: 10, severity: 'MEDIUM' },
+     file_local: { modifier: 0, severity: 'LOW' },
+     file_tamper: { modifier: 5, severity: 'LOW' },
+   },
+   telemetry_read: {
+     network_external: { modifier: 0, severity: 'LOW' },
+     network_internal: { modifier: 0, severity: 'LOW' },
+     exec_sink: { modifier: 0, severity: 'LOW' },
+     file_local: { modifier: 0, severity: 'LOW' },
+     file_tamper: { modifier: 0, severity: 'LOW' },
+   },
+   config_read: {
+     network_external: { modifier: 5, severity: 'LOW' },
+     network_internal: { modifier: 0, severity: 'LOW' },
+     exec_sink: { modifier: 5, severity: 'LOW' },
+     file_local: { modifier: 0, severity: 'LOW' },
+     file_tamper: { modifier: 0, severity: 'LOW' },
+   },
+   command_output: {
+     network_external: { modifier: 20, severity: 'HIGH' },
+     network_internal: { modifier: 5, severity: 'MEDIUM' },
+     exec_sink: { modifier: 15, severity: 'HIGH' },
+     file_local: { modifier: 5, severity: 'MEDIUM' },
+     file_tamper: { modifier: 15, severity: 'HIGH' },
+   },
+ };
+
+ // Kept for backward compatibility but no longer used in pairing
+ // Cross-file detection is handled by module-graph.js (cross_file_dataflow)
+ const CROSS_FILE_MULTIPLIER = 0.5;
+
+ /**
+  * Classify a threat as a source type.
+  * Only high-confidence credential access patterns.
+  */
+ function classifySource(threat) {
+   if (SOURCE_TYPES[threat.type]) return SOURCE_TYPES[threat.type];
+
+   // Explicitly excluded types
+   if (threat.type === 'suspicious_dataflow') return null;
+   if (threat.type === 'env_access') return null;
+   if (threat.type === 'cross_file_dataflow') return null;
+
+   // Message-based: only for threats referencing sensitive file paths
+   if (threat.message) {
+     const msg = threat.message;
+     if (/\.npmrc|\.ssh\/|\.aws\/|id_rsa|\.gitconfig/i.test(msg)) {
+       return 'credential_read';
+     }
+   }
+
+   return null;
+ }
+
+ /**
+  * Classify a threat as a sink type.
+  * Only from existing threat types — no content scanning.
+  */
+ function classifySink(threat) {
+   if (THREAT_SINK_TYPES[threat.type]) return THREAT_SINK_TYPES[threat.type];
+
+   if (threat.message) {
+     for (const { pattern, type } of SINK_MESSAGE_PATTERNS) {
+       if (pattern.test(threat.message)) return type;
+     }
+   }
+
+   return null;
+ }
+
+ /**
+  * Build intent pairs from INTRA-FILE co-occurrence only.
+  * Cross-file detection is handled by module-graph.js (cross_file_dataflow).
+  *
+  * @param {Array} threats - deduplicated threat array
+  * @returns {Object} { pairs, intentScore, intentThreats }
+  */
+ function buildIntentPairs(threats) {
+   // Only consider MEDIUM+ threats. LOW severity means applyFPReductions already
+   // determined this is noise (bundler artifact, dist/ file, count threshold exceeded).
+   // Re-elevating LOW threats via intent pairing would undo FP reductions.
+   const eligible = threats.filter(t => t.severity !== 'LOW');
+
+   // Group eligible threats by file
+   const byFile = new Map();
+   for (const t of eligible) {
+     const file = t.file || '(unknown)';
+     if (!byFile.has(file)) byFile.set(file, []);
+     byFile.get(file).push(t);
+   }
+
+   const pairSet = new Set();
+   const pairs = [];
+   let intentScore = 0;
+
+   // Only pair sources and sinks within the SAME file
+   for (const [file, fileThreats] of byFile) {
+     const sources = [];
+     const sinks = [];
+
+     for (const t of fileThreats) {
+       const srcType = classifySource(t);
+       const sinkType = classifySink(t);
+       if (srcType) sources.push(srcType);
+       if (sinkType) sinks.push(sinkType);
+     }
+
+     if (sources.length === 0 || sinks.length === 0) continue;
+
+     // Deduplicate source×sink combinations within this file
+     for (const srcType of new Set(sources)) {
+       const srcMatrix = COHERENCE_MATRIX[srcType];
+       if (!srcMatrix) continue;
+
+       for (const sinkType of new Set(sinks)) {
+         const entry = srcMatrix[sinkType];
+         if (!entry || entry.modifier === 0) continue;
+
+         const pairKey = `${srcType}:${sinkType}:${file}`;
+         if (pairSet.has(pairKey)) continue;
+         pairSet.add(pairKey);
+
+         pairs.push({
+           sourceType: srcType,
+           sinkType,
+           severity: entry.severity,
+           modifier: entry.modifier,
+           crossFile: false,
+           sourceFile: file,
+           sinkFile: file
+         });
+         intentScore += entry.modifier;
+       }
+     }
+   }
+
+   // Generate intent threats only for high-confidence pairs (modifier >= 25)
+   const intentThreats = [];
+   for (const pair of pairs) {
+     if (pair.modifier >= 25) {
+       const type = pair.sourceType === 'credential_read'
+         ? 'intent_credential_exfil'
+         : pair.sourceType === 'command_output'
+           ? 'intent_command_exfil'
+           : 'intent_credential_exfil';
+       intentThreats.push({
+         type,
+         severity: pair.severity,
+         message: `Intent coherence: ${pair.sourceType} → ${pair.sinkType} (${pair.sourceFile})`,
+         file: pair.sourceFile
+       });
+     }
+   }
+
+   return { pairs, intentScore, intentThreats };
+ }
+
+ module.exports = {
+   classifySource,
+   classifySink,
+   buildIntentPairs,
+   COHERENCE_MATRIX,
+   CROSS_FILE_MULTIPLIER
+ };
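To make the intra-file pairing rule concrete, here is a reduced sketch of the same idea. It is an illustration under stated assumptions, not the module itself: `SOURCES`, `SINKS`, `MATRIX` and `pairByFile` are hypothetical names, and the tables carry only a few of the entries from the file above. It shows the three properties the design comments stress: LOW-severity threats are ignored, sources and sinks only pair inside one file, and each source/sink combination counts once per file.

```javascript
// Hypothetical, reduced tables (a subset of the ones in the file above).
const SOURCES = { sensitive_string: 'credential_read' };
const SINKS = { dangerous_call_eval: 'exec_sink', workflow_write: 'file_tamper' };
const MATRIX = {
  credential_read: {
    exec_sink: { modifier: 25, severity: 'CRITICAL' },
    file_tamper: { modifier: 20, severity: 'HIGH' },
  },
};

function pairByFile(threats) {
  // Group source and sink classes per file, skipping LOW-severity threats.
  const byFile = new Map();
  for (const t of threats.filter(x => x.severity !== 'LOW')) {
    const file = t.file || '(unknown)';
    if (!byFile.has(file)) byFile.set(file, { sources: new Set(), sinks: new Set() });
    const bucket = byFile.get(file);
    if (SOURCES[t.type]) bucket.sources.add(SOURCES[t.type]);
    if (SINKS[t.type]) bucket.sinks.add(SINKS[t.type]);
  }
  // Pair only within a file; the Sets already deduplicate combinations.
  const pairs = [];
  for (const [file, { sources, sinks }] of byFile) {
    for (const src of sources) {
      for (const sink of sinks) {
        const entry = MATRIX[src] && MATRIX[src][sink];
        if (entry && entry.modifier > 0) pairs.push({ file, src, sink, ...entry });
      }
    }
  }
  return pairs;
}

const threats = [
  { type: 'sensitive_string', severity: 'HIGH', file: 'a.js' },
  { type: 'dangerous_call_eval', severity: 'HIGH', file: 'a.js' },
  { type: 'sensitive_string', severity: 'HIGH', file: 'b.js' },
  { type: 'workflow_write', severity: 'LOW', file: 'b.js' },       // dropped: LOW
  { type: 'dangerous_call_eval', severity: 'HIGH', file: 'c.js' }, // sink without a source
];
const pairs = pairByFile(threats);
const intentScore = pairs.reduce((sum, p) => sum + p.modifier, 0);
console.log(pairs, intentScore);
```

Only `a.js` yields a pair here: `b.js` loses its sink to the LOW filter, and the source in `b.js` never pairs with the sink in `c.js` because pairing does not cross files.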
package/src/response/playbooks.js
CHANGED
@@ -486,6 +486,15 @@ const PLAYBOOKS = {
      'CRITICAL: A JavaScript Proxy with a set/get/apply trap is combined with a network call. ' +
      'Interception technique: the Proxy captures all property writes (credentials, tokens, config) ' +
      'and exfiltrates them via HTTPS/fetch/dgram. Remove the package. Audit all modules that import this package.',
+   intent_credential_exfil:
+     'CRITICAL: Intent coherence detected, credential read combined with network exfiltration. ' +
+     'Multi-file DPRK/Lazarus pattern: each file looks legitimate on its own, but the package ' +
+     'as a whole collects secrets and sends them over the network. Remove the package immediately. ' +
+     'Regenerate all exposed tokens/credentials. Audit package.json for lifecycle scripts.',
+   intent_command_exfil:
+     'Intent coherence detected, system command output combined with network exfiltration. ' +
+     'The package runs commands and transmits the results. Check which commands were executed. ' +
+     'Remove the package if unexpected. Audit network logs to identify the exfiltrated data.',
  };

  function getPlaybook(threatType) {
package/src/rules/index.js
CHANGED
@@ -1357,6 +1357,31 @@ const RULES = {
      ],
      mitre: 'T1557'
    },
+   // Intent Graph rules (v2.6.0)
+   intent_credential_exfil: {
+     id: 'MUADDIB-INTENT-001',
+     name: 'Intent Credential Exfiltration',
+     severity: 'CRITICAL',
+     confidence: 'high',
+     description: 'Intent coherence: credential read (sensitive files, env vars) combined with a network or exec sink in the same package. Typical DPRK/Lazarus pattern: malicious code fragmented across several files using only legitimate APIs.',
+     references: [
+       'https://attack.mitre.org/techniques/T1041/',
+       'https://www.cisa.gov/news-events/cybersecurity-advisories/aa22-108a'
+     ],
+     mitre: 'T1041'
+   },
+   intent_command_exfil: {
+     id: 'MUADDIB-INTENT-002',
+     name: 'Intent Command Output Exfiltration',
+     severity: 'HIGH',
+     confidence: 'medium',
+     description: 'Intent coherence: system command output combined with a network sink. The code runs commands and transmits the results over the network (reconnaissance or exfiltration).',
+     references: [
+       'https://attack.mitre.org/techniques/T1059/',
+       'https://attack.mitre.org/techniques/T1041/'
+     ],
+     mitre: 'T1059'
+   },
  };

  function getRule(type) {
package/src/scanner/ast-detectors.js
CHANGED
@@ -389,6 +389,32 @@ function handleVariableDeclarator(node, ctx) {
        if (cn === 'eval' || cn === 'Function') ctx.evalAliases.set(node.id.name, cn);
      }
    }
+   // B1 fix: const getEval = () => eval; — 0-param arrow/function returning eval/Function as value (not call)
+   if ((node.init?.type === 'ArrowFunctionExpression' || node.init?.type === 'FunctionExpression')) {
+     const body = node.init.body;
+     // () => eval
+     if (body?.type === 'Identifier' && (body.name === 'eval' || body.name === 'Function')) {
+       ctx.evalAliases.set(node.id.name, body.name + '_factory');
+     }
+     // () => { return eval; }
+     if (body?.type === 'BlockStatement' && body.body?.length === 1 &&
+         body.body[0].type === 'ReturnStatement' &&
+         body.body[0].argument?.type === 'Identifier' &&
+         (body.body[0].argument.name === 'eval' || body.body[0].argument.name === 'Function')) {
+       ctx.evalAliases.set(node.id.name, body.body[0].argument.name + '_factory');
+     }
+   }
+
+   // B1 fix: const compiler = getCompiler() where getCompiler is eval_factory
+   if (node.init?.type === 'CallExpression' &&
+       node.init.callee?.type === 'Identifier' &&
+       ctx.evalAliases?.has(node.init.callee.name)) {
+     const aliased = ctx.evalAliases.get(node.init.callee.name);
+     if (aliased.endsWith('_factory')) {
+       const baseName = aliased.replace('_factory', '');
+       ctx.evalAliases.set(node.id.name, baseName);
+     }
+   }

    // B5: Track object literal string properties
    if (node.init?.type === 'ObjectExpression') {
@@ -558,6 +584,23 @@ function handleCallExpression(node, ctx) {
          });
        }
      }
+     // B3 fix: require(/child_process/.source) — RegExpLiteral.source resolution
+     else if (arg.type === 'MemberExpression' &&
+              arg.object?.type === 'Literal' && arg.object.regex &&
+              arg.property?.type === 'Identifier' && arg.property.name === 'source') {
+       const regexSource = arg.object.regex.pattern;
+       const DANGEROUS_MODS = ['child_process', 'fs', 'net', 'dns', 'http', 'https', 'tls'];
+       const norm = regexSource.startsWith('node:') ? regexSource.slice(5) : regexSource;
+       if (DANGEROUS_MODS.includes(norm)) {
+         ctx.threats.push({ type: 'dynamic_require', severity: 'CRITICAL',
+           message: `require(/${regexSource}/.source) resolves to "${norm}" — regex .source evasion.`,
+           file: ctx.relFile });
+       } else {
+         ctx.threats.push({ type: 'dynamic_require', severity: 'HIGH',
+           message: `require() with regex .source argument (/${regexSource}/.source) — obfuscation technique.`,
+           file: ctx.relFile });
+       }
+     }
      // B5: require(obj.prop) — MemberExpression argument
      else if (arg.type === 'MemberExpression') {
        const objName = arg.object?.type === 'Identifier' ? arg.object.name : null;
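The B3 check above relies on the ESTree convention that a regular-expression literal is a `Literal` node carrying a `regex.pattern` field, so the module name can be recovered without evaluating anything. The sketch below isolates that check on a hand-built node; `classifyRegexSourceArg` is a hypothetical helper name, and returning only a severity string is a simplification of the threat objects pushed in the hunk.

```javascript
const DANGEROUS_MODS = ['child_process', 'fs', 'net', 'dns', 'http', 'https', 'tls'];

// Hypothetical helper: classify the argument node of a require() call.
// Returns 'CRITICAL', 'HIGH', or null when the argument is not /regex/.source.
function classifyRegexSourceArg(arg) {
  const isRegexSource =
    arg.type === 'MemberExpression' &&
    arg.object?.type === 'Literal' && arg.object.regex &&
    arg.property?.type === 'Identifier' && arg.property.name === 'source';
  if (!isRegexSource) return null;
  const pattern = arg.object.regex.pattern;
  // Treat "node:fs" the same as "fs".
  const norm = pattern.startsWith('node:') ? pattern.slice(5) : pattern;
  return DANGEROUS_MODS.includes(norm) ? 'CRITICAL' : 'HIGH';
}

// Hand-built ESTree-shaped node for: require(/child_process/.source)
const node = {
  type: 'MemberExpression',
  object: { type: 'Literal', regex: { pattern: 'child_process', flags: '' } },
  property: { type: 'Identifier', name: 'source' },
};

console.log(/child_process/.source);       // the string the runtime would pass to require()
console.log(classifyRegexSourceArg(node));
```

Any other regex pattern still gets flagged at the lower severity, since passing `.source` of a regex to `require()` is unusual in ordinary code.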
@@ -1013,14 +1056,37 @@ function handleCallExpression(node, ctx) {
   // B1: Alias call — E('code') where E = eval or F = Function
   if (node.callee.type === 'Identifier' && ctx.evalAliases?.has(node.callee.name)) {
     const aliased = ctx.evalAliases.get(node.callee.name);
-
-
-
-
-
-
-
-
+    if (aliased.endsWith('_factory')) {
+      // Factory pattern: getEval()('code') — the callee is detected at outer CallExpression level
+      // Mark the identifier so we can detect the outer call
+    } else {
+      ctx.hasEvalInFile = true;
+      ctx.hasDynamicExec = true;
+      ctx.threats.push({
+        type: aliased === 'eval' ? 'dangerous_call_eval' : 'dangerous_call_function',
+        severity: 'HIGH',
+        message: `Indirect ${aliased} via alias "${node.callee.name}" — eval wrapper evasion.`,
+        file: ctx.relFile
+      });
+    }
+  }
+
+  // B1 fix: Factory call — getEval()('code') where getEval = () => eval
+  if (node.callee.type === 'CallExpression' &&
+      node.callee.callee?.type === 'Identifier' &&
+      ctx.evalAliases?.has(node.callee.callee.name)) {
+    const aliased = ctx.evalAliases.get(node.callee.callee.name);
+    if (aliased.endsWith('_factory')) {
+      const baseName = aliased.replace('_factory', '');
+      ctx.hasEvalInFile = true;
+      ctx.hasDynamicExec = true;
+      ctx.threats.push({
+        type: baseName === 'eval' ? 'dangerous_call_eval' : 'dangerous_call_function',
+        severity: 'HIGH',
+        message: `Indirect ${baseName} via factory function "${node.callee.callee.name}()" — eval factory evasion.`,
+        file: ctx.relFile
+      });
+    }
   }
 
   if (callName === 'eval') {
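For readers following the hunk above, the alias and factory branches can be sketched in isolation. This is only an illustration, not the scanner's code: `classifyEvalCall` is a hypothetical helper, the nodes are hand-built ESTree-shaped objects rather than acorn output, and the `evalAliases` map is assumed to hold values such as `'eval'` and `'eval_factory'`, as the hunk implies.

```javascript
// Hypothetical helper (not part of muaddib-scanner): reproduces the
// alias vs. factory distinction on plain ESTree-shaped objects.
function classifyEvalCall(node, evalAliases) {
  const callee = node.callee;

  // Direct alias call: E('code') where E = eval
  if (callee.type === 'Identifier' && evalAliases.has(callee.name)) {
    const aliased = evalAliases.get(callee.name);
    // Calling the factory itself, getEval(), is not reported here;
    // only the outer call getEval()('code') is.
    if (aliased.endsWith('_factory')) return null;
    return { kind: 'alias', target: aliased, via: callee.name };
  }

  // Factory call: getEval()('code'), so the callee is itself a CallExpression
  if (callee.type === 'CallExpression' &&
      callee.callee?.type === 'Identifier' &&
      evalAliases.has(callee.callee.name)) {
    const aliased = evalAliases.get(callee.callee.name);
    if (aliased.endsWith('_factory')) {
      return {
        kind: 'factory',
        target: aliased.replace('_factory', ''),
        via: callee.callee.name
      };
    }
  }
  return null;
}

// Assumed alias map: E = eval, getEval = () => eval
const evalAliases = new Map([['E', 'eval'], ['getEval', 'eval_factory']]);

// Tiny node builders used only for this example
const id = (name) => ({ type: 'Identifier', name });
const call = (callee, args = []) => ({ type: 'CallExpression', callee, arguments: args });

const aliasCall = call(id('E'));               // E('code')
const innerFactoryCall = call(id('getEval'));  // getEval()
const outerFactoryCall = call(innerFactoryCall); // getEval()('code')

console.log(classifyEvalCall(aliasCall, evalAliases));
console.log(classifyEvalCall(innerFactoryCall, evalAliases));
console.log(classifyEvalCall(outerFactoryCall, evalAliases));
```

The inner `getEval()` call is deliberately left unreported, so a factory invocation yields a single finding at the outer call rather than two.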
@@ -1119,6 +1185,24 @@ function handleCallExpression(node, ctx) {
         file: ctx.relFile
       });
     }
+    // B2 fix: Function.prototype.call.call(eval, null, code) / X.call.call(eval, ...)
+    // Deep MemberExpression: obj is itself a MemberExpression ending in .call/.apply
+    if (obj?.type === 'MemberExpression' &&
+        obj.property?.type === 'Identifier' &&
+        (obj.property.name === 'call' || obj.property.name === 'apply') &&
+        node.arguments.length >= 2) {
+      const firstArg = node.arguments[0];
+      if (firstArg?.type === 'Identifier' && (firstArg.name === 'eval' || firstArg.name === 'Function')) {
+        ctx.hasEvalInFile = true;
+        ctx.hasDynamicExec = true;
+        ctx.threats.push({
+          type: firstArg.name === 'eval' ? 'dangerous_call_eval' : 'dangerous_call_function',
+          severity: 'HIGH',
+          message: `${firstArg.name} passed to .call.call() — nested call/apply evasion technique.`,
+          file: ctx.relFile
+        });
+      }
+    }
   }
 
   // Detect array access pattern: [require][0]('child_process') or [eval][0](code)
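The B2 check in the last hunk can be tried out on a hand-built AST. The sketch below is illustrative only: `nestedCallTarget` is a hypothetical name, and it assumes that `obj` in the hunk refers to `node.callee.object`, which the diff context does not show. It tests only the conditions visible in the hunk.

```javascript
// Hypothetical helper: returns 'eval' or 'Function' when that identifier is
// passed as the first argument to a call whose callee object ends in
// .call or .apply, e.g. Function.prototype.call.call(eval, null, code).
function nestedCallTarget(node) {
  if (node.type !== 'CallExpression' || node.callee.type !== 'MemberExpression') {
    return null;
  }
  const obj = node.callee.object; // assumption: matches `obj` in the hunk
  if (obj?.type === 'MemberExpression' &&
      obj.property?.type === 'Identifier' &&
      (obj.property.name === 'call' || obj.property.name === 'apply') &&
      node.arguments.length >= 2) {
    const firstArg = node.arguments[0];
    if (firstArg?.type === 'Identifier' &&
        (firstArg.name === 'eval' || firstArg.name === 'Function')) {
      return firstArg.name;
    }
  }
  return null;
}

// Tiny node builders used only for this example
const id = (name) => ({ type: 'Identifier', name });
const member = (object, propName) => ({
  type: 'MemberExpression', object, property: id(propName), computed: false
});
const call = (callee, args) => ({ type: 'CallExpression', callee, arguments: args });

// Function.prototype.call.call(eval, null, code)
const nested = call(
  member(member(member(id('Function'), 'prototype'), 'call'), 'call'),
  [id('eval'), { type: 'Literal', value: null }, id('code')]
);

// fn.call(thisArg, code): a single-level call, so the callee object is a
// plain Identifier and the nested check does not apply
const plain = call(member(id('fn'), 'call'), [id('thisArg'), id('code')]);

console.log(nestedCallTarget(nested));
console.log(nestedCallTarget(plain));
```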
package/src/scanner/dataflow.js
CHANGED
@@ -109,6 +109,31 @@ function buildTaintMap(ast) {
           }
         }
       }
+
+      // B5 fix: const tools = { read: fs.readFileSync, home: os.homedir }
+      // Track object properties that reference tainted module methods as tainted aliases
+      if (node.id.type === 'Identifier' && init.type === 'ObjectExpression') {
+        for (const prop of init.properties) {
+          if (prop.type !== 'Property') continue;
+          const key = prop.key?.type === 'Identifier' ? prop.key.name :
+            (prop.key?.type === 'Literal' ? String(prop.key.value) : null);
+          if (!key) continue;
+          // Property value is a MemberExpression on a tainted module: fs.readFileSync, os.homedir
+          if (prop.value?.type === 'MemberExpression' &&
+              prop.value.object?.type === 'Identifier' &&
+              prop.value.property?.type === 'Identifier') {
+            const parentTaint = taintMap.get(prop.value.object.name);
+            if (parentTaint && TRACKED_MODULES.has(parentTaint.source)) {
+              const methodName = prop.value.property.name;
+              // Store as "objName.key" so tools.read calls are resolved
+              taintMap.set(`${node.id.name}.${key}`, {
+                source: parentTaint.source,
+                detail: `${parentTaint.source}.${methodName}`
+              });
+            }
+          }
+        }
+      }
     }
   });
 
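A standalone sketch of the object-property aliasing added to `buildTaintMap` may help. Everything outside the hunk is an assumption here: `recordObjectAliases` is a hypothetical wrapper, `TRACKED_MODULES` is given placeholder contents, and the initial `taintMap` entries only imitate what earlier passes might have recorded for `require('fs')` and `require('os')`.

```javascript
// Placeholder contents; the scanner's real TRACKED_MODULES is not shown in this diff.
const TRACKED_MODULES = new Set(['fs', 'os']);

// Hypothetical wrapper around the hunk's logic: for
//   const tools = { read: fs.readFileSync }
// it records the key "tools.read" with detail "fs.readFileSync".
function recordObjectAliases(declarator, taintMap) {
  const { id, init } = declarator;
  if (id.type !== 'Identifier' || init?.type !== 'ObjectExpression') return;
  for (const prop of init.properties) {
    if (prop.type !== 'Property') continue;
    const key = prop.key?.type === 'Identifier' ? prop.key.name
      : (prop.key?.type === 'Literal' ? String(prop.key.value) : null);
    if (!key) continue;
    const value = prop.value;
    if (value?.type === 'MemberExpression' &&
        value.object?.type === 'Identifier' &&
        value.property?.type === 'Identifier') {
      const parentTaint = taintMap.get(value.object.name);
      if (parentTaint && TRACKED_MODULES.has(parentTaint.source)) {
        taintMap.set(`${id.name}.${key}`, {
          source: parentTaint.source,
          detail: `${parentTaint.source}.${value.property.name}`
        });
      }
    }
  }
}

// Assumed prior state, as if fs and os had already been required
const taintMap = new Map([
  ['fs', { source: 'fs', detail: 'fs' }],
  ['os', { source: 'os', detail: 'os' }]
]);

// Tiny node builders used only for this example
const ident = (name) => ({ type: 'Identifier', name });
const member = (objName, propName) => ({
  type: 'MemberExpression', object: ident(objName), property: ident(propName)
});

// const tools = { read: fs.readFileSync, home: os.homedir, retries: 3 }
const declarator = {
  type: 'VariableDeclarator',
  id: ident('tools'),
  init: {
    type: 'ObjectExpression',
    properties: [
      { type: 'Property', key: ident('read'), value: member('fs', 'readFileSync') },
      { type: 'Property', key: ident('home'), value: member('os', 'homedir') },
      { type: 'Property', key: ident('retries'), value: { type: 'Literal', value: 3 } }
    ]
  }
};

recordObjectAliases(declarator, taintMap);
console.log(taintMap.get('tools.read'));
console.log(taintMap.has('tools.retries'));
```

Keying the entry as `"objName.key"` keeps the alias in the same flat map as module-level taints, so a later lookup needs only a string built from the call's object and property names.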
@@ -466,6 +491,42 @@ function analyzeFile(content, filePath, basePath) {
           });
         }
       }
+
+      // B5 fix: object method alias — tools.read(...) where tools.read = fs.readFileSync
+      const aliasKey = `${obj.name}.${prop.name}`;
+      const aliasTaint = taintMap.get(aliasKey);
+      if (aliasTaint && aliasTaint.detail.includes('.')) {
+        const [aliasModule, aliasMethod] = aliasTaint.detail.split('.');
+        const aliasSourceMethods = MODULE_SOURCE_METHODS[aliasModule];
+        if (aliasSourceMethods && aliasSourceMethods[aliasMethod]) {
+          // For credential_read: also check if the argument is a sensitive path
+          const isCredArg = node.arguments[0] && isCredentialPath(node.arguments[0], sensitivePathVars);
+          if (aliasSourceMethods[aliasMethod] === 'credential_read' && isCredArg) {
+            sources.push({
+              type: 'credential_read',
+              name: `${aliasModule}.${aliasMethod}`,
+              line: node.loc?.start?.line,
+              taint_tracked: true
+            });
+          } else if (aliasSourceMethods[aliasMethod] !== 'credential_read') {
+            sources.push({
+              type: aliasSourceMethods[aliasMethod],
+              name: `${aliasModule}.${aliasMethod}`,
+              line: node.loc?.start?.line,
+              taint_tracked: true
+            });
+          }
+        }
+        const aliasSinkMethods = MODULE_SINK_METHODS[aliasModule];
+        if (aliasSinkMethods && aliasSinkMethods[aliasMethod]) {
+          sinks.push({
+            type: aliasSinkMethods[aliasMethod],
+            name: `${aliasModule}.${aliasMethod}`,
+            line: node.loc?.start?.line,
+            taint_tracked: true
+          });
+        }
+      }
     }
   }
 
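The lookup side of the B5 fix, where a call such as `tools.read(...)` is mapped back to `fs.readFileSync`, can be sketched as follows. This is a simplified illustration under stated assumptions: `resolveAliasSource` and `looksLikeCredentialPath` are hypothetical, the `MODULE_SOURCE_METHODS` table has placeholder contents (only the `'credential_read'` type appears in the diff; `'system_info'` is invented for the example), and the scanner's real `isCredentialPath` also takes `sensitivePathVars`, which is omitted here.

```javascript
// Placeholder table; the scanner's real MODULE_SOURCE_METHODS is not shown in this diff.
const MODULE_SOURCE_METHODS = {
  fs: { readFileSync: 'credential_read' },
  os: { homedir: 'system_info' } // hypothetical type name
};

// Simplified stand-in for isCredentialPath: string literals only.
function looksLikeCredentialPath(arg) {
  return arg?.type === 'Literal' &&
    typeof arg.value === 'string' &&
    /\.npmrc|\.ssh/.test(arg.value);
}

// Hypothetical helper: resolves obj.prop(...) through the "obj.prop" alias key.
// credential_read is reported only when the first argument looks sensitive;
// other source types are reported unconditionally, as in the hunk.
function resolveAliasSource(node, taintMap) {
  const callee = node.callee;
  if (callee.type !== 'MemberExpression' ||
      callee.object?.type !== 'Identifier' ||
      callee.property?.type !== 'Identifier') return null;
  const aliasTaint = taintMap.get(`${callee.object.name}.${callee.property.name}`);
  if (!aliasTaint || !aliasTaint.detail.includes('.')) return null;
  const [aliasModule, aliasMethod] = aliasTaint.detail.split('.');
  const kind = MODULE_SOURCE_METHODS[aliasModule]?.[aliasMethod];
  if (!kind) return null;
  if (kind === 'credential_read' && !looksLikeCredentialPath(node.arguments[0])) {
    return null;
  }
  return { type: kind, name: `${aliasModule}.${aliasMethod}`, taint_tracked: true };
}

// Alias entries as the taint-map pass would have stored them
const taintMap = new Map([
  ['tools.read', { source: 'fs', detail: 'fs.readFileSync' }],
  ['tools.home', { source: 'os', detail: 'os.homedir' }]
]);

// Tiny node builders used only for this example
const ident = (name) => ({ type: 'Identifier', name });
const aliasCall = (objName, propName, args) => ({
  type: 'CallExpression',
  callee: { type: 'MemberExpression', object: ident(objName), property: ident(propName) },
  arguments: args
});
const str = (value) => ({ type: 'Literal', value });

const readSecret = aliasCall('tools', 'read', [str('/home/user/.npmrc')]);
const readReadme = aliasCall('tools', 'read', [str('./README.md')]);
const getHome = aliasCall('tools', 'home', []);

console.log(resolveAliasSource(readSecret, taintMap));
console.log(resolveAliasSource(readReadme, taintMap));
console.log(resolveAliasSource(getHome, taintMap));
```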