muaddib-scanner 2.5.17 → 2.6.1

package/README.md CHANGED
@@ -642,7 +642,7 @@ Alerts appear in Security > Code scanning alerts.
  ## Architecture
 
  ```
- MUAD'DIB 2.5.17 Scanner
+ MUAD'DIB 2.6.1 Scanner
  |
  +-- IOC Match (225,000+ packages, JSON DB)
  | +-- OSV.dev npm dump (200K+ MAL-* entries)
@@ -664,7 +664,12 @@ MUAD'DIB 2.5.17 Scanner
  | +-- 3-hop re-export chains, class method analysis
  | +-- Cross-file credential read -> network sink detection
  |
- +-- 14 Parallel Scanners (121 rules)
+ +-- Intent Coherence Analysis (v2.6.0)
+ | +-- Intra-file source-sink pairing (credential read + eval/network in same file)
+ | +-- Cross-file detection delegated to module-graph (proven taint paths only)
+ | +-- LOW severity threats excluded (respects FP reductions)
+ |
+ +-- 14 Parallel Scanners (129 rules)
  | +-- AST Parse (acorn) — eval/Function, credential CLI theft, binary droppers, prototype hooks
  | +-- Pattern Matching (shell, scripts)
  | +-- Obfuscation Detection (skip .min.js, ignore hex/unicode alone)
@@ -746,8 +751,8 @@ Output (CLI, JSON, HTML, SARIF, Webhook, Threat Feed)
  |--------|--------|---------|
  | **Wild TPR** (Datadog 17K) | **88.2%** raw · **~100%** adjusted | 17,922 real malware samples. 2,077 misses are all out-of-scope (see below) |
  | **TPR** (Ground Truth) | **93.9%** (46/49) | 51 real-world attacks (49 active). 3 out-of-scope: browser-only (3) |
- | **FPR** (Benign, global) | **12.3%** (65/529) | 529 npm packages, real source code via `npm pack`, threshold > 20 |
- | **ADR** (Adversarial + Holdout) | **94.0%** (63/67) | 62 adversarial + 40 holdout evasive samples. 4 misses: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`, `setTimeout-eval-chain`, `setter-trap-exfil` |
+ | **FPR** (Benign, global) | **12.3%** (65/532) | 532 npm packages, real source code via `npm pack`, threshold > 20 |
+ | **ADR** (Adversarial + Holdout) | **97.3%** (73/75) | 53 adversarial + 40 holdout evasive samples (75 available on disk). 2 misses: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil` |
 
  **Datadog 17K benchmark** — [DataDog Malicious Software Packages Dataset](https://github.com/DataDog/malicious-software-packages-dataset), 17,922 real malware samples (npm). Raw TPR: 88.2% (15,810/17,922). The 2,077 misses (score=0) were manually categorized:
 
@@ -768,9 +773,9 @@ All 2,077 misses lack Node.js malware patterns. MUAD'DIB performs AST-based Node
  | Large (50-100 JS files) | 40 | 10 | 25.0% |
  | Very large (100+ JS files) | 62 | 25 | 40.3% |
 
- **FPR progression**: 0% (invalid, empty dirs, v2.2.0-v2.2.6) → 38% (first real measurement, v2.2.7) → 19.4% (v2.2.8) → 17.5% (v2.2.9) → ~13% (v2.2.11, per-file max scoring) → 8.9% (v2.3.0, P2) → 7.4% (v2.3.1, P3) → 6.0% (v2.5.8, P4 + IOC wildcard audit) → ~13.6% (v2.5.14, audit hardening added stricter detection) → **12.3%** (v2.5.16, P5 + P6)
+ **FPR progression**: 0% (invalid, empty dirs, v2.2.0-v2.2.6) → 38% (first real measurement, v2.2.7) → 19.4% (v2.2.8) → 17.5% (v2.2.9) → ~13% (v2.2.11, per-file max scoring) → 8.9% (v2.3.0, P2) → 7.4% (v2.3.1, P3) → 6.0% (v2.5.8, P4 + IOC wildcard audit) → ~13.6% (v2.5.14, audit hardening added stricter detection) → **12.3%** (v2.5.16, P5 + P6) → **12.3%** (v2.6.0, intent graph v2 — zero FP added) → **12.3%** (v2.6.1, module-graph bounded path — zero FP added)
 
- > **Note on FPR evolution:** The historic 6.0% FPR (v2.5.8) relied on a `BENIGN_PACKAGE_WHITELIST` that excluded certain known packages from scoring — a data leakage bias removed in v2.5.10. The current 12.3% FPR is an honest measurement without whitelisting, against 529 real benign packages. The P5/P6 reductions (setTimeout precision, dist/ two-notch downgrade, credential_regex count-based, env segment matching, etc.) are detector precision improvements, not whitelisting.
+ > **Note on FPR evolution:** The historic 6.0% FPR (v2.5.8) relied on a `BENIGN_PACKAGE_WHITELIST` that excluded certain known packages from scoring — a data leakage bias removed in v2.5.10. The current 12.3% FPR is an honest measurement without whitelisting, against 532 real benign packages. The intent graph (v2.6.0) adds zero false positives by using intra-file pairing only and excluding LOW-severity threats.
 
  **Holdout progression** (pre-tuning scores, rules frozen):
 
@@ -785,10 +790,10 @@ All 2,077 misses lack Node.js malware patterns. MUAD'DIB performs AST-based Node
  - **Wild TPR** (Datadog Benchmark): detection rate on 17,922 real malware packages from the [DataDog Malicious Software Packages Dataset](https://github.com/DataDog/malicious-software-packages-dataset). Raw 88.2% (15,810/17,922). Adjusted ~100% on JS/Node.js malware when excluding out-of-scope samples (1,233 phishing HTML pages, 824 native binaries, 20 corrected libraries). See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md#14-datadog-17k-benchmark).
  - **TPR** (True Positive Rate): detection rate on 49 real-world supply-chain attacks (event-stream, ua-parser-js, coa, flatmap-stream, eslint-scope, solana-web3js, and 43 more). 3 misses are browser-only (lottie-player, polyfill-io, trojanized-jquery) — see [Threat Model](docs/threat-model.md).
  - **FPR** (False Positive Rate): packages scoring > 20 out of 529 real npm packages (source code scanned, not empty dirs).
- - **ADR** (Adversarial Detection Rate): detection rate on 102 evasive malicious samples — 62 adversarial + 40 holdout (5 adversarial waves + 4 holdout batches). 4 misses on available samples: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`, `setTimeout-eval-chain`, `setter-trap-exfil`.
+ - **ADR** (Adversarial Detection Rate): detection rate on 120 evasive malicious samples — 53 adversarial + 40 holdout (6 adversarial waves + 4 holdout batches). 75 available on disk. 2 misses on available samples: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`.
  - **Holdout** (pre-tuning): detection rate on 10 unseen samples with rules frozen (measures generalization)
 
- Datasets: 17,922 Datadog malware samples, 529 npm + 132 PyPI benign packages, 102 adversarial/holdout samples, 51 ground-truth attacks (65 documented malware packages). **1869 tests**, 86% code coverage.
+ Datasets: 17,922 Datadog malware samples, 532 npm + 132 PyPI benign packages, 120 adversarial/holdout samples (75 available on disk), 51 ground-truth attacks (65 documented malware packages). **1932 tests**, 86% code coverage.
 
  See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) for the full experimental protocol.
 
@@ -824,12 +829,12 @@ npm test
 
  ### Testing
 
- - **1869 unit/integration tests** across 43 modular test files - 86% code coverage via [Codecov](https://codecov.io/gh/DNSZLSK/muad-dib)
+ - **1932 unit/integration tests** across 44 modular test files - 86% code coverage via [Codecov](https://codecov.io/gh/DNSZLSK/muad-dib)
  - **56 fuzz tests** - Malformed YAML, invalid JSON, binary files, ReDoS, unicode, 10MB inputs
  - **Datadog 17K benchmark** - 17,922 real malware samples, 88.2% raw TPR, ~100% on JS/Node.js malware (2,077 out-of-scope misses: phishing, binaries, corrected libs)
- - **102 adversarial/holdout samples** - 62 adversarial + 40 holdout, 63/67 detection rate on available samples (94.0% ADR). 4 misses: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`, `setTimeout-eval-chain`, `setter-trap-exfil`
+ - **120 adversarial/holdout samples** - 53 adversarial + 40 holdout (75 available on disk), 73/75 detection rate (97.3% ADR). 2 misses: `require-cache-poison` (P3 trade-off), `getter-defineProperty-exfil`
  - **Ground truth validation** - 51 real-world attacks (46/49 detected = 93.9% TPR). 3 out-of-scope: browser-only (lottie-player, polyfill-io, trojanized-jquery)
- - **False positive validation** - 12.3% FPR global (65/529) on real npm source code via `npm pack`
+ - **False positive validation** - 12.3% FPR global (65/532) on real npm source code via `npm pack`
  - **ESLint security audit** - `eslint-plugin-security` with 14 rules enabled
 
  ---
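The rates quoted in the README hunks above can be recomputed from the raw counts they cite. The following is a minimal sketch that assumes plain one-decimal rounding; `pct` is a hypothetical helper and not part of the package. Under that assumption 65/532 rounds to 12.2%, so the unchanged 12.3% figure looks like it was carried over from the earlier 65/529 measurement.

```javascript
'use strict';

// Hypothetical helper: percentage rounded to one decimal place.
function pct(hits, total) {
  return Number(((hits / total) * 100).toFixed(1));
}

const rates = {
  wildTprRaw: pct(15810, 17922), // Datadog benchmark, raw
  groundTruthTpr: pct(46, 49),   // ground-truth attacks
  adrBefore: pct(63, 67),        // 2.5.17
  adrAfter: pct(73, 75),         // 2.6.1
  fprBefore: pct(65, 529),       // 2.5.17
  fprAfter: pct(65, 532),        // 2.6.1
};

// Out-of-scope categories listed for the Datadog misses:
// phishing HTML pages + native binaries + corrected libraries.
const outOfScope = 1233 + 824 + 20;

console.log(rates, outOfScope);
```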
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "muaddib-scanner",
-   "version": "2.5.17",
+   "version": "2.6.1",
    "description": "Supply-chain threat detection & response for npm & PyPI/Python",
    "main": "src/index.js",
    "bin": {
package/src/index.js CHANGED
@@ -23,12 +23,13 @@ const { ensureIOCs } = require('./ioc/bootstrap.js');
  const { scanEntropy } = require('./scanner/entropy.js');
  const { scanAIConfig } = require('./scanner/ai-config.js');
  const { deobfuscate } = require('./scanner/deobfuscate.js');
- const { buildModuleGraph, annotateTaintedExports, detectCrossFileFlows, annotateSinkExports, detectCallbackCrossFileFlows } = require('./scanner/module-graph.js');
+ const { buildModuleGraph, annotateTaintedExports, detectCrossFileFlows, annotateSinkExports, detectCallbackCrossFileFlows, detectEventEmitterFlows } = require('./scanner/module-graph.js');
  const { computeReachableFiles } = require('./scanner/reachability.js');
  const { runTemporalAnalyses } = require('./temporal-runner.js');
  const { formatOutput } = require('./output-formatter.js');
  const { setExtraExcludes, getExtraExcludes, Spinner, listInstalledPackages, clearFileListCache, debugLog } = require('./utils.js');
  const { SEVERITY_WEIGHTS, RISK_THRESHOLDS, MAX_RISK_SCORE, isPackageLevelThreat, computeGroupScore, applyFPReductions, calculateRiskScore } = require('./scoring.js');
+ const { buildIntentPairs } = require('./intent-graph.js');
 
  const { MAX_FILE_SIZE, safeParse } = require('./shared/constants.js');
  const walk = require('acorn-walk');
@@ -356,16 +357,27 @@ async function run(targetPath, options = {}) {
 
    // Cross-file module graph analysis (before individual scanners)
    // Wrapped in yieldThen to unblock spinner animation
+   // Bounded: 5s timeout to prevent DoS on large/adversarial packages
+   const MODULE_GRAPH_TIMEOUT_MS = 5000;
    let crossFileFlows = [];
    if (!options.noModuleGraph) {
-     try {
+     const moduleGraphWork = async () => {
        const graph = await yieldThen(() => buildModuleGraph(targetPath));
        const tainted = await yieldThen(() => annotateTaintedExports(graph, targetPath));
-       crossFileFlows = await yieldThen(() => detectCrossFileFlows(graph, tainted, targetPath));
-       // Callback-based cross-file flow detection
        const sinkAnnotations = await yieldThen(() => annotateSinkExports(graph, targetPath));
+       crossFileFlows = await yieldThen(() => detectCrossFileFlows(graph, tainted, sinkAnnotations, targetPath));
+       // Callback-based cross-file flow detection
        const callbackFlows = await yieldThen(() => detectCallbackCrossFileFlows(graph, tainted, sinkAnnotations, targetPath));
        crossFileFlows = crossFileFlows.concat(callbackFlows);
+       // EventEmitter cross-module flow detection
+       const emitterFlows = await yieldThen(() => detectEventEmitterFlows(graph, tainted, sinkAnnotations, targetPath));
+       crossFileFlows = crossFileFlows.concat(emitterFlows);
+     };
+     const timeout = new Promise((_, reject) =>
+       setTimeout(() => reject(new Error('Module graph timeout')), MODULE_GRAPH_TIMEOUT_MS)
+     );
+     try {
+       await Promise.race([moduleGraphWork(), timeout]);
      } catch (e) {
        // Graceful fallback — module graph is best-effort
        debugLog('[MODULE-GRAPH] Error:', e && e.message);
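The hunk above bounds the module-graph passes by racing them against a timer with `Promise.race`. The sketch below shows the same pattern in isolation. `withTimeout` and `slowWork` are hypothetical names, not part of muaddib-scanner, and clearing the timer in `finally` is an addition that the diff itself does not make. Losing the race does not cancel the underlying work; it only stops the caller from waiting on it.

```javascript
'use strict';

// Hypothetical helper: race a promise-returning task against a timer and
// clear the timer once either side settles.
function withTimeout(work, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timeout`)), ms);
  });
  return Promise.race([work(), timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical stand-in for the module-graph passes: resolves after delayMs.
const slowWork = (delayMs) => () =>
  new Promise((resolve) => setTimeout(() => resolve(['flow']), delayMs));

async function demo() {
  let flows = [];
  try {
    // The work takes 50 ms but the budget is 10 ms, so the timer wins.
    flows = await withTimeout(slowWork(50), 10, 'Module graph');
  } catch (e) {
    // Best-effort fallback, mirroring the catch block in the hunk above.
    flows = [];
  }
  // Here the work finishes well inside its budget.
  const fast = await withTimeout(slowWork(1), 100, 'Module graph');
  return { flows, fast };
}
```

Without the `clearTimeout` call, a pending timer would keep the Node.js event loop alive until it fires, even when the work finishes early.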
@@ -529,6 +541,23 @@ async function run(targetPath, options = {}) {
    // A malware package typically has 1-3 occurrences, not dozens.
    applyFPReductions(deduped, reachableFiles, packageName);
 
+   // Intent coherence analysis: detect source→sink pairs across files
+   const intentResult = buildIntentPairs(deduped);
+   // Add intent threats to deduped before enrichment so they get rules/playbooks
+   if (intentResult.intentThreats) {
+     for (const it of intentResult.intentThreats) {
+       // Respect reachability: downgrade intent threats in unreachable files
+       if (reachableFiles && reachableFiles.size > 0 && it.file) {
+         const normalizedFile = it.file.replace(/\\/g, '/');
+         if (!reachableFiles.has(normalizedFile)) {
+           it.severity = 'LOW';
+           it.unreachable = true;
+         }
+       }
+       deduped.push(it);
+     }
+   }
+
    // Enrich each threat with rules
    const enrichedThreats = deduped.map(t => {
      const rule = getRule(t.type);
@@ -550,12 +579,12 @@ async function run(targetPath, options = {}) {
      .map(t => ({ rule: t.rule_id, type: t.type, points: t.points, reason: t.message }))
      .sort((a, b) => b.points - a.points);
 
-   // Per-file max scoring (v2.2.11)
+   // Per-file max scoring (v2.2.11) with intent graph bonus
    const {
      riskScore, riskLevel, globalRiskScore,
-     maxFileScore, packageScore, mostSuspiciousFile, fileScores,
+     maxFileScore, packageScore, intentBonus, mostSuspiciousFile, fileScores,
      criticalCount, highCount, mediumCount, lowCount
-   } = calculateRiskScore(deduped);
+   } = calculateRiskScore(deduped, intentResult);
 
    // Python scan metadata
    const pythonInfo = pythonDeps.length > 0 ? {
@@ -0,0 +1,233 @@
+ 'use strict';
+
+ // ============================================
+ // INTENT GRAPH — Intra-File Coherence Analysis
+ // ============================================
+ // Boosts score when a SINGLE file contains both a high-confidence credential
+ // source AND a dangerous sink (eval, exec, network). This is genuinely suspicious
+ // because legitimate code rarely reads .npmrc and evals in the same file.
+ //
+ // DESIGN PRINCIPLES (informed by SpiderScan, Cerebro, taint-slicing research):
+ // 1. INTRA-FILE ONLY — cross-file pairing without proven data flow = FP explosion
+ //    (aws-sdk has process.env in config.js + https.request in http.js = not malicious)
+ // 2. Cross-file detection is handled by module-graph.js → cross_file_dataflow threats
+ // 3. Sources = ONLY high-confidence credential access (NOT env_access, NOT suspicious_dataflow)
+ // 4. Sinks = ONLY threats already identified by scanners (NO content-based scanning)
+ // 5. No double-counting — suspicious_dataflow is already a compound detection
+
+ // ============================================
+ // SOURCE CLASSIFICATION
+ // ============================================
+ const SOURCE_TYPES = {
+   sensitive_string: 'credential_read', // .npmrc, .ssh, .env file references
+   env_harvesting_dynamic: 'credential_read', // Object.keys(process.env), rest destructuring
+   credential_regex_harvest: 'credential_read', // regex patterns for tokens/passwords
+   llm_api_key_harvest: 'credential_read', // OPENAI_API_KEY, ANTHROPIC_API_KEY
+   credential_cli_steal: 'credential_read', // gh auth token, gcloud auth
+   // env_access EXCLUDED — standard config (process.env.PORT, AWS_REGION, NODE_ENV)
+   // suspicious_dataflow EXCLUDED — already compound detection
+   // cross_file_dataflow EXCLUDED — already scored CRITICAL by module-graph
+ };
+
+ // ============================================
+ // SINK CLASSIFICATION (from existing threats only)
+ // ============================================
+ const THREAT_SINK_TYPES = {
+   dangerous_call_eval: 'exec_sink',
+   dangerous_call_function: 'exec_sink',
+   staged_eval_decode: 'exec_sink',
+   vm_code_execution: 'exec_sink',
+   module_compile: 'exec_sink',
+   module_compile_dynamic: 'exec_sink',
+   credential_tampering: 'file_tamper',
+   git_hook_injection: 'file_tamper',
+   workflow_write: 'file_tamper',
+   mcp_config_injection: 'file_tamper',
+   ide_persistence: 'file_tamper',
+ };
+
+ // Message-based sink detection for threats not in THREAT_SINK_TYPES
+ const SINK_MESSAGE_PATTERNS = [
+   { pattern: /https?\.request|dns\.resolve|net\.connect/, type: 'network_external' },
+   { pattern: /webhook/i, type: 'network_external' },
+ ];
+
+ // ============================================
+ // COHERENCE MATRIX
+ // ============================================
+ // Only applied to intra-file pairs. Cross-file coherence is handled by module-graph.
+ const COHERENCE_MATRIX = {
+   credential_read: {
+     network_external: { modifier: 30, severity: 'CRITICAL' },
+     network_internal: { modifier: 10, severity: 'HIGH' },
+     exec_sink: { modifier: 25, severity: 'CRITICAL' },
+     file_local: { modifier: 5, severity: 'MEDIUM' },
+     file_tamper: { modifier: 20, severity: 'HIGH' },
+   },
+   fingerprint_read: {
+     network_external: { modifier: 0, severity: 'LOW' },
+     network_internal: { modifier: 0, severity: 'LOW' },
+     exec_sink: { modifier: 10, severity: 'MEDIUM' },
+     file_local: { modifier: 0, severity: 'LOW' },
+     file_tamper: { modifier: 5, severity: 'LOW' },
+   },
+   telemetry_read: {
+     network_external: { modifier: 0, severity: 'LOW' },
+     network_internal: { modifier: 0, severity: 'LOW' },
+     exec_sink: { modifier: 0, severity: 'LOW' },
+     file_local: { modifier: 0, severity: 'LOW' },
+     file_tamper: { modifier: 0, severity: 'LOW' },
+   },
+   config_read: {
+     network_external: { modifier: 5, severity: 'LOW' },
+     network_internal: { modifier: 0, severity: 'LOW' },
+     exec_sink: { modifier: 5, severity: 'LOW' },
+     file_local: { modifier: 0, severity: 'LOW' },
+     file_tamper: { modifier: 0, severity: 'LOW' },
+   },
+   command_output: {
+     network_external: { modifier: 20, severity: 'HIGH' },
+     network_internal: { modifier: 5, severity: 'MEDIUM' },
+     exec_sink: { modifier: 15, severity: 'HIGH' },
+     file_local: { modifier: 5, severity: 'MEDIUM' },
+     file_tamper: { modifier: 15, severity: 'HIGH' },
+   },
+ };
+
+ // Kept for backward compatibility but no longer used in pairing
+ // Cross-file detection is handled by module-graph.js (cross_file_dataflow)
+ const CROSS_FILE_MULTIPLIER = 0.5;
+
+ /**
+  * Classify a threat as a source type.
+  * Only high-confidence credential access patterns.
+  */
+ function classifySource(threat) {
+   if (SOURCE_TYPES[threat.type]) return SOURCE_TYPES[threat.type];
+
+   // Explicitly excluded types
+   if (threat.type === 'suspicious_dataflow') return null;
+   if (threat.type === 'env_access') return null;
+   if (threat.type === 'cross_file_dataflow') return null;
+
+   // Message-based: only for threats referencing sensitive file paths
+   if (threat.message) {
+     const msg = threat.message;
+     if (/\.npmrc|\.ssh\/|\.aws\/|id_rsa|\.gitconfig/i.test(msg)) {
+       return 'credential_read';
+     }
+   }
+
+   return null;
+ }
+
+ /**
+  * Classify a threat as a sink type.
+  * Only from existing threat types — no content scanning.
+  */
+ function classifySink(threat) {
+   if (THREAT_SINK_TYPES[threat.type]) return THREAT_SINK_TYPES[threat.type];
+
+   if (threat.message) {
+     for (const { pattern, type } of SINK_MESSAGE_PATTERNS) {
+       if (pattern.test(threat.message)) return type;
+     }
+   }
+
+   return null;
+ }
+
+ /**
+  * Build intent pairs from INTRA-FILE co-occurrence only.
+  * Cross-file detection is handled by module-graph.js (cross_file_dataflow).
+  *
+  * @param {Array} threats - deduplicated threat array
+  * @returns {Object} { pairs, intentScore, intentThreats }
+  */
+ function buildIntentPairs(threats) {
+   // Only consider MEDIUM+ threats. LOW severity means applyFPReductions already
+   // determined this is noise (bundler artifact, dist/ file, count threshold exceeded).
+   // Re-elevating LOW threats via intent pairing would undo FP reductions.
+   const eligible = threats.filter(t => t.severity !== 'LOW');
+
+   // Group eligible threats by file
+   const byFile = new Map();
+   for (const t of eligible) {
+     const file = t.file || '(unknown)';
+     if (!byFile.has(file)) byFile.set(file, []);
+     byFile.get(file).push(t);
+   }
+
+   const pairSet = new Set();
+   const pairs = [];
+   let intentScore = 0;
+
+   // Only pair sources and sinks within the SAME file
+   for (const [file, fileThreats] of byFile) {
+     const sources = [];
+     const sinks = [];
+
+     for (const t of fileThreats) {
+       const srcType = classifySource(t);
+       const sinkType = classifySink(t);
+       if (srcType) sources.push(srcType);
+       if (sinkType) sinks.push(sinkType);
+     }
+
+     if (sources.length === 0 || sinks.length === 0) continue;
+
+     // Deduplicate source×sink combinations within this file
+     for (const srcType of new Set(sources)) {
+       const srcMatrix = COHERENCE_MATRIX[srcType];
+       if (!srcMatrix) continue;
+
+       for (const sinkType of new Set(sinks)) {
+         const entry = srcMatrix[sinkType];
+         if (!entry || entry.modifier === 0) continue;
+
+         const pairKey = `${srcType}:${sinkType}:${file}`;
+         if (pairSet.has(pairKey)) continue;
+         pairSet.add(pairKey);
+
+         pairs.push({
+           sourceType: srcType,
+           sinkType,
+           severity: entry.severity,
+           modifier: entry.modifier,
+           crossFile: false,
+           sourceFile: file,
+           sinkFile: file
+         });
+         intentScore += entry.modifier;
+       }
+     }
+   }
+
+   // Generate intent threats only for high-confidence pairs (modifier >= 25)
+   const intentThreats = [];
+   for (const pair of pairs) {
+     if (pair.modifier >= 25) {
+       const type = pair.sourceType === 'credential_read'
+         ? 'intent_credential_exfil'
+         : pair.sourceType === 'command_output'
+           ? 'intent_command_exfil'
+           : 'intent_credential_exfil';
+       intentThreats.push({
+         type,
+         severity: pair.severity,
+         message: `Intent coherence: ${pair.sourceType} → ${pair.sinkType} (${pair.sourceFile})`,
+         file: pair.sourceFile
+       });
+     }
+   }
+
+   return { pairs, intentScore, intentThreats };
+ }
+
+ module.exports = {
+   classifySource,
+   classifySink,
+   buildIntentPairs,
+   COHERENCE_MATRIX,
+   CROSS_FILE_MULTIPLIER
+ };
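To make the intra-file pairing in the new file concrete, here is a condensed sketch. It is not the module itself: the tables are a small subset of `SOURCE_TYPES`, `THREAT_SINK_TYPES` and `COHERENCE_MATRIX`, `pairWithinFiles` is a hypothetical name, message-based classification is left out, and the sample threats (file names, severities) are made up for illustration.

```javascript
'use strict';

// Trimmed-down tables: a subset of the ones in the file above.
const SOURCES = { sensitive_string: 'credential_read', credential_cli_steal: 'credential_read' };
const SINKS = { dangerous_call_eval: 'exec_sink', workflow_write: 'file_tamper' };
const MATRIX = {
  credential_read: {
    exec_sink: { modifier: 25, severity: 'CRITICAL' },
    file_tamper: { modifier: 20, severity: 'HIGH' },
  },
};

// Simplified pairing: group MEDIUM+ threats by file, then score each distinct
// source/sink combination that occurs inside a single file.
function pairWithinFiles(threats) {
  const byFile = new Map();
  for (const t of threats) {
    if (t.severity === 'LOW') continue; // LOW was already judged to be noise
    const file = t.file || '(unknown)';
    if (!byFile.has(file)) byFile.set(file, []);
    byFile.get(file).push(t);
  }
  const pairs = [];
  let intentScore = 0;
  for (const [file, fileThreats] of byFile) {
    const srcTypes = new Set(fileThreats.map((t) => SOURCES[t.type]).filter(Boolean));
    const sinkTypes = new Set(fileThreats.map((t) => SINKS[t.type]).filter(Boolean));
    for (const src of srcTypes) {
      for (const sink of sinkTypes) {
        const entry = MATRIX[src] && MATRIX[src][sink];
        if (!entry || entry.modifier === 0) continue;
        pairs.push({ sourceType: src, sinkType: sink, file, ...entry });
        intentScore += entry.modifier;
      }
    }
  }
  return { pairs, intentScore };
}

// Hypothetical threat list.
const sample = [
  { type: 'sensitive_string', severity: 'HIGH', file: 'lib/a.js' },
  { type: 'dangerous_call_eval', severity: 'HIGH', file: 'lib/a.js' },
  { type: 'credential_cli_steal', severity: 'HIGH', file: 'lib/b.js' },
  { type: 'workflow_write', severity: 'LOW', file: 'lib/b.js' },
  { type: 'dangerous_call_eval', severity: 'HIGH', file: 'lib/c.js' },
];
const result = pairWithinFiles(sample);
console.log(result.intentScore, result.pairs.map((p) => p.file));
```

Only `lib/a.js` produces a pair: `lib/b.js` has a source, but its sink was downgraded to LOW, and `lib/c.js` has a sink with no source in the same file.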
@@ -486,6 +486,15 @@ const PLAYBOOKS = {
      'CRITIQUE: Un Proxy JavaScript avec trap set/get/apply est combine avec un appel reseau. ' +
      'Technique d\'interception: le Proxy capture toutes les ecritures de proprietes (credentials, tokens, config) ' +
      'et les exfiltre via HTTPS/fetch/dgram. Supprimer le package. Auditer tous les modules qui importent ce package.',
+   intent_credential_exfil:
+     'CRITIQUE: Coherence d\'intention detectee — lecture de credentials combinee avec exfiltration reseau. ' +
+     'Pattern multi-fichier DPRK/Lazarus: chaque fichier semble legitime individuellement mais le package ' +
+     'dans son ensemble collecte des secrets et les envoie sur le reseau. Supprimer le package immediatement. ' +
+     'Regenerer tous les tokens/credentials exposes. Auditer le package.json pour les scripts lifecycle.',
+   intent_command_exfil:
+     'Coherence d\'intention detectee — sortie de commande systeme combinee avec exfiltration reseau. ' +
+     'Le package execute des commandes et transmet les resultats. Verifier les commandes executees. ' +
+     'Supprimer le package si non attendu. Auditer les logs reseau pour identifier les donnees exfiltrees.',
  };
 
  function getPlaybook(threatType) {
@@ -1357,6 +1357,31 @@ const RULES = {
      ],
      mitre: 'T1557'
    },
+   // Intent Graph rules (v2.6.0)
+   intent_credential_exfil: {
+     id: 'MUADDIB-INTENT-001',
+     name: 'Intent Credential Exfiltration',
+     severity: 'CRITICAL',
+     confidence: 'high',
+     description: 'Coherence d\'intention: lecture de credentials (fichiers sensibles, env vars) combinee avec un sink reseau ou exec dans le meme package. Pattern typique DPRK/Lazarus: code malveillant fragmente sur plusieurs fichiers avec uniquement des APIs legitimes.',
+     references: [
+       'https://attack.mitre.org/techniques/T1041/',
+       'https://www.cisa.gov/news-events/cybersecurity-advisories/aa22-108a'
+     ],
+     mitre: 'T1041'
+   },
+   intent_command_exfil: {
+     id: 'MUADDIB-INTENT-002',
+     name: 'Intent Command Output Exfiltration',
+     severity: 'HIGH',
+     confidence: 'medium',
+     description: 'Coherence d\'intention: sortie de commande systeme combinee avec un sink reseau. Le code execute des commandes et transmet les resultats sur le reseau — reconnaissance ou exfiltration.',
+     references: [
+       'https://attack.mitre.org/techniques/T1059/',
+       'https://attack.mitre.org/techniques/T1041/'
+     ],
+     mitre: 'T1059'
+   },
  };
 
  function getRule(type) {
@@ -389,6 +389,32 @@ function handleVariableDeclarator(node, ctx) {
        if (cn === 'eval' || cn === 'Function') ctx.evalAliases.set(node.id.name, cn);
      }
    }
+   // B1 fix: const getEval = () => eval; — 0-param arrow/function returning eval/Function as value (not call)
+   if ((node.init?.type === 'ArrowFunctionExpression' || node.init?.type === 'FunctionExpression')) {
+     const body = node.init.body;
+     // () => eval
+     if (body?.type === 'Identifier' && (body.name === 'eval' || body.name === 'Function')) {
+       ctx.evalAliases.set(node.id.name, body.name + '_factory');
+     }
+     // () => { return eval; }
+     if (body?.type === 'BlockStatement' && body.body?.length === 1 &&
+         body.body[0].type === 'ReturnStatement' &&
+         body.body[0].argument?.type === 'Identifier' &&
+         (body.body[0].argument.name === 'eval' || body.body[0].argument.name === 'Function')) {
+       ctx.evalAliases.set(node.id.name, body.body[0].argument.name + '_factory');
+     }
+   }
+
+   // B1 fix: const compiler = getCompiler() where getCompiler is eval_factory
+   if (node.init?.type === 'CallExpression' &&
+       node.init.callee?.type === 'Identifier' &&
+       ctx.evalAliases?.has(node.init.callee.name)) {
+     const aliased = ctx.evalAliases.get(node.init.callee.name);
+     if (aliased.endsWith('_factory')) {
+       const baseName = aliased.replace('_factory', '');
+       ctx.evalAliases.set(node.id.name, baseName);
+     }
+   }
 
    // B5: Track object literal string properties
    if (node.init?.type === 'ObjectExpression') {
@@ -558,6 +584,23 @@ function handleCallExpression(node, ctx) {
          });
        }
      }
+     // B3 fix: require(/child_process/.source) — RegExpLiteral.source resolution
+     else if (arg.type === 'MemberExpression' &&
+         arg.object?.type === 'Literal' && arg.object.regex &&
+         arg.property?.type === 'Identifier' && arg.property.name === 'source') {
+       const regexSource = arg.object.regex.pattern;
+       const DANGEROUS_MODS = ['child_process', 'fs', 'net', 'dns', 'http', 'https', 'tls'];
+       const norm = regexSource.startsWith('node:') ? regexSource.slice(5) : regexSource;
+       if (DANGEROUS_MODS.includes(norm)) {
+         ctx.threats.push({ type: 'dynamic_require', severity: 'CRITICAL',
+           message: `require(/${regexSource}/.source) resolves to "${norm}" — regex .source evasion.`,
+           file: ctx.relFile });
+       } else {
+         ctx.threats.push({ type: 'dynamic_require', severity: 'HIGH',
+           message: `require() with regex .source argument (/${regexSource}/.source) — obfuscation technique.`,
+           file: ctx.relFile });
+       }
+     }
      // B5: require(obj.prop) — MemberExpression argument
      else if (arg.type === 'MemberExpression') {
        const objName = arg.object?.type === 'Identifier' ? arg.object.name : null;
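The B3 hunk targets a trick where a module name is hidden in a regular-expression literal and recovered at runtime through `.source`, so no string literal containing the name ever appears in the code. A minimal sketch of why that works, plus the same normalise-then-classify step in isolation; `classifyRegexSource` is a hypothetical name, and only the module list and the `node:` prefix handling are taken from the hunk.

```javascript
'use strict';

// At runtime the regex literal yields the module name as a plain string.
const hidden = /child_process/.source;

// Same module list and "node:" handling as the hunk above.
const DANGEROUS_MODS = ['child_process', 'fs', 'net', 'dns', 'http', 'https', 'tls'];

// Hypothetical helper: map a regex pattern used as a require() argument to a severity.
function classifyRegexSource(pattern) {
  const norm = pattern.startsWith('node:') ? pattern.slice(5) : pattern;
  return DANGEROUS_MODS.includes(norm) ? 'CRITICAL' : 'HIGH';
}

console.log(hidden, classifyRegexSource(hidden), classifyRegexSource(/node:fs/.source));
console.log(classifyRegexSource(/lodash/.source));
```

Any other pattern still maps to HIGH, which matches the hunk's treatment of a regex `.source` argument as an obfuscation signal in its own right.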
@@ -1013,14 +1056,37 @@ function handleCallExpression(node, ctx) {
    // B1: Alias call — E('code') where E = eval or F = Function
    if (node.callee.type === 'Identifier' && ctx.evalAliases?.has(node.callee.name)) {
      const aliased = ctx.evalAliases.get(node.callee.name);
-     ctx.hasEvalInFile = true;
-     ctx.hasDynamicExec = true;
-     ctx.threats.push({
-       type: aliased === 'eval' ? 'dangerous_call_eval' : 'dangerous_call_function',
-       severity: 'HIGH',
-       message: `Indirect ${aliased} via alias "${node.callee.name}" eval wrapper evasion.`,
-       file: ctx.relFile
-     });
+     if (aliased.endsWith('_factory')) {
+       // Factory pattern: getEval()('code') — the callee is detected at outer CallExpression level
+       // Mark the identifier so we can detect the outer call
+     } else {
+       ctx.hasEvalInFile = true;
+       ctx.hasDynamicExec = true;
+       ctx.threats.push({
+         type: aliased === 'eval' ? 'dangerous_call_eval' : 'dangerous_call_function',
+         severity: 'HIGH',
+         message: `Indirect ${aliased} via alias "${node.callee.name}" — eval wrapper evasion.`,
+         file: ctx.relFile
+       });
+     }
+   }
+
+   // B1 fix: Factory call — getEval()('code') where getEval = () => eval
+   if (node.callee.type === 'CallExpression' &&
+       node.callee.callee?.type === 'Identifier' &&
+       ctx.evalAliases?.has(node.callee.callee.name)) {
+     const aliased = ctx.evalAliases.get(node.callee.callee.name);
+     if (aliased.endsWith('_factory')) {
+       const baseName = aliased.replace('_factory', '');
+       ctx.hasEvalInFile = true;
+       ctx.hasDynamicExec = true;
+       ctx.threats.push({
+         type: baseName === 'eval' ? 'dangerous_call_eval' : 'dangerous_call_function',
+         severity: 'HIGH',
+         message: `Indirect ${baseName} via factory function "${node.callee.callee.name}()" — eval factory evasion.`,
+         file: ctx.relFile
+       });
+     }
    }
 
    if (callName === 'eval') {
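The two B1 hunks handle `eval` being handed out by a zero-parameter factory instead of being aliased directly. The sketch below first runs the evasion shapes with harmless payloads, then models the alias bookkeeping in a simplified form. `recordFactory` and `recordCallResult` are hypothetical names; the real detector works on AST nodes and is not reproduced here.

```javascript
'use strict';

// Evasion shapes targeted by the B1 fix (harmless payloads).
const getEval = () => eval;            // 0-param factory returning eval as a value
const viaFactory = getEval()('1 + 1'); // factory call: getEval()('code')
const compiler = getEval();            // alias obtained from the factory
const viaAlias = compiler('2 + 2');    // then called like a plain alias

// Simplified model of the bookkeeping: a factory is stored with a "_factory"
// suffix, and a variable initialised from a factory call resolves to the base name.
const evalAliases = new Map();

function recordFactory(name, returned) {
  evalAliases.set(name, `${returned}_factory`);
}

function recordCallResult(name, calleeName) {
  const aliased = evalAliases.get(calleeName);
  if (aliased && aliased.endsWith('_factory')) {
    evalAliases.set(name, aliased.replace('_factory', ''));
  }
}

recordFactory('getEval', 'eval');        // const getEval = () => eval;
recordCallResult('compiler', 'getEval'); // const compiler = getEval();
console.log(viaFactory, viaAlias, [...evalAliases]);
```

Once `compiler` resolves to `eval` in the map, a later `compiler('code')` call is caught by the existing alias-call branch, which is why that branch now skips entries that still carry the `_factory` suffix.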
@@ -1119,6 +1185,24 @@ function handleCallExpression(node, ctx) {
          file: ctx.relFile
        });
      }
+     // B2 fix: Function.prototype.call.call(eval, null, code) / X.call.call(eval, ...)
+     // Deep MemberExpression: obj is itself a MemberExpression ending in .call/.apply
+     if (obj?.type === 'MemberExpression' &&
+         obj.property?.type === 'Identifier' &&
+         (obj.property.name === 'call' || obj.property.name === 'apply') &&
+         node.arguments.length >= 2) {
+       const firstArg = node.arguments[0];
+       if (firstArg?.type === 'Identifier' && (firstArg.name === 'eval' || firstArg.name === 'Function')) {
+         ctx.hasEvalInFile = true;
+         ctx.hasDynamicExec = true;
+         ctx.threats.push({
+           type: firstArg.name === 'eval' ? 'dangerous_call_eval' : 'dangerous_call_function',
+           severity: 'HIGH',
+           message: `${firstArg.name} passed to .call.call() — nested call/apply evasion technique.`,
+           file: ctx.relFile
+         });
+       }
+     }
    }
 
    // Detect array access pattern: [require][0]('child_process') or [eval][0](code)
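The B2 hunk covers nested `call`/`apply`, where `eval` only ever appears as an argument and never as a callee. The sketch below runs the shape once with a harmless payload and then applies an equivalent structural check to a hand-built ESTree-style node. `isNestedCallEvasion` is a hypothetical name, and it assumes that `obj` in the hunk refers to `node.callee.object`, which the visible context does not show.

```javascript
'use strict';

// Function.prototype.call.call(f, thisArg, ...args) invokes f.call(thisArg, ...args).
const nested = Function.prototype.call.call(eval, null, '3 + 3');

// Hypothetical, simplified check over an ESTree-like CallExpression node.
function isNestedCallEvasion(node) {
  const callee = node.callee;
  if (!callee || callee.type !== 'MemberExpression') return false;
  const obj = callee.object; // assumption: this is the hunk's `obj`
  const first = node.arguments[0];
  return Boolean(
    obj && obj.type === 'MemberExpression' &&
    obj.property && obj.property.type === 'Identifier' &&
    (obj.property.name === 'call' || obj.property.name === 'apply') &&
    node.arguments.length >= 2 &&
    first && first.type === 'Identifier' &&
    (first.name === 'eval' || first.name === 'Function')
  );
}

// Hand-built node for: X.call.call(eval, null, code)
const evasive = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    object: {
      type: 'MemberExpression',
      object: { type: 'Identifier', name: 'X' },
      property: { type: 'Identifier', name: 'call' },
    },
    property: { type: 'Identifier', name: 'call' },
  },
  arguments: [
    { type: 'Identifier', name: 'eval' },
    { type: 'Literal', value: null },
    { type: 'Identifier', name: 'code' },
  ],
};

// Same shape, but the first argument is an ordinary function name.
const benign = { ...evasive, arguments: [{ type: 'Identifier', name: 'handler' }, { type: 'Literal', value: null }] };

console.log(nested, isNestedCallEvasion(evasive), isNestedCallEvasion(benign));
```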
@@ -109,6 +109,31 @@ function buildTaintMap(ast) {
          }
        }
      }
+
+     // B5 fix: const tools = { read: fs.readFileSync, home: os.homedir }
+     // Track object properties that reference tainted module methods as tainted aliases
+     if (node.id.type === 'Identifier' && init.type === 'ObjectExpression') {
+       for (const prop of init.properties) {
+         if (prop.type !== 'Property') continue;
+         const key = prop.key?.type === 'Identifier' ? prop.key.name :
+           (prop.key?.type === 'Literal' ? String(prop.key.value) : null);
+         if (!key) continue;
+         // Property value is a MemberExpression on a tainted module: fs.readFileSync, os.homedir
+         if (prop.value?.type === 'MemberExpression' &&
+             prop.value.object?.type === 'Identifier' &&
+             prop.value.property?.type === 'Identifier') {
+           const parentTaint = taintMap.get(prop.value.object.name);
+           if (parentTaint && TRACKED_MODULES.has(parentTaint.source)) {
+             const methodName = prop.value.property.name;
+             // Store as "objName.key" so tools.read calls are resolved
+             taintMap.set(`${node.id.name}.${key}`, {
+               source: parentTaint.source,
+               detail: `${parentTaint.source}.${methodName}`
+             });
+           }
+         }
+       }
+     }
    }
  });
 
@@ -466,6 +491,42 @@ function analyzeFile(content, filePath, basePath) {
          });
        }
      }
+
+     // B5 fix: object method alias — tools.read(...) where tools.read = fs.readFileSync
+     const aliasKey = `${obj.name}.${prop.name}`;
+     const aliasTaint = taintMap.get(aliasKey);
+     if (aliasTaint && aliasTaint.detail.includes('.')) {
+       const [aliasModule, aliasMethod] = aliasTaint.detail.split('.');
+       const aliasSourceMethods = MODULE_SOURCE_METHODS[aliasModule];
+       if (aliasSourceMethods && aliasSourceMethods[aliasMethod]) {
+         // For credential_read: also check if the argument is a sensitive path
+         const isCredArg = node.arguments[0] && isCredentialPath(node.arguments[0], sensitivePathVars);
+         if (aliasSourceMethods[aliasMethod] === 'credential_read' && isCredArg) {
+           sources.push({
+             type: 'credential_read',
+             name: `${aliasModule}.${aliasMethod}`,
+             line: node.loc?.start?.line,
+             taint_tracked: true
+           });
+         } else if (aliasSourceMethods[aliasMethod] !== 'credential_read') {
+           sources.push({
+             type: aliasSourceMethods[aliasMethod],
+             name: `${aliasModule}.${aliasMethod}`,
+             line: node.loc?.start?.line,
+             taint_tracked: true
+           });
+         }
+       }
+       const aliasSinkMethods = MODULE_SINK_METHODS[aliasModule];
+       if (aliasSinkMethods && aliasSinkMethods[aliasMethod]) {
+         sinks.push({
+           type: aliasSinkMethods[aliasMethod],
+           name: `${aliasModule}.${aliasMethod}`,
+           line: node.loc?.start?.line,
+           taint_tracked: true
+         });
+       }
+     }
    }
  }
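The two B5 hunks record object-literal properties that point at tracked module methods under an `"objName.key"` entry in the taint map, then resolve calls such as `tools.read(...)` through that entry. A minimal sketch of that keying scheme follows. `trackObjectAliases` and `resolveCall` are hypothetical names, the tables are simplified stand-ins, and properties are passed as plain `[moduleVariable, methodName]` pairs rather than AST nodes.

```javascript
'use strict';

// Hypothetical, simplified stand-ins for the scanner's tables.
const TRACKED_MODULES = new Set(['fs', 'os']);
const taintMap = new Map([
  ['fs', { source: 'fs' }], // const fs = require('fs')
  ['os', { source: 'os' }], // const os = require('os')
]);

// Record "objName.key" aliases for properties that point at tracked module methods.
// `props` maps each key to [moduleVariable, methodName].
function trackObjectAliases(objName, props) {
  for (const [key, [moduleVar, method]] of Object.entries(props)) {
    const parent = taintMap.get(moduleVar);
    if (parent && TRACKED_MODULES.has(parent.source)) {
      taintMap.set(`${objName}.${key}`, {
        source: parent.source,
        detail: `${parent.source}.${method}`,
      });
    }
  }
}

// Resolve a call such as tools.read(...) back to the module method it aliases.
function resolveCall(objName, propName) {
  const alias = taintMap.get(`${objName}.${propName}`);
  return alias ? alias.detail : null;
}

// Models: const tools = { read: fs.readFileSync, home: os.homedir, log: console.log }
trackObjectAliases('tools', {
  read: ['fs', 'readFileSync'],
  home: ['os', 'homedir'],
  log: ['console', 'log'],
});
console.log(resolveCall('tools', 'read'), resolveCall('tools', 'home'), resolveCall('tools', 'log'));
```

In the hunk, the resolved `module.method` string is then looked up in `MODULE_SOURCE_METHODS` and `MODULE_SINK_METHODS`, with the extra credential-path check applied to `credential_read` sources.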