muaddib-scanner 2.10.96 → 2.10.98

package/README.md CHANGED
@@ -30,7 +30,7 @@
30
30
 
31
31
  npm and PyPI supply-chain attacks are exploding. Shai-Hulud compromised 25K+ repos in 2025. Existing tools detect threats but don't help you respond.
32
32
 
33
- MUAD'DIB combines **14 parallel scanners** (200 detection rules), a **deobfuscation engine**, **inter-module dataflow analysis**, **compound scoring**, **ML classifiers** (XGBoost), and gVisor/Docker sandbox to detect known threats and suspicious behavioral patterns in npm and PyPI packages.
33
+ MUAD'DIB combines **14 parallel scanners** (209 detection rules), a **deobfuscation engine**, **inter-module dataflow analysis**, **compound scoring**, **ML classifiers** (XGBoost), and gVisor/Docker sandbox to detect known threats and suspicious behavioral patterns in npm and PyPI packages.
34
34
 
35
35
  ---
36
36
 
@@ -169,7 +169,7 @@ muaddib scrape # Full IOC refresh (~5min)
169
169
  muaddib diff HEAD~1 # Compare threats with previous commit
170
170
  muaddib init-hooks # Pre-commit hooks (husky/pre-commit/git)
171
171
  muaddib scan . --breakdown # Explainable score decomposition
172
- muaddib replay # Ground truth validation (60/64 TPR@3)
172
+ muaddib replay # Ground truth validation (61/65 TPR@3)
173
173
  ```
174
174
 
175
175
  ---
@@ -195,7 +195,7 @@ muaddib replay # Ground truth validation (60/64 TPR@3)
195
195
  | GitHub Actions | Shai-Hulud backdoor detection |
196
196
  | Hash Scanner | Known malicious file hashes |
197
197
 
198
- ### 200 detection rules
198
+ ### 209 detection rules
199
199
 
200
200
  All rules are mapped to MITRE ATT&CK techniques. See [SECURITY.md](SECURITY.md#detection-rules-v21021) for the complete rules reference.
201
201
 
@@ -271,7 +271,7 @@ With pre-commit framework:
271
271
  ```yaml
272
272
  repos:
273
273
  - repo: https://github.com/DNSZLSK/muad-dib
274
- rev: v2.10.57
274
+ rev: v2.10.97
275
275
  hooks:
276
276
  - id: muaddib-scan
277
277
  ```
@@ -285,14 +285,14 @@ repos:
285
285
  | **ML FPR** | **2.85%** (239/8,393 holdout) | XGBoost retrained on 56,564 samples, 64 features, threshold=0.710 |
286
286
  | **ML TPR** | **99.93%** (2,918/2,920 holdout) | 377 confirmed_malicious via OSSF/GHSA/npm correlation |
287
287
  | **Wild TPR** (Datadog 17K) | **92.8%** (13,538/14,587 in-scope) | 17,922 packages. 3,335 skipped (no JS). By category: compromised_lib 97.8%, malicious_intent 92.1% |
288
- | **TPR@3** (detection rate) | **93.75%** (60/64) | 66 real attacks (64 active, 2 out-of-scope). Threshold=3: any signal |
289
- | **TPR@20** (alert rate) | **85.9%** (55/64) | Operational alert threshold=20, aligned with ADR/FPR |
290
- | **FPR rules** (Benign curated) | **14.0%** (74/532) | 532 npm packages, real source via `npm pack` |
291
- | **FPR after ML** | **8.3%** (44/529) | ML filters 30/31 T1 benign, 0 GT/ADR suppressed |
292
- | **FPR** (Benign random) | **7.5%** (15/200) | 200 random npm packages, stratified sampling |
288
+ | **TPR@3** (detection rate) | **93.85%** (61/65) | 67 real attacks (65 active, 2 out-of-scope: GT-005 colors, GT-009 faker — protestware with min_threats=0). Threshold=3: any signal |
289
+ | **TPR@20** (alert rate) | **86.2%** (56/65) | Operational alert threshold=20, aligned with ADR/FPR |
290
+ | **FPR rules** (Benign curated, v2.10.95 measure) | **15.6%** (85/545 scanned, 548 total) | npm packages, real source via `npm pack`; v2.10.74 estimated 6-9% reduction did NOT materialize on rebuilt corpus |
291
+ | **FPR after ML** (v2.10.95 measure) | **10.28%** (56/545 scanned) | ML filters 29/30 T1 benign, 0 GT/ADR suppressed |
292
+ | **FPR** (Benign random, v2.10.95 measure) | **7.0%** (14/200) | 200 random npm packages, stratified sampling |
293
293
  | **ADR** (Adversarial + Holdout) | **96.3%** (103/107) | 67 adversarial + 40 holdout (107 available on disk), global threshold=20 |
294
294
 
295
- **3230 tests** across 66 files. **207 rules** (202 RULES + 5 PARANOID).
295
+ **3280 tests** across 69 files. **209 rules** (204 RULES + 5 PARANOID).
296
296
 
297
297
  > **ML retrain methodology (v2.10.51):**
298
298
  > - Ground truth: 377 confirmed_malicious via auto-labeler (OSSF malicious-packages, GitHub Advisory Database, npm registry takedown correlation)
@@ -301,7 +301,7 @@ repos:
301
301
  > - Leaky feature filter: 23 dead/leaky features removed (source-identity proxies)
302
302
  >
303
303
  > **Static evaluation caveats:**
304
- > - TPR measured on 64 active Node.js attack samples (2 out-of-scope from 66 total)
304
+ > - TPR measured on 65 active Node.js attack samples (2 out-of-scope: GT-005 colors, GT-009 faker, both protestware with min_threats=0; from 67 total)
305
305
  > - TPR@3 = detection rate (any signal); TPR@20 = operational alert threshold
306
306
  > - FPR measured on 532 curated popular npm packages (not a random sample)
307
307
  > - ADR measured with global threshold (score >= 20) as of v2.6.5
@@ -340,11 +340,11 @@ npm test
340
340
 
341
341
  ### Testing
342
342
 
343
- - **3230 tests** across 66 modular test files
343
+ - **3280 tests** across 69 modular test files
344
344
  - **56 fuzz tests** - Malformed inputs, ReDoS, unicode, binary
345
345
  - **Datadog 17K benchmark** - 14,587 confirmed malware samples (in-scope)
346
- - **Ground truth validation** - 67 real-world attacks (93.75% TPR@3, 85.9% TPR@20)
347
- - **False positive validation** - 14.0% FPR rules, 8.3% after ML on 532 curated npm packages, 7.5% on 200 random
346
+ - **Ground truth validation** - 67 real-world attacks (93.85% TPR@3, 86.2% TPR@20 — v2.10.95 measure)
347
+ - **False positive validation** (v2.10.95 measure) - 15.6% FPR rules (85/545 scanned), 10.28% after ML (56/545 scanned), 7.0% on 200 random
348
348
 
349
349
  ---
350
350
 
@@ -361,8 +361,7 @@ npm test
361
361
  - [Documentation Index](docs/INDEX.md) - All documentation in one place
362
362
  - [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) - Experimental protocol, holdout scores
363
363
  - [Threat Model](docs/threat-model.md) - What MUAD'DIB detects and doesn't detect
364
- - [Adversarial Evaluation](ADVERSARIAL.md) - Red team samples and ADR results
365
- - [Security Policy](SECURITY.md) - Detection rules reference (207 rules)
364
+ - [Security Policy](SECURITY.md) - Detection rules reference (209 rules)
366
365
  - [Security Audit](docs/SECURITY_AUDIT.md) - Bypass validation report
367
366
  - [FP Analysis](docs/EVALUATION.md) - Historical false positive analysis
368
367
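Note on the README metrics: the updated percentages follow directly from the counts quoted alongside them. A minimal sketch of that arithmetic is below; the helper name `pct` and the `reported` object are illustrative only and are not part of the package.

```javascript
// Hypothetical helper (not from muaddib-scanner): recompute the reported
// rates from the raw counts quoted in the README hunks.
function pct(hits, total, digits) {
  return ((hits / total) * 100).toFixed(digits);
}

const reported = {
  tprAt3: pct(61, 65, 2),      // TPR@3 (detection rate)
  tprAt20: pct(56, 65, 1),     // TPR@20 (alert rate)
  fprRules: pct(85, 545, 1),   // FPR rules, benign curated
  fprAfterMl: pct(56, 545, 2), // FPR after ML
  fprRandom: pct(14, 200, 1),  // FPR, benign random
};

console.log(reported);
```

Each value matches the figure stated in the new README text (93.85, 86.2, 15.6, 10.28, 7.0).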
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "muaddib-scanner",
3
- "version": "2.10.96",
3
+ "version": "2.10.98",
4
4
  "description": "Supply-chain threat detection & response for npm & PyPI/Python",
5
5
  "main": "src/index.js",
6
6
  "bin": {
@@ -696,7 +696,11 @@ function extractFeatures(result, meta) {
696
696
  features.typosquat_scoped_package = typosquatScopedPackage(result, meta) ? 1 : 0;
697
697
  features.obfuscation_without_vector = obfuscationWithoutVector(result) ? 1 : 0;
698
698
  features.placeholder_anti_dep_confusion = placeholderAntiDepConfusion(result, meta) ? 1 : 0;
699
- features.install_script_no_network_egress = installScriptNoNetworkEgress(result, meta) ? 1 : 0;
699
+ // F8 disabled for retrain fires on malware due to incomplete EGRESS_TYPES
700
+ // (missing dangerous_exec, lifecycle_dangerous_exec, node_inline_exec).
701
+ // Re-enable in v2.10.97 after EGRESS_TYPES fix + re-validation.
702
+ // See ml-retrain/ml-auc-v2.10.96.md for details.
703
+ features.install_script_no_network_egress = 0; // installScriptNoNetworkEgress(result, meta) ? 1 : 0;
700
704
 
701
705
  return features;
702
706
  }
@@ -56,6 +56,27 @@ const PLATEAU_STREAK_REQUIRED = 2; // must see flat throughput N times before tr
56
56
  * @returns {{ target: number, reason: string }}
57
57
  */
58
58
  function computeTarget(current, queueDepth, stats) {
59
+ // Priority 0: V8 heap pressure — os.freemem() misses this entirely.
60
+ // With --max-old-space-size=8192 on a 12GB VPS, system RAM can show 7GB free
61
+ // while V8 heap is at 90% and GC is thrashing. Use the daemon's circuit breaker
62
+ // level to gate concurrency before system RAM pressure kicks in.
63
+ try {
64
+ const { getMemoryPressureLevel, MEMORY_PRESSURE_LEVELS } = require('./daemon.js');
65
+ const heapPressure = getMemoryPressureLevel();
66
+ if (heapPressure >= MEMORY_PRESSURE_LEVELS.HIGH) {
67
+ const target = clamp(MIN_CONCURRENCY);
68
+ _prevScanned = stats.scanned || 0;
69
+ _prevTimeouts = (stats.errorsByType && stats.errorsByType.static_timeout) || 0;
70
+ return { target, reason: `heap_pressure_high (level=${heapPressure}, dropping to min=${MIN_CONCURRENCY})` };
71
+ }
72
+ if (heapPressure >= MEMORY_PRESSURE_LEVELS.ELEVATED) {
73
+ const target = clamp(Math.min(current, BASE_CONCURRENCY));
74
+ _prevScanned = stats.scanned || 0;
75
+ _prevTimeouts = (stats.errorsByType && stats.errorsByType.static_timeout) || 0;
76
+ return { target, reason: `heap_elevated (level=${heapPressure}, capping at base=${BASE_CONCURRENCY})` };
77
+ }
78
+ } catch { /* daemon.js not loaded yet on first tick — proceed with system RAM check */ }
79
+
59
80
  // Use system RAM, not V8 heap ratio (see MEMORY_FREE_THRESHOLD comment above)
60
81
  const freeMem = os.freemem();
61
82
  const totalMem = os.totalmem();
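Note on the hunk above: the new "Priority 0" branch gates concurrency on the heap pressure level before the system-RAM check runs. The sketch below isolates that decision under stated assumptions: the numeric level values, the `NONE` level, `MAX_CONCURRENCY`, the concrete concurrency numbers, and the function name `gateByHeapPressure` are all hypothetical, since this diff does not show them.

```javascript
// Sketch only: every numeric value here is an assumption, not the package's constant.
const MEMORY_PRESSURE_LEVELS = { NONE: 0, ELEVATED: 1, HIGH: 2, CRITICAL: 3, EMERGENCY: 4 };
const MIN_CONCURRENCY = 2;
const BASE_CONCURRENCY = 8;
const MAX_CONCURRENCY = 16;

// Keep a target inside the allowed concurrency range.
function clamp(n) {
  return Math.max(MIN_CONCURRENCY, Math.min(MAX_CONCURRENCY, n));
}

// Returns a gated target, or null when heap pressure does not apply and the
// caller should fall through to the system-RAM check.
function gateByHeapPressure(current, heapPressure) {
  if (heapPressure >= MEMORY_PRESSURE_LEVELS.HIGH) {
    return { target: clamp(MIN_CONCURRENCY), reason: 'heap_pressure_high' };
  }
  if (heapPressure >= MEMORY_PRESSURE_LEVELS.ELEVATED) {
    return { target: clamp(Math.min(current, BASE_CONCURRENCY)), reason: 'heap_elevated' };
  }
  return null;
}

console.log(gateByHeapPressure(12, MEMORY_PRESSURE_LEVELS.HIGH));
console.log(gateByHeapPressure(12, MEMORY_PRESSURE_LEVELS.ELEVATED));
console.log(gateByHeapPressure(12, MEMORY_PRESSURE_LEVELS.NONE));
```

With these assumed values, HIGH drops the target to the minimum, ELEVATED caps it at the base, and anything lower leaves the decision to the existing RAM-based logic.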
@@ -57,7 +57,7 @@ const MEMORY_PRESSURE_LEVELS = {
57
57
  const MEMORY_THRESHOLD_ELEVATED = 0.75;
58
58
  const MEMORY_THRESHOLD_HIGH = 0.85;
59
59
  const MEMORY_THRESHOLD_CRITICAL = 0.90;
60
- const MEMORY_THRESHOLD_EMERGENCY = 0.95;
60
+ const MEMORY_THRESHOLD_EMERGENCY = 0.92;
61
61
  // When truncating queue under EMERGENCY, keep the N most recent items.
62
62
  // These are the newest packages — most likely to still be on npm for re-scan.
63
63
  const EMERGENCY_QUEUE_KEEP = 500;
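Note on the hunk above: lowering `MEMORY_THRESHOLD_EMERGENCY` from 0.95 to 0.92 narrows the CRITICAL band to the range 0.90 to 0.92. The sketch below shows how such thresholds could map a heap ratio to a level. The threshold values are taken from the diff; the `LEVELS` numbering and the function `classifyHeapRatio` are assumptions, and the ratio is assumed to be used heap divided by the heap limit (for example from Node's `v8.getHeapStatistics()`).

```javascript
// Thresholds as they appear in the diff (EMERGENCY lowered from 0.95 to 0.92).
const MEMORY_THRESHOLD_ELEVATED = 0.75;
const MEMORY_THRESHOLD_HIGH = 0.85;
const MEMORY_THRESHOLD_CRITICAL = 0.90;
const MEMORY_THRESHOLD_EMERGENCY = 0.92;

// Assumed numeric ordering; the real enum values are not visible in this diff.
const LEVELS = { NONE: 0, ELEVATED: 1, HIGH: 2, CRITICAL: 3, EMERGENCY: 4 };

// Hypothetical classifier: heapRatio is assumed to be usedHeap / heapLimit.
function classifyHeapRatio(heapRatio) {
  if (heapRatio >= MEMORY_THRESHOLD_EMERGENCY) return LEVELS.EMERGENCY;
  if (heapRatio >= MEMORY_THRESHOLD_CRITICAL) return LEVELS.CRITICAL;
  if (heapRatio >= MEMORY_THRESHOLD_HIGH) return LEVELS.HIGH;
  if (heapRatio >= MEMORY_THRESHOLD_ELEVATED) return LEVELS.ELEVATED;
  return LEVELS.NONE;
}

// A ratio of 0.93 was below the old 0.95 EMERGENCY threshold; under the new
// 0.92 threshold it is classified as EMERGENCY.
console.log(classifyHeapRatio(0.93));
console.log(classifyHeapRatio(0.91));
```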
@@ -743,7 +743,7 @@ async function startMonitor(options, stats, dailyAlerts, recentlyScanned, downlo
743
743
  const rssMB = (currentMem.rss / 1024 / 1024).toFixed(0);
744
744
  const pctUsed = (heapRatio * 100).toFixed(0);
745
745
  const levelName = Object.keys(MEMORY_PRESSURE_LEVELS).find(k => MEMORY_PRESSURE_LEVELS[k] === pressureLevel) || 'UNKNOWN';
746
- console.log(`[MONITOR] MEMORY: heap=${heapUsedMB}MB/${heapLimitMB}MB (${pctUsed}%), rss=${rssMB}MB, queue=${scanQueue.length}, dedup=${recentlyScanned.size}, downloads=${downloadsCache.size}, alerts=${alertedPackageRules.size}, pressure=${levelName}`);
746
+ console.log(`[MONITOR] MEMORY: heap=${heapUsedMB}MB/${heapLimitMB}MB (${pctUsed}%), rss=${rssMB}MB, queue=${scanQueue.length}, dedup=${recentlyScanned.size}, downloads=${downloadsCache.size}, alerts=${alertedPackageRules.size}, dailyAlerts=${dailyAlerts.length}, pressure=${levelName}`);
747
747
 
748
748
  // Graduated response at HIGH+
749
749
  if (pressureLevel >= MEMORY_PRESSURE_LEVELS.HIGH) {
@@ -765,12 +765,19 @@ async function startMonitor(options, stats, dailyAlerts, recentlyScanned, downlo
765
765
  await sendDailyReport(stats, dailyAlerts, recentlyScanned, downloadsCache);
766
766
  // Auto-relabel JSONL training data after daily report (once per day).
767
767
  // Checks registry takedown status for unconfirmed packages.
768
+ // Guard: relabel reads the entire JSONL into memory (21-100MB). Skip if
769
+ // heap is already under pressure — will fire tomorrow instead.
768
770
  try {
769
- const { relabelDataset } = require('./auto-labeler.js');
770
- const summary = await relabelDataset({});
771
- const totalRelabeled = summary.relabeled_malicious + summary.relabeled_benign + summary.relabeled_likely_benign;
772
- if (totalRelabeled > 0) {
773
- console.log(`[MONITOR] Auto-relabel: ${summary.relabeled_malicious} malicious, ${summary.relabeled_benign} benign, ${summary.relabeled_likely_benign} likely_benign (${summary.checked} checked)`);
771
+ const relabelPressure = computeMemoryPressure();
772
+ if (relabelPressure.level >= MEMORY_PRESSURE_LEVELS.HIGH) {
773
+ console.log(`[MONITOR] Auto-relabel SKIPPED: memory pressure at ${(relabelPressure.ratio * 100).toFixed(0)}% will retry tomorrow`);
774
+ } else {
775
+ const { relabelDataset } = require('./auto-labeler.js');
776
+ const summary = await relabelDataset({});
777
+ const totalRelabeled = summary.relabeled_malicious + summary.relabeled_benign + summary.relabeled_likely_benign;
778
+ if (totalRelabeled > 0) {
779
+ console.log(`[MONITOR] Auto-relabel: ${summary.relabeled_malicious} malicious, ${summary.relabeled_benign} benign, ${summary.relabeled_likely_benign} likely_benign (${summary.checked} checked)`);
780
+ }
774
781
  }
775
782
  } catch (err) {
776
783
  // Non-fatal: relabel failure must never crash the monitor
@@ -101,6 +101,23 @@ function enqueueDeferred(item) {
101
101
 
102
102
  _deferredQueue.push(item);
103
103
  _deferredSeen.add(key);
104
+ // Strip large fields to reduce in-memory footprint.
105
+ // Keep minimal staticResult for buildAlertData() if sandbox detects something.
106
+ // Disk persistence already strips staticResult (persistDeferredQueue), this
107
+ // does the same in-memory — each item drops from ~10-50KB to ~1-2KB.
108
+ if (item.staticResult) {
109
+ item.staticResult = {
110
+ threats: (item.staticResult.threats || []).map(t => ({
111
+ type: t.type, severity: t.severity, rule_id: t.rule_id, file: t.file
112
+ })),
113
+ summary: item.staticResult.summary ? {
114
+ total: item.staticResult.summary.total,
115
+ riskScore: item.staticResult.summary.riskScore,
116
+ maxSeverity: item.staticResult.summary.maxSeverity
117
+ } : {}
118
+ };
119
+ }
120
+ delete item.npmRegistryMeta;
104
121
  // Sort by riskScore DESC (highest first)
105
122
  _deferredQueue.sort((a, b) => b.riskScore - a.riskScore);
106
123
  console.log(`[DEFERRED] ENQUEUED: ${key} (tier=${item.tier === 2 ? 'T2' : 'T1b'}, score=${item.riskScore}, queue=${_deferredQueue.length})`);
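Note on the hunk above: the added lines replace `item.staticResult` with a reduced copy that keeps only four fields per threat and three summary fields. A standalone sketch of that stripping step follows. The function name `slimStaticResult` and the sample item (including the `snippet` and `breakdown` fields and the rule id) are made up for illustration.

```javascript
// Hypothetical standalone version of the field-stripping step.
function slimStaticResult(staticResult) {
  if (!staticResult) return staticResult;
  const summary = staticResult.summary;
  return {
    // Keep only the per-threat fields retained in the diff.
    threats: (staticResult.threats || []).map(({ type, severity, rule_id, file }) => ({
      type, severity, rule_id, file
    })),
    summary: summary
      ? { total: summary.total, riskScore: summary.riskScore, maxSeverity: summary.maxSeverity }
      : {}
  };
}

// Made-up sample item: `snippet` and `breakdown` stand in for large fields.
const item = {
  staticResult: {
    threats: [{
      type: 'dangerous_exec', severity: 'HIGH', rule_id: 'R-EXAMPLE',
      file: 'index.js', snippet: 'x'.repeat(5000)
    }],
    summary: { total: 1, riskScore: 42, maxSeverity: 'HIGH', breakdown: new Array(100).fill('detail') }
  }
};

const before = JSON.stringify(item.staticResult).length;
item.staticResult = slimStaticResult(item.staticResult);
const after = JSON.stringify(item.staticResult).length;
console.log({ before, after });
```

The serialized size is only a rough proxy for in-memory footprint, but it shows the direction of the saving the code comment describes.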
@@ -38,7 +38,8 @@ const {
38
38
  tarballCachePath,
39
39
  appendAlert,
40
40
  getParisHour,
41
- hasReportBeenSentToday
41
+ hasReportBeenSentToday,
42
+ MAX_DAILY_ALERTS
42
43
  } = require('./state.js');
43
44
 
44
45
  // From ./classify.js
@@ -899,7 +900,9 @@ async function scanPackage(name, version, ecosystem, tarballUrl, registryMeta, s
899
900
  }
900
901
 
901
902
  // Record daily alert with post-reputation score for top suspects ranking
902
- dailyAlerts.push({ name, version, ecosystem, findingsCount: result.summary.total, score: adjustedResult.summary.riskScore || 0, tier });
903
+ if (dailyAlerts.length < MAX_DAILY_ALERTS) {
904
+ dailyAlerts.push({ name, version, ecosystem, findingsCount: result.summary.total, score: adjustedResult.summary.riskScore || 0, tier });
905
+ }
903
906
  // LLM Detective: AI-powered analysis for T1a/T1b suspects
904
907
  // Skip for fast-track (large boring packages — LLM analysis adds 10-30s for no value)
905
908
  let llmResult = null;
@@ -20,6 +20,7 @@ const TEMPORAL_DETECTIONS_FILE = path.join(__dirname, '..', '..', 'data', 'tempo
20
20
  // --- Alerts/detections persistence limits ---
21
21
  const ALERTS_MAX_SIZE = 100 * 1024 * 1024; // 100MB rotation threshold (matches ml-training.jsonl)
22
22
  const MAX_DETECTIONS = 10_000; // Cap detections array — oldest entries discarded
23
+ const MAX_DAILY_ALERTS = 50_000; // Cap dailyAlerts array — prevents unbounded growth between daily resets
23
24
 
24
25
  // Local log persistence directories (parallel to Discord webhooks for offline analysis)
25
26
  // Primary: logs/ relative to project root. Fallback: /tmp/ if primary is read-only (EROFS/EACCES).
@@ -736,8 +737,9 @@ function loadDailyStats(stats, dailyAlerts) {
736
737
  stats.llmSuppressed = data.llmSuppressed || 0;
737
738
  stats.changesStreamPackages = data.changesStreamPackages || 0;
738
739
  if (Array.isArray(data.dailyAlerts)) {
740
+ const restored = data.dailyAlerts.slice(-MAX_DAILY_ALERTS);
739
741
  dailyAlerts.length = 0;
740
- dailyAlerts.push(...data.dailyAlerts);
742
+ dailyAlerts.push(...restored);
741
743
  }
742
744
  console.log(`[MONITOR] Restored daily stats: ${stats.scanned} scanned, ${stats.clean} clean, ${stats.suspect} suspect`);
743
745
  }
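Note on the `MAX_DAILY_ALERTS` hunks: the cap is enforced in two ways, a length check before each `push` and a `slice(-MAX_DAILY_ALERTS)` when restoring persisted state. The sketch below shows both guards side by side. The helper names `pushCapped` and `restoreCapped` are hypothetical, and a small cap of 3 is used so the effect is visible.

```javascript
const MAX_DAILY_ALERTS = 50_000; // value from the diff

// Hypothetical helper: append only while the array is below the cap.
function pushCapped(alerts, entry, max = MAX_DAILY_ALERTS) {
  if (alerts.length < max) {
    alerts.push(entry);
    return true;
  }
  return false; // entry dropped once the cap is reached
}

// Hypothetical helper: restore persisted entries, keeping only the last `max`.
function restoreCapped(alerts, persisted, max = MAX_DAILY_ALERTS) {
  const restored = persisted.slice(-max);
  alerts.length = 0;
  alerts.push(...restored);
}

const alerts = [];
for (let i = 0; i < 5; i++) pushCapped(alerts, { name: `pkg-${i}` }, 3);
console.log(alerts.map(a => a.name));

restoreCapped(alerts, [1, 2, 3, 4, 5], 3);
console.log(alerts);
```

One consequence worth noting: the live guard keeps the earliest entries of the day and drops later ones, while the restore path keeps the last entries of the persisted array.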
@@ -892,6 +894,7 @@ module.exports = {
892
894
  DAILY_STATS_PERSIST_INTERVAL,
893
895
  ALERTS_MAX_SIZE,
894
896
  MAX_DETECTIONS,
897
+ MAX_DAILY_ALERTS,
895
898
 
896
899
  // Mutable state getters/setters
897
900
  getScanMemoryCache,
@@ -11,7 +11,7 @@ const { detectSuddenLifecycleChange } = require('../temporal-analysis.js');
11
11
  const { detectSuddenAstChanges } = require('../temporal-ast-diff.js');
12
12
  const { detectPublishAnomaly } = require('../publish-anomaly.js');
13
13
  const { detectMaintainerChange } = require('../maintainer-change.js');
14
- const { appendAlert } = require('./state.js');
14
+ const { appendAlert, MAX_DAILY_ALERTS } = require('./state.js');
15
15
 
16
16
  // ---------------------------------------------------------------------------
17
17
  // Feature-flag helpers
@@ -190,13 +190,15 @@ async function runTemporalCheck(packageName, dailyAlerts) {
190
190
  }))
191
191
  });
192
192
 
193
- dailyAlerts.push({
194
- name: packageName,
195
- version: result.latestVersion,
196
- ecosystem: 'npm',
197
- findingsCount: result.findings.length,
198
- temporal: true
199
- });
193
+ if (dailyAlerts.length < MAX_DAILY_ALERTS) {
194
+ dailyAlerts.push({
195
+ name: packageName,
196
+ version: result.latestVersion,
197
+ ecosystem: 'npm',
198
+ findingsCount: result.findings.length,
199
+ temporal: true
200
+ });
201
+ }
200
202
 
201
203
  // Webhook deferred — sent after sandbox confirms (see resolveTarballAndScan)
202
204
  }
@@ -236,13 +238,15 @@ async function runTemporalAstCheck(packageName, dailyAlerts) {
236
238
  }))
237
239
  });
238
240
 
239
- dailyAlerts.push({
240
- name: packageName,
241
- version: result.latestVersion,
242
- ecosystem: 'npm',
243
- findingsCount: result.findings.length,
244
- temporalAst: true
245
- });
241
+ if (dailyAlerts.length < MAX_DAILY_ALERTS) {
242
+ dailyAlerts.push({
243
+ name: packageName,
244
+ version: result.latestVersion,
245
+ ecosystem: 'npm',
246
+ findingsCount: result.findings.length,
247
+ temporalAst: true
248
+ });
249
+ }
246
250
 
247
251
  // Webhook deferred — sent after sandbox confirms (see resolveTarballAndScan)
248
252
  }
@@ -282,13 +286,15 @@ async function runTemporalPublishCheck(packageName, dailyAlerts) {
282
286
  }))
283
287
  });
284
288
 
285
- dailyAlerts.push({
286
- name: packageName,
287
- version: 'N/A',
288
- ecosystem: 'npm',
289
- findingsCount: result.anomalies.length,
290
- temporalPublish: true
291
- });
289
+ if (dailyAlerts.length < MAX_DAILY_ALERTS) {
290
+ dailyAlerts.push({
291
+ name: packageName,
292
+ version: 'N/A',
293
+ ecosystem: 'npm',
294
+ findingsCount: result.anomalies.length,
295
+ temporalPublish: true
296
+ });
297
+ }
292
298
 
293
299
  // Webhook deferred — sent after sandbox confirms (see resolveTarballAndScan)
294
300
  }
@@ -329,13 +335,15 @@ async function runTemporalMaintainerCheck(packageName, dailyAlerts) {
329
335
  }))
330
336
  });
331
337
 
332
- dailyAlerts.push({
333
- name: packageName,
334
- version: 'N/A',
335
- ecosystem: 'npm',
336
- findingsCount: result.findings.length,
337
- temporalMaintainer: true
338
- });
338
+ if (dailyAlerts.length < MAX_DAILY_ALERTS) {
339
+ dailyAlerts.push({
340
+ name: packageName,
341
+ version: 'N/A',
342
+ ecosystem: 'npm',
343
+ findingsCount: result.findings.length,
344
+ temporalMaintainer: true
345
+ });
346
+ }
339
347
 
340
348
  // Webhook deferred — sent after sandbox confirms (see resolveTarballAndScan)
341
349
  }
@@ -3,7 +3,7 @@ const path = require('path');
3
3
  const { getRule } = require('../rules/index.js');
4
4
  const { getPlaybook } = require('../response/playbooks.js');
5
5
  const { computeReachableFiles } = require('../scanner/reachability.js');
6
- const { applyFPReductions, applyCompoundBoosts, calculateRiskScore, getSeverityWeights } = require('../scoring.js');
6
+ const { applyFPReductions, applyCompoundBoosts, calculateRiskScore, getSeverityWeights, applyContextualFPCaps } = require('../scoring.js');
7
7
  const { buildIntentPairs } = require('../intent-graph.js');
8
8
  const { debugLog } = require('../utils.js');
9
9
 
@@ -100,12 +100,21 @@ async function process(threats, targetPath, options, pythonDeps, warnings, scann
100
100
  // Read package name and dependencies for FP reduction heuristics
101
101
  let packageName = null;
102
102
  let packageDeps = null;
103
+ let _pkgMeta = null; // v2.10.97: full pkg metadata for contextual FP caps
103
104
  try {
104
105
  const pkgPath = path.join(targetPath, 'package.json');
105
106
  if (fs.existsSync(pkgPath)) {
106
107
  const pkgData = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
107
108
  packageName = pkgData.name || null;
108
109
  packageDeps = pkgData.dependencies || null;
110
+ _pkgMeta = {
111
+ name: pkgData.name,
112
+ scripts: pkgData.scripts || {},
113
+ description: pkgData.description || '',
114
+ homepage: pkgData.homepage || (typeof pkgData.repository === 'string' ? pkgData.repository : (pkgData.repository && pkgData.repository.url) || ''),
115
+ dependencies: pkgData.dependencies,
116
+ devDependencies: pkgData.devDependencies,
117
+ };
109
118
  }
110
119
  } catch { /* graceful fallback */ }
111
120
 
@@ -301,6 +310,15 @@ async function process(threats, targetPath, options, pythonDeps, warnings, scann
301
310
  scannerErrors: scannerErrors.length > 0 ? scannerErrors : undefined
302
311
  };
303
312
 
313
+ // v2.10.97: contextual FP post-filter — deterministic score caps for
314
+ // packages matching well-known FP clusters (100% precision, 302 human labels).
315
+ const fpCaps = applyContextualFPCaps(result, _pkgMeta);
316
+ if (fpCaps.length > 0) {
317
+ debugLog('[FP-CAP] ' + (packageName || targetPath) + ': ' +
318
+ fpCaps.map(c => c.feature + (c.cap > 0 ? '→MAX' + c.cap : '→suppress')).join(', ') +
319
+ ' → score=' + result.summary.riskScore);
320
+ }
321
+
304
322
  return {
305
323
  result,
306
324
  deduped,
package/src/scoring.js CHANGED
@@ -1011,8 +1011,110 @@ function calculateRiskScore(deduped, intentResult) {
1011
1011
  };
1012
1012
  }
1013
1013
 
1014
+ // ============================================
1015
+ // v2.10.97: CONTEXTUAL FP POST-FILTER
1016
+ // ============================================
1017
+ // Deterministic score caps for packages matching well-known FP clusters.
1018
+ // Each feature has 100% precision on 302 human-reviewed packages (zero
1019
+ // malware misclassified). Applied AFTER calculateRiskScore() so that
1020
+ // compound boosts and lifecycle floors have already had their say.
1021
+ const {
1022
+ bundleWithoutInstallScripts,
1023
+ installUrlGithubReleases,
1024
+ networkDestinationFirstParty,
1025
+ gitHookSourceLocal,
1026
+ typosquatScopedPackage,
1027
+ obfuscationWithoutVector,
1028
+ placeholderAntiDepConfusion,
1029
+ } = require('./ml/feature-extractor.js');
1030
+
1031
+ /**
1032
+ * Apply contextual FP score caps to a scan result.
1033
+ * Mutates result.summary.riskScore / riskLevel in-place.
1034
+ * Returns array of { feature, cap } describing applied caps (empty if none).
1035
+ */
1036
+ function applyContextualFPCaps(result, pkgMeta) {
1037
+ if (!result || !result.summary) return [];
1038
+
1039
+ const meta = {
1040
+ name: pkgMeta && pkgMeta.name,
1041
+ registryMeta: {
1042
+ scripts: (pkgMeta && pkgMeta.scripts) || {},
1043
+ description: (pkgMeta && pkgMeta.description) || '',
1044
+ homepage: (pkgMeta && pkgMeta.homepage) || '',
1045
+ dependencies: (pkgMeta && pkgMeta.dependencies),
1046
+ devDependencies: (pkgMeta && pkgMeta.devDependencies),
1047
+ },
1048
+ };
1049
+
1050
+ const applied = [];
1051
+
1052
+ // F7: placeholder anti-dep-confusion → MAX 20
1053
+ if (placeholderAntiDepConfusion(result, meta)) {
1054
+ applied.push({ feature: 'placeholder_anti_dep_confusion', cap: 20 });
1055
+ }
1056
+ // F1: minified bundle without install scripts → MAX 30
1057
+ if (bundleWithoutInstallScripts(result, meta)) {
1058
+ applied.push({ feature: 'bundle_without_install_scripts', cap: 30 });
1059
+ }
1060
+ // F3: credential destination first-party API → MAX 30
1061
+ if (networkDestinationFirstParty(result, meta)) {
1062
+ applied.push({ feature: 'network_destination_first_party', cap: 30 });
1063
+ }
1064
+ // F2: binary installer from GitHub Releases → MAX 35
1065
+ if (installUrlGithubReleases(result)) {
1066
+ applied.push({ feature: 'install_url_github_releases', cap: 35 });
1067
+ }
1068
+ // F4: git hooks from local source → MAX 35
1069
+ if (gitHookSourceLocal(result)) {
1070
+ applied.push({ feature: 'git_hook_source_local', cap: 35 });
1071
+ }
1072
+ // F6: commercial obfuscation without attack vector → MAX 35
1073
+ if (obfuscationWithoutVector(result)) {
1074
+ applied.push({ feature: 'obfuscation_without_vector', cap: 35 });
1075
+ }
1076
+ // F5: typosquat on scoped package → suppress typosquat points
1077
+ if (typosquatScopedPackage(result, meta)) {
1078
+ applied.push({ feature: 'typosquat_scoped_package', cap: -1 });
1079
+ }
1080
+
1081
+ if (applied.length === 0) return applied;
1082
+
1083
+ // Apply the tightest (lowest) cap
1084
+ const caps = applied.filter(a => a.cap > 0);
1085
+ const lowestCap = caps.length > 0 ? Math.min(...caps.map(a => a.cap)) : Infinity;
1086
+
1087
+ if (lowestCap < result.summary.riskScore) {
1088
+ result.summary.riskScore = lowestCap;
1089
+ result.summary.riskLevel =
1090
+ lowestCap >= _riskThresholds.CRITICAL ? 'CRITICAL'
1091
+ : lowestCap >= _riskThresholds.HIGH ? 'HIGH'
1092
+ : lowestCap >= _riskThresholds.MEDIUM ? 'MEDIUM'
1093
+ : lowestCap > 0 ? 'LOW' : 'SAFE';
1094
+ }
1095
+
1096
+ // F5: subtract typosquat points from score
1097
+ if (applied.find(a => a.feature === 'typosquat_scoped_package')) {
1098
+ const typoPoints = result.threats
1099
+ .filter(t => t.type === 'typosquat_detected' || t.type === 'lifecycle_typosquat')
1100
+ .reduce((s, t) => s + (t.points || 0), 0);
1101
+ if (typoPoints > 0) {
1102
+ result.summary.riskScore = Math.max(0, result.summary.riskScore - typoPoints);
1103
+ const rs = result.summary.riskScore;
1104
+ result.summary.riskLevel =
1105
+ rs >= _riskThresholds.CRITICAL ? 'CRITICAL'
1106
+ : rs >= _riskThresholds.HIGH ? 'HIGH'
1107
+ : rs >= _riskThresholds.MEDIUM ? 'MEDIUM'
1108
+ : rs > 0 ? 'LOW' : 'SAFE';
1109
+ }
1110
+ }
1111
+
1112
+ return applied;
1113
+ }
1114
+
1014
1115
  module.exports = {
1015
1116
  SEVERITY_WEIGHTS, RISK_THRESHOLDS, MAX_RISK_SCORE, CONFIDENCE_FACTORS,
1016
1117
  isPackageLevelThreat, computeGroupScore, applyFPReductions, applyCompoundBoosts, calculateRiskScore,
1017
- applyConfigOverrides, resetConfigOverrides, getSeverityWeights, getRiskThresholds
1118
+ applyConfigOverrides, resetConfigOverrides, getSeverityWeights, getRiskThresholds,
1119
+ applyContextualFPCaps
1018
1120
  };
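Note on `applyContextualFPCaps`: the function collects every matching feature, applies the tightest positive cap, then subtracts typosquat points when the scoped-package feature matched. The sketch below is a simplified model of those two steps, not the package's code. The threshold values are hypothetical (the real `_riskThresholds` are not shown in this diff), the suppress marker is detected via `cap === -1` rather than by feature name, and the level is always recomputed.

```javascript
// Hypothetical thresholds; the package's actual values are not visible here.
const THRESHOLDS = { CRITICAL: 75, HIGH: 50, MEDIUM: 25 };

// Same ternary ladder as the diff, parameterized on the thresholds.
function levelFor(score, t = THRESHOLDS) {
  return score >= t.CRITICAL ? 'CRITICAL'
    : score >= t.HIGH ? 'HIGH'
    : score >= t.MEDIUM ? 'MEDIUM'
    : score > 0 ? 'LOW' : 'SAFE';
}

// Simplified model: tightest positive cap first, then typosquat points are
// subtracted when the suppress marker (cap: -1) is present.
function applyCaps(score, applied, typoPoints = 0) {
  const caps = applied.filter(a => a.cap > 0).map(a => a.cap);
  const lowestCap = caps.length > 0 ? Math.min(...caps) : Infinity;
  let capped = Math.min(score, lowestCap);
  if (applied.some(a => a.cap === -1) && typoPoints > 0) {
    capped = Math.max(0, capped - typoPoints);
  }
  return { riskScore: capped, riskLevel: levelFor(capped) };
}

// Two caps match: the lower one (20) wins.
console.log(applyCaps(80, [
  { feature: 'bundle_without_install_scripts', cap: 30 },
  { feature: 'placeholder_anti_dep_confusion', cap: 20 },
]));

// Cap to 35, then subtract 10 typosquat points.
console.log(applyCaps(60, [
  { feature: 'obfuscation_without_vector', cap: 35 },
  { feature: 'typosquat_scoped_package', cap: -1 },
], 10));
```

A score already below every matching cap is left unchanged, which mirrors the `lowestCap < result.summary.riskScore` check in the diff.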