muaddib-scanner 2.11.48 → 2.11.52
- package/README.md +54 -27
- package/package.json +1 -1
- package/{self-scan-v2.11.48.json → self-scan-v2.11.52.json} +1 -1
- package/src/monitor/ingestion.js +245 -35
- package/src/monitor/queue.js +157 -65
- package/src/monitor/webhook.js +68 -0
- package/src/pipeline/executor.js +59 -20
- package/src/response/playbooks.js +9 -0
- package/src/rules/index.js +40 -0
- package/src/scanner/ast-detectors/handle-call-expression.js +26 -1
- package/src/scanner/ast-detectors/handle-literal.js +37 -0
- package/src/scoring.js +14 -0
package/README.md
CHANGED
|
@@ -30,7 +30,7 @@
|
|
|
30
30
|
|
|
31
31
|
npm and PyPI supply-chain attacks are exploding. Shai-Hulud compromised 25K+ repos in 2025. Existing tools detect threats but don't help you respond.
|
|
32
32
|
|
|
33
|
-
MUAD'DIB combines **
|
|
33
|
+
MUAD'DIB combines **20 parallel scanners** (262 detection rules), a **deobfuscation engine**, **inter-module dataflow analysis**, **compound scoring** (17 compound rules), and a gVisor/Docker sandbox to detect known threats and suspicious behavioral patterns in npm and PyPI packages. An XGBoost classifier exists in the codebase but is **currently inactive** (see [Evaluation Metrics](#evaluation-metrics) → ML Classifier section).
|
|
34
34
|
|
|
35
35
|
---
|
|
36
36
|
|
|
@@ -169,14 +169,14 @@ muaddib scrape # Full IOC refresh (~5min)
|
|
|
169
169
|
muaddib diff HEAD~1 # Compare threats with previous commit
|
|
170
170
|
muaddib init-hooks # Pre-commit hooks (husky/pre-commit/git)
|
|
171
171
|
muaddib scan . --breakdown # Explainable score decomposition
|
|
172
|
-
muaddib replay # Ground truth validation (
|
|
172
|
+
muaddib replay # Ground truth validation (90/94 TPR@3, v2.11.48)
|
|
173
173
|
```
|
|
174
174
|
|
|
175
175
|
---
|
|
176
176
|
|
|
177
177
|
## Features
|
|
178
178
|
|
|
179
|
-
###
|
|
179
|
+
### 20 parallel scanners
|
|
180
180
|
|
|
181
181
|
| Scanner | Detection |
|
|
182
182
|
|---------|-----------|
|
|
@@ -198,10 +198,13 @@ muaddib replay # Ground truth validation (61/65 TPR@3)
|
|
|
198
198
|
| Anti-Forensic AST (intel-triage P1.2) | XOR loop + self-delete + decoy write compound (csec autodelete) |
|
|
199
199
|
| Stub Package (intel-triage P1.3) | Tiny main file + external dep URL + lifecycle hook (ltidi chain) |
|
|
200
200
|
| Monorepo Scanner | Lerna/pnpm-workspace/turbo detection (Sprint 1 audit MR-C2 fix) |
|
|
201
|
+
| Trusted-Dep-Diff (opt-in) | Diff against trusted dep tarballs from registry (v2.10.x) |
|
|
202
|
+
| Python Source (PYSRC) | Import-time / install-time RCE patterns in `__init__.py` / `setup.py` (v2.11.41 — closes TrapDoor PyPI gap) |
|
|
203
|
+
| Python AST (PYAST) | Tree-sitter-Python AST with taint-aware detectors (v2.11.42+) |
|
|
201
204
|
|
|
202
|
-
###
|
|
205
|
+
### 259 detection rules
|
|
203
206
|
|
|
204
|
-
All rules (
|
|
207
|
+
All rules (254 RULES + 5 PARANOID) are mapped to MITRE ATT&CK techniques. See [SECURITY.md](SECURITY.md#detection-rules-v21147) for the complete rules reference.
|
|
205
208
|
|
|
206
209
|
### Detected campaigns
|
|
207
210
|
|
|
@@ -275,7 +278,7 @@ With pre-commit framework:
|
|
|
275
278
|
```yaml
|
|
276
279
|
repos:
|
|
277
280
|
- repo: https://github.com/DNSZLSK/muad-dib
|
|
278
|
-
rev: v2.11.
|
|
281
|
+
rev: v2.11.48
|
|
279
282
|
hooks:
|
|
280
283
|
- id: muaddib-scan
|
|
281
284
|
```
|
|
@@ -284,33 +287,57 @@ repos:
|
|
|
284
287
|
|
|
285
288
|
## Evaluation Metrics
|
|
286
289
|
|
|
290
|
+
Latest measurement: **v2.11.48** (2026-05-26, Track D + PyPI download fix). Ground truth holds 96 samples (94 in-scope, 2 out-of-scope protestware). This run measures the full 94 in-scope set after the 2026-05-25 enrichment (Track C synthetic for the new PYSRC/PYAST/AST-092/AICONF-004/PKG-022 rules, Track A real-world tarballs recovered from VPS archive, Track B reconstructions from the in-house security-review benchmark).
|
|
291
|
+
|
|
292
|
+
### Operational metrics (what an operator actually gets)
|
|
293
|
+
|
|
294
|
+
These are the numbers a user gets when running `muaddib scan` against npm or PyPI packages. The pipeline executes scanners + FP caps only — no ML filter is applied (see ML Classifier note below).
|
|
295
|
+
|
|
287
296
|
| Metric | Result | Details |
|
|
288
297
|
|--------|--------|---------|
|
|
298
|
+
| **Wild TPR** (Datadog 17K) | **92.8%** (13,538/14,587 in-scope) | 17,922 packages. 3,335 skipped (no JS). By category: compromised_lib 97.8%, malicious_intent 92.1% — last measurement v2.9.4, independent of GT. |
|
|
299
|
+
| **TPR@3** (detection rate, v2.11.48) | **95.74%** (90/94 in-scope) | Full GT re-measurement. Threshold=3: any signal. 13 PyPI samples (was 0). 4 misses incl. 3 browser-only (lottie-player, polyfill-io, trojanized-jquery). |
|
|
300
|
+
| **TPR@20** (alert rate, v2.11.48) | **88.30%** (83/94 in-scope) | Operational alert threshold=20. **+3.1pp vs v2.11.47** — Track D `recon_exfil_direct_ip` compound (MUADDIB-COMPOUND-016) closed the GT-095 gap (risk 3→50) and boosted GT-091 byvendors / GT-092 heloo131313 through `linux_fingerprint_exec`. |
|
|
301
|
+
| **FPR rules** (Benign curated, v2.11.48 measure) | **1.10%** (6/545 scanned, 548 total) | **Unchanged after Track D** — the new compound + types created zero new FPs (sameFile gate + public-IP-only filter). Drop from 15.6% (v2.10.95) is attributable to FP caps F1-F14 (v2.10.97 → v2.11.31). 6 remaining FPs are real (meteor, prisma, @prisma/client, drizzle-orm, scrypt, liquid). |
|
|
302
|
+
| **FPR** (Benign random, v2.11.48) | **2.50%** (5/200) | 200 random npm packages, unchanged. |
|
|
303
|
+
| **FPR PyPI** (v2.11.48, first honest measurement) | **9.68%** (12/124 scanned, 132 total) | **Track D fixed the PyPI downloader** — removed `pip --no-binary :all:` flag (forced compile of wheel-only packages, timed out 38% of the time) + added `.whl` extraction via `extractArchive()`. Brought 42 previously-skipped giants (numpy/pandas/django/matplotlib/scikit-learn/...) into scope. All 12 FPs cluster at score 25-35: this is the cap-PyPI-35 artifact, not new rule misfires. Lifting the cap (Track E) would drop FPR PyPI to ≈0%. 8 residual fails are >500MB packages (torch, tensorflow, scipy, opencv-python, ansible…) hitting the 30s `PACK_TIMEOUT_MS`. |
|
|
304
|
+
| **ADR** (Adversarial + Holdout, v2.11.48) | **96.26%** (103/107) | 67 adversarial + 40 holdout, global threshold=20. Stable vs v2.10.95. |
|
|
305
|
+
|
|
306
|
+
**3913 tests** across 109 files. **262 rules** (257 RULES + 5 PARANOID — Track D added 3: AST-093, AST-094, COMPOUND-016).
|
|
307
|
+
|
|
308
|
+
**Known issues (v2.11.48):**
|
|
309
|
+
- *PyPI cap at 35/100*: Python samples are capped at `riskScore=35` even when `globalRiskScore=100`. Confirmed empirically: all 12 PyPI FPs score 25-35 (flask 32, django 35, tornado 35, bottle 30, pandas 25, matplotlib 25, plotly 25, bokeh 25, pymongo 35, coverage 32, fabric 35, websockets 35). Lifting the cap will simultaneously drop FPR PyPI to ≈0% and unblock PyPI MALWARE detection at higher thresholds. Track E target.
|
|
310
|
+
|
|
311
|
+
### ML Classifier (offline only)
|
|
312
|
+
|
|
313
|
+
`src/ml/classifier.js` is **not wired into `muaddib scan`**. The XGBoost model is currently exercised only by `muaddib evaluate` (offline metric replay) and `muaddib monitor` (LOG-ONLY since 2026-04-08, model collapsed pending retrain — see `src/monitor/queue.js:628`). The v2.11.48 evaluate-time replay shows the same 1.10% FPR (no additional FPs filtered) — kept as a reference for retrain validation, but the published operational FPR is the rules-only number above.
|
|
314
|
+
|
|
315
|
+
> **Static evaluation caveats:**
|
|
316
|
+
> - TPR measured on the full 94 in-scope samples from the 96-sample ground truth (2 out-of-scope protestware GT-005/GT-009 with `min_threats=0`)
|
|
317
|
+
> - TPR@3 = detection rate (any signal); TPR@20 = operational alert threshold
|
|
318
|
+
> - FPR rules measured on 548 curated popular npm packages (not a random sample)
|
|
319
|
+
> - FPR PyPI: 124/132 scanned (8 download fails on >500MB giants — torch/tensorflow/ansible/…). Smaller N than npm.
|
|
320
|
+
> - ADR measured with global threshold (score >= 20) as of v2.6.5
|
|
321
|
+
|
|
322
|
+
See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) for the full experimental protocol, holdout history, and Datadog benchmark details.
|
|
323
|
+
|
|
324
|
+
### ML Classifier — R&D, currently inactive
|
|
325
|
+
|
|
326
|
+
> **Status (2026-04-08 → present):** The XGBoost classifier (`src/ml/classifier.js`) is **not wired into `muaddib scan`** at all, and in `muaddib monitor` it runs in **LOG-ONLY mode** since 2026-04-08 — the trained model collapsed (predicts p≈0.002 for every input, including clearly malicious lifecycle+exec+staged_payload patterns) and was disabled pending retrain on balanced JSONL data. The metrics below come from offline `muaddib evaluate` replay against a frozen bench. They describe what the model *would* contribute if it worked, **not** what an operator gets today.
|
|
327
|
+
|
|
328
|
+
| Metric (offline `evaluate` replay) | Result | Details |
|
|
329
|
+
|--------|--------|---------|
|
|
289
330
|
| **ML FPR** | **2.85%** (239/8,393 holdout) | XGBoost retrained on 56,564 samples, 64 features, threshold=0.710 |
|
|
290
331
|
| **ML TPR** | **99.93%** (2,918/2,920 holdout) | 377 confirmed_malicious via OSSF/GHSA/npm correlation |
|
|
291
|
-
| **
|
|
292
|
-
| **TPR@3** (detection rate) | **93.85%** (61/65) | 67 real attacks (65 active, 2 out-of-scope: GT-005 colors, GT-009 faker — protestware with min_threats=0). Threshold=3: any signal |
|
|
293
|
-
| **TPR@20** (alert rate) | **86.2%** (56/65) | Operational alert threshold=20, aligned with ADR/FPR |
|
|
294
|
-
| **FPR rules** (Benign curated, v2.10.95 measure) | **15.6%** (85/545 scanned, 548 total) | npm packages, real source via `npm pack`; v2.10.74 estimated 6-9% reduction did NOT materialize on rebuilt corpus |
|
|
295
|
-
| **FPR after ML** (v2.10.95 measure) | **10.28%** (56/545 scanned) | ML filters 29/30 T1 benign, 0 GT/ADR suppressed |
|
|
296
|
-
| **FPR** (Benign random, v2.10.95 measure) | **7.0%** (14/200) | 200 random npm packages, stratified sampling |
|
|
297
|
-
| **ADR** (Adversarial + Holdout) | **96.3%** (103/107) | 67 adversarial + 40 holdout (107 available on disk), global threshold=20 |
|
|
298
|
-
|
|
299
|
-
**3664 tests** across 93 files. **234 rules** (229 RULES + 5 PARANOID).
|
|
332
|
+
| **FPR after ML T1** (offline replay, v2.11.48) | **1.10%** (6/545 scanned) | Classifier filters 0/6 raw FPs in this run (filtered 1 at v2.11.47). Not applied during real scans — `muaddib scan` never invokes the classifier. |
|
|
300
333
|
|
|
301
|
-
> **
|
|
334
|
+
> **Retrain methodology (v2.10.51):**
|
|
302
335
|
> - Ground truth: 377 confirmed_malicious via auto-labeler (OSSF malicious-packages, GitHub Advisory Database, npm registry takedown correlation)
|
|
303
336
|
> - Dataset: 56,564 samples (14,602 malicious, 41,962 clean). Stratified 80/20 split
|
|
304
337
|
> - Grid search: depth=4, estimators=300, lr=0.05. AUC-ROC=0.999, F1=0.960
|
|
305
338
|
> - Leaky feature filter: 23 dead/leaky features removed (source-identity proxies)
|
|
306
339
|
>
|
|
307
|
-
>
|
|
308
|
-
> - TPR measured on 65 active Node.js attack samples (2 out-of-scope: GT-005 colors, GT-009 faker, both protestware with min_threats=0; from 67 total)
|
|
309
|
-
> - TPR@3 = detection rate (any signal); TPR@20 = operational alert threshold
|
|
310
|
-
> - FPR measured on 532 curated popular npm packages (not a random sample)
|
|
311
|
-
> - ADR measured with global threshold (score >= 20) as of v2.6.5
|
|
312
|
-
|
|
313
|
-
See [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) for the full experimental protocol, holdout history, and Datadog benchmark details.
|
|
340
|
+
> The shadow model continues to log predictions in `muaddib monitor` for retraining validation. When the next model passes shadow validation, the LOG-ONLY guard in `src/monitor/queue.js:660` will be flipped and the metrics above will move back into the operational table.
|
|
314
341
|
|
|
315
342
|
---
|
|
316
343
|
|
|
@@ -344,11 +371,11 @@ npm test
|
|
|
344
371
|
|
|
345
372
|
### Testing
|
|
346
373
|
|
|
347
|
-
- **
|
|
374
|
+
- **3913 tests** across 109 modular test files
|
|
348
375
|
- **56 fuzz tests** - Malformed inputs, ReDoS, unicode, binary
|
|
349
376
|
- **Datadog 17K benchmark** - 14,587 confirmed malware samples (in-scope)
|
|
350
|
-
- **Ground truth validation** -
|
|
351
|
-
- **False positive validation** (v2.
|
|
377
|
+
- **Ground truth validation** - 96 real-world attacks (95.74% TPR@3, 88.30% TPR@20 — v2.11.48 full measure on 94 in-scope)
|
|
378
|
+
- **False positive validation** (v2.11.48 measure) - 1.10% FPR rules (6/545 scanned), 2.50% on 200 random, 9.68% on 124/132 PyPI (first honest measurement post-Track-D download fix). ML classifier currently inactive — see Evaluation Metrics → ML Classifier.
|
|
352
379
|
|
|
353
380
|
---
|
|
354
381
|
|
|
@@ -365,7 +392,7 @@ npm test
|
|
|
365
392
|
- [Documentation Index](docs/INDEX.md) - All documentation in one place
|
|
366
393
|
- [Evaluation Methodology](docs/EVALUATION_METHODOLOGY.md) - Experimental protocol, holdout scores
|
|
367
394
|
- [Threat Model](docs/threat-model.md) - What MUAD'DIB detects and doesn't detect
|
|
368
|
-
- [Security Policy](SECURITY.md) - Detection rules reference (
|
|
395
|
+
- [Security Policy](SECURITY.md) - Detection rules reference (259 rules)
|
|
369
396
|
- [Security Audit](docs/SECURITY_AUDIT.md) - Bypass validation report
|
|
370
397
|
- [FP Analysis](docs/EVALUATION.md) - Historical false positive analysis
|
|
371
398
|
|
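The percentages in the README's operational metrics table follow from the raw counts quoted next to them. The sketch below recomputes them as a consistency check; the `rate` helper and the `metrics` object are illustrative names and are not part of muaddib-scanner.

```javascript
// Hypothetical helper, not part of muaddib-scanner: turns "hits out of total"
// into a percentage string with two decimals, as the README table prints it.
function rate(hits, total) {
  return ((hits / total) * 100).toFixed(2);
}

// Counts copied from the v2.11.48 rows of the README evaluation table.
const metrics = {
  'TPR@3': rate(90, 94),
  'TPR@20': rate(83, 94),
  'FPR rules': rate(6, 545),
  'FPR random': rate(5, 200),
  'FPR PyPI': rate(12, 124),
  'ADR': rate(103, 107),
};

for (const [name, pct] of Object.entries(metrics)) {
  console.log(`${name}: ${pct}%`);
}
```

Each computed value matches the figure published in the table, so the counts and percentages in the new README text are internally consistent.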
package/package.json
CHANGED
package/src/monitor/ingestion.js
CHANGED
|
@@ -141,7 +141,10 @@ async function getWeeklyDownloads(packageName) {
|
|
|
141
141
|
}
|
|
142
142
|
try {
|
|
143
143
|
const url = `https://api.npmjs.org/downloads/point/last-week/${encodeURIComponent(packageName)}`;
|
|
144
|
-
|
|
144
|
+
// Routed via _deps so tests can stub the downloads endpoint independently
|
|
145
|
+
// of the registry endpoint (Stage 2.1 added parallel-fetch from
|
|
146
|
+
// preResolveNpmBatch).
|
|
147
|
+
const body = await _deps.httpsGet(url, 3000);
|
|
145
148
|
const data = JSON.parse(body);
|
|
146
149
|
const downloads = typeof data.downloads === 'number' ? data.downloads : -1;
|
|
147
150
|
downloadsCache.set(packageName, { downloads, fetchedAt: Date.now() });
|
|
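The hunk above routes the downloads request through `_deps.httpsGet` so that tests can replace the transport. A minimal sketch of that pattern follows. It assumes a simplified `_deps` object and a trimmed `getWeeklyDownloads` without the cache lookup; the `-1` failure value mirrors the comment later in this diff, and the `demo` function is hypothetical.

```javascript
// Simplified stand-in for the module-level _deps object; only httpsGet is modelled.
const _deps = {
  httpsGet: async () => { throw new Error('network disabled in this sketch'); },
};

// Trimmed re-statement of getWeeklyDownloads: no cache, -1 on any failure.
async function getWeeklyDownloads(packageName) {
  try {
    const url = `https://api.npmjs.org/downloads/point/last-week/${encodeURIComponent(packageName)}`;
    const body = await _deps.httpsGet(url, 3000);
    const data = JSON.parse(body);
    return typeof data.downloads === 'number' ? data.downloads : -1;
  } catch {
    return -1;
  }
}

// Hypothetical test-style usage: swap the transport, then observe the result.
async function demo() {
  const seen = [];
  _deps.httpsGet = async (url, timeoutMs) => {
    seen.push({ url, timeoutMs });
    return JSON.stringify({ downloads: 1234 });
  };
  const ok = await getWeeklyDownloads('@scope/pkg');

  _deps.httpsGet = async () => 'not json'; // malformed body falls into the catch
  const bad = await getWeeklyDownloads('left-pad');
  return { ok, bad, seen };
}
```

Because the function reads `_deps.httpsGet` at call time rather than capturing a local `httpsGet` binding, a stub assigned after module load still takes effect, which is presumably why the direct `httpsGet(url)` call in `getNpmLatestTarball` is switched to `_deps.httpsGet(url)` further down.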
@@ -158,12 +161,15 @@ function getNpmTarballUrl(pkgData) {
|
|
|
158
161
|
}
|
|
159
162
|
|
|
160
163
|
async function getPyPITarballUrl(packageName, packageVersion = '') {
|
|
161
|
-
//
|
|
162
|
-
//
|
|
163
|
-
//
|
|
164
|
-
|
|
165
|
-
|
|
166
|
-
|
|
164
|
+
// Always hit the package-level endpoint. It contains:
|
|
165
|
+
// - info.version → latest version
|
|
166
|
+
// - urls → files for the latest version
|
|
167
|
+
// - releases → files for ALL versions (so we can find packageVersion's
|
|
168
|
+
// exact artifact, same anti-race guarantee as the per-
|
|
169
|
+
// version endpoint used to provide)
|
|
170
|
+
// We extract triage metadata (age_days, version_count) from `releases` in
|
|
171
|
+
// the same round-trip — keeps Stage 2's PyPI cost at 1 HTTP call.
|
|
172
|
+
const url = `https://pypi.org/pypi/${encodeURIComponent(packageName)}/json`;
|
|
167
173
|
const body = await _deps.httpsGet(url);
|
|
168
174
|
let data;
|
|
169
175
|
try {
|
|
@@ -171,20 +177,58 @@ async function getPyPITarballUrl(packageName, packageVersion = '') {
|
|
|
171
177
|
} catch (e) {
|
|
172
178
|
throw new Error(`Invalid JSON from PyPI for ${packageName}: ${e.message}`);
|
|
173
179
|
}
|
|
174
|
-
|
|
175
|
-
const
|
|
176
|
-
|
|
177
|
-
const
|
|
178
|
-
|
|
179
|
-
//
|
|
180
|
-
|
|
181
|
-
|
|
182
|
-
|
|
183
|
-
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
|
|
180
|
+
|
|
181
|
+
const latestVersion = (data.info && data.info.version) || '';
|
|
182
|
+
const version = packageVersion || latestVersion;
|
|
183
|
+
const releases = (data && data.releases) || {};
|
|
184
|
+
|
|
185
|
+
// Pick files for the requested version (preserves the original anti-race
|
|
186
|
+
// guarantee — we scan the exact version flagged by the changelog). If
|
|
187
|
+
// absent (e.g. lazy resolution without a known version), use latest urls.
|
|
188
|
+
const files = (packageVersion && Array.isArray(releases[packageVersion]))
|
|
189
|
+
? releases[packageVersion]
|
|
190
|
+
: (Array.isArray(data.urls) ? data.urls : []);
|
|
191
|
+
|
|
192
|
+
// Tarball selection priority unchanged: sdist > .tar.gz > .whl/.zip.
|
|
193
|
+
// Legacy .egg / .tar.bz2 / .exe intentionally not returned (they were the
|
|
194
|
+
// cause of ~2773 tar_failed/day before the original fix).
|
|
195
|
+
let tarballUrl = null;
|
|
196
|
+
const sdist = files.find(u => u && u.packagetype === 'sdist' && u.url);
|
|
197
|
+
if (sdist) {
|
|
198
|
+
tarballUrl = sdist.url;
|
|
199
|
+
} else {
|
|
200
|
+
const tarGz = files.find(u => u && u.url && u.url.endsWith('.tar.gz'));
|
|
201
|
+
if (tarGz) {
|
|
202
|
+
tarballUrl = tarGz.url;
|
|
203
|
+
} else {
|
|
204
|
+
const wheel = files.find(u => u && u.url && (u.url.endsWith('.whl') || u.url.endsWith('.zip')));
|
|
205
|
+
if (wheel) tarballUrl = wheel.url;
|
|
206
|
+
}
|
|
207
|
+
}
|
|
208
|
+
|
|
209
|
+
// Stage 2 triage metadata: derived from `releases` once per fetch.
|
|
210
|
+
const versionCount = Object.keys(releases).length;
|
|
211
|
+
let earliestUpload = Number.MAX_SAFE_INTEGER;
|
|
212
|
+
for (const v of Object.keys(releases)) {
|
|
213
|
+
const versionFiles = releases[v];
|
|
214
|
+
if (!Array.isArray(versionFiles)) continue;
|
|
215
|
+
for (const f of versionFiles) {
|
|
216
|
+
if (f && f.upload_time) {
|
|
217
|
+
const ts = Date.parse(f.upload_time);
|
|
218
|
+
if (Number.isFinite(ts) && ts < earliestUpload) earliestUpload = ts;
|
|
219
|
+
}
|
|
220
|
+
}
|
|
221
|
+
}
|
|
222
|
+
const ageDays = earliestUpload !== Number.MAX_SAFE_INTEGER
|
|
223
|
+
? Math.floor((Date.now() - earliestUpload) / 86_400_000)
|
|
224
|
+
: null;
|
|
225
|
+
|
|
226
|
+
return {
|
|
227
|
+
url: tarballUrl,
|
|
228
|
+
version,
|
|
229
|
+
age_days: ageDays,
|
|
230
|
+
version_count: versionCount,
|
|
231
|
+
};
|
|
188
232
|
}
|
|
189
233
|
|
|
190
234
|
// --- RSS parsing ---
|
|
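The rewritten `getPyPITarballUrl` combines two steps: picking an artifact by priority (sdist, then `.tar.gz`, then `.whl`/`.zip`) and deriving `age_days` from the earliest `upload_time` across all releases. The sketch below isolates both steps as standalone functions. The names `pickArtifactUrl` and `earliestUploadAgeDays`, the injectable `now` parameter, and the sample `releases` object are assumptions made for illustration.

```javascript
// Hypothetical standalone version of the artifact priority: sdist > .tar.gz > .whl/.zip.
// Legacy formats (.egg, .tar.bz2, .exe) are never returned.
function pickArtifactUrl(files) {
  const sdist = files.find(f => f && f.packagetype === 'sdist' && f.url);
  if (sdist) return sdist.url;
  const tarGz = files.find(f => f && f.url && f.url.endsWith('.tar.gz'));
  if (tarGz) return tarGz.url;
  const wheel = files.find(f => f && f.url && (f.url.endsWith('.whl') || f.url.endsWith('.zip')));
  return wheel ? wheel.url : null;
}

// Age in whole days since the earliest upload in any release, or null if unknown.
// `now` is injectable so the calculation can be checked deterministically.
function earliestUploadAgeDays(releases, now = Date.now()) {
  let earliest = Number.MAX_SAFE_INTEGER;
  for (const files of Object.values(releases)) {
    if (!Array.isArray(files)) continue;
    for (const f of files) {
      const ts = f && f.upload_time ? Date.parse(f.upload_time) : NaN;
      if (Number.isFinite(ts) && ts < earliest) earliest = ts;
    }
  }
  return earliest === Number.MAX_SAFE_INTEGER
    ? null
    : Math.floor((now - earliest) / 86_400_000);
}

// Made-up sample data shaped like the `releases` map of a PyPI JSON response.
const releases = {
  '0.9.0': [{ packagetype: 'bdist_egg', url: 'https://example.invalid/demo-0.9.0.egg', upload_time: '2023-12-22T00:00:00Z' }],
  '1.0.0': [{ packagetype: 'bdist_wheel', url: 'https://example.invalid/demo-1.0.0-py3-none-any.whl', upload_time: '2024-01-01T00:00:00Z' }],
  '1.1.0': [
    { packagetype: 'bdist_wheel', url: 'https://example.invalid/demo-1.1.0-py3-none-any.whl', upload_time: '2024-01-11T00:00:00Z' },
    { packagetype: 'sdist', url: 'https://example.invalid/demo-1.1.0.tar.gz', upload_time: '2024-01-11T00:00:00Z' },
  ],
};
```

The sample timestamps carry an explicit `Z` so the day count does not depend on the local timezone of the machine running the sketch. Note that the egg-only release yields `null`, which in the real function leaves the item on the lazy-resolution path.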
@@ -372,7 +416,7 @@ async function getNpmLatestTarball(packageName) {
|
|
|
372
416
|
await acquireRegistrySlot();
|
|
373
417
|
let body;
|
|
374
418
|
try {
|
|
375
|
-
body = await httpsGet(url);
|
|
419
|
+
body = await _deps.httpsGet(url);
|
|
376
420
|
} finally {
|
|
377
421
|
releaseRegistrySlot();
|
|
378
422
|
}
|
|
@@ -388,11 +432,153 @@ async function getNpmLatestTarball(packageName) {
|
|
|
388
432
|
version: '', tarball: null, unpackedSize: 0, scripts: {},
|
|
389
433
|
homepage: '', description: '',
|
|
390
434
|
latestTagVersion: null, recentVersions: [],
|
|
435
|
+
age_days: null, version_count: 0,
|
|
391
436
|
};
|
|
392
437
|
}
|
|
438
|
+
// Stage 2.1 — extract reputation signals from the packument we already have,
|
|
439
|
+
// so triageRisk in queue.js doesn't have to refetch metadata via
|
|
440
|
+
// getPackageMetadata. Two fields are derivable from the packument alone:
|
|
441
|
+
// - age_days : time.created (package creation timestamp)
|
|
442
|
+
// - version_count : Object.keys(versions).length (excludes unpublished
|
|
443
|
+
// tombstones kept only in `time`)
|
|
444
|
+
// weekly_downloads requires a separate api.npmjs.org call and is fetched in
|
|
445
|
+
// parallel by preResolveNpmBatch (it has its own cache + no semaphore).
|
|
446
|
+
const createdAt = (packument && packument.time && packument.time.created) || null;
|
|
447
|
+
result.age_days = createdAt
|
|
448
|
+
? Math.floor((Date.now() - new Date(createdAt).getTime()) / 86_400_000)
|
|
449
|
+
: null;
|
|
450
|
+
result.version_count = (packument && packument.versions)
|
|
451
|
+
? Object.keys(packument.versions).length : 0;
|
|
393
452
|
return result;
|
|
394
453
|
}
|
|
395
454
|
|
|
455
|
+
// --- Pre-resolution helpers ---
|
|
456
|
+
//
|
|
457
|
+
// Resolve tarball URLs and metadata at ingestion time so scan workers do not
|
|
458
|
+
// each pay a separate registry round-trip. Best-effort: any failure leaves
|
|
459
|
+
// item.tarballUrl untouched (null) so resolveTarballAndScan() in queue.js
|
|
460
|
+
// falls back to its existing lazy-resolution path (zero scan loss).
|
|
461
|
+
//
|
|
462
|
+
// HTTP throttling: getNpmLatestTarball / getPyPITarballUrl already acquire
|
|
463
|
+
// the shared REGISTRY_SEMAPHORE_MAX=20 slot + 30 req/sec token bucket, so
|
|
464
|
+
// fan-out is naturally bounded — bursts queue up rather than overrun the
|
|
465
|
+
// registry. We still chunk explicitly below so the Promise closures don't
|
|
466
|
+
// pile up on a 1000-item catch-up batch (each waiting on the semaphore
|
|
467
|
+
// holds ~10KB of state; 1000 of them is a needless heap spike).
|
|
468
|
+
const PRE_RESOLVE_CHUNK_SIZE = 50;
|
|
469
|
+
|
|
470
|
+
// If a scanQueue is provided, items are pushed onto it as soon as their chunk
|
|
471
|
+
// finishes resolution — so a crash mid-batch only loses the current chunk's
|
|
472
|
+
// in-flight work, not all the chunks that already completed. When scanQueue
|
|
473
|
+
// is omitted (unit tests, lib usage), items are only mutated in place and the
|
|
474
|
+
// caller decides when to push.
|
|
475
|
+
async function preResolveNpmBatch(items, stats, scanQueue) {
|
|
476
|
+
if (!items || items.length === 0) return;
|
|
477
|
+
const start = Date.now();
|
|
478
|
+
let resolved = 0;
|
|
479
|
+
let alreadyResolved = 0;
|
|
480
|
+
let failed = 0;
|
|
481
|
+
for (let i = 0; i < items.length; i += PRE_RESOLVE_CHUNK_SIZE) {
|
|
482
|
+
const chunk = items.slice(i, i + PRE_RESOLVE_CHUNK_SIZE);
|
|
483
|
+
await Promise.all(chunk.map(async (item) => {
|
|
484
|
+
if (item.tarballUrl) { alreadyResolved++; return; }
|
|
485
|
+
try {
|
|
486
|
+
// Stage 2.1 — fetch downloads in parallel with the packument. The
|
|
487
|
+
// downloads endpoint (api.npmjs.org) is not on the registry semaphore
|
|
488
|
+
// and has its own internal cache, so this is effectively free in the
|
|
489
|
+
// warm-cache case and adds at most one parallel HTTP otherwise.
|
|
490
|
+
const [npmInfo, weeklyDownloads] = await Promise.all([
|
|
491
|
+
getNpmLatestTarball(item.name),
|
|
492
|
+
getWeeklyDownloads(item.name).catch(() => null)
|
|
493
|
+
]);
|
|
494
|
+
if (npmInfo && npmInfo.tarball) {
|
|
495
|
+
item.tarballUrl = npmInfo.tarball;
|
|
496
|
+
if (!item.version) item.version = npmInfo.version || '';
|
|
497
|
+
if (!item.unpackedSize) item.unpackedSize = npmInfo.unpackedSize || 0;
|
|
498
|
+
if (!item.registryScripts) item.registryScripts = npmInfo.scripts || null;
|
|
499
|
+
// weekly_downloads is best-effort. getWeeklyDownloads returns -1 on
|
|
500
|
+
// failure; normalize that to null so triageRisk treats it as missing
|
|
501
|
+
// (rather than silently biasing the reputation factor toward "suspect").
|
|
502
|
+
npmInfo.weekly_downloads = (typeof weeklyDownloads === 'number' && weeklyDownloads >= 0)
|
|
503
|
+
? weeklyDownloads : null;
|
|
504
|
+
// Stash full packument-derived metadata for resolveTarballAndScan so
|
|
505
|
+
// the worker can run ATO-signature, burst-extras, and fast-track logic
|
|
506
|
+
// without a second registry call. Stage 2.1 enriches this with
|
|
507
|
+
// age_days / version_count (from getNpmLatestTarball) and
|
|
508
|
+
// weekly_downloads (from getWeeklyDownloads) so the triage block in
|
|
509
|
+
// queue.js can read meta directly without re-fetching.
|
|
510
|
+
item._npmInfo = npmInfo;
|
|
511
|
+
resolved++;
|
|
512
|
+
} else {
|
|
513
|
+
failed++;
|
|
514
|
+
}
|
|
515
|
+
} catch {
|
|
516
|
+
// Silent: worker will retry via lazy resolution. Logging here would
|
|
517
|
+
// double-count errors that the worker already surfaces.
|
|
518
|
+
failed++;
|
|
519
|
+
}
|
|
520
|
+
}));
|
|
521
|
+
// Crash resilience: surface this chunk to the queue now, before the next
|
|
522
|
+
// chunk starts. If the process dies between chunks we still keep the work
|
|
523
|
+
// already done. Items keep their original order because chunks complete
|
|
524
|
+
// sequentially.
|
|
525
|
+
if (scanQueue) {
|
|
526
|
+
for (const item of chunk) scanQueue.push(item);
|
|
527
|
+
}
|
|
528
|
+
}
|
|
529
|
+
if (stats) {
|
|
530
|
+
stats.npmPreResolved = (stats.npmPreResolved || 0) + resolved;
|
|
531
|
+
stats.npmPreResolveFailed = (stats.npmPreResolveFailed || 0) + failed;
|
|
532
|
+
}
|
|
533
|
+
if (items.length >= 5) {
|
|
534
|
+
const elapsed = Date.now() - start;
|
|
535
|
+
console.log(`[MONITOR] PRE-RESOLVE npm: ${resolved}/${items.length} in ${elapsed}ms (${failed} → lazy fallback${alreadyResolved ? `, ${alreadyResolved} already resolved` : ''})`);
|
|
536
|
+
}
|
|
537
|
+
}
|
|
538
|
+
|
|
539
|
+
async function preResolvePyPIBatch(items, stats, scanQueue) {
|
|
540
|
+
if (!items || items.length === 0) return;
|
|
541
|
+
const start = Date.now();
|
|
542
|
+
let resolved = 0;
|
|
543
|
+
let alreadyResolved = 0;
|
|
544
|
+
let failed = 0;
|
|
545
|
+
for (let i = 0; i < items.length; i += PRE_RESOLVE_CHUNK_SIZE) {
|
|
546
|
+
const chunk = items.slice(i, i + PRE_RESOLVE_CHUNK_SIZE);
|
|
547
|
+
await Promise.all(chunk.map(async (item) => {
|
|
548
|
+
if (item.tarballUrl) { alreadyResolved++; return; }
|
|
549
|
+
try {
|
|
550
|
+
const pypiInfo = await getPyPITarballUrl(item.name, item.version || '');
|
|
551
|
+
if (pypiInfo && pypiInfo.url) {
|
|
552
|
+
item.tarballUrl = pypiInfo.url;
|
|
553
|
+
if (!item.version && pypiInfo.version) item.version = pypiInfo.version;
|
|
554
|
+
// Stage 2 triage signals: stash age_days + version_count for
|
|
555
|
+
// triageRisk() to read in queue.js without a second registry call.
|
|
556
|
+
item._pypiInfo = {
|
|
557
|
+
age_days: pypiInfo.age_days,
|
|
558
|
+
version_count: pypiInfo.version_count,
|
|
559
|
+
};
|
|
560
|
+
resolved++;
|
|
561
|
+
} else {
|
|
562
|
+
failed++;
|
|
563
|
+
}
|
|
564
|
+
} catch {
|
|
565
|
+
failed++;
|
|
566
|
+
}
|
|
567
|
+
}));
|
|
568
|
+
if (scanQueue) {
|
|
569
|
+
for (const item of chunk) scanQueue.push(item);
|
|
570
|
+
}
|
|
571
|
+
}
|
|
572
|
+
if (stats) {
|
|
573
|
+
stats.pypiPreResolved = (stats.pypiPreResolved || 0) + resolved;
|
|
574
|
+
stats.pypiPreResolveFailed = (stats.pypiPreResolveFailed || 0) + failed;
|
|
575
|
+
}
|
|
576
|
+
if (items.length >= 5) {
|
|
577
|
+
const elapsed = Date.now() - start;
|
|
578
|
+
console.log(`[MONITOR] PRE-RESOLVE pypi: ${resolved}/${items.length} in ${elapsed}ms (${failed} → lazy fallback${alreadyResolved ? `, ${alreadyResolved} already resolved` : ''})`);
|
|
579
|
+
}
|
|
580
|
+
}
|
|
581
|
+
|
|
396
582
|
// --- npm polling ---
|
|
397
583
|
|
|
398
584
|
/**
|
|
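Both `preResolveNpmBatch` and `preResolvePyPIBatch` share one control flow: resolve a fixed-size chunk in parallel, push the whole chunk to the queue, then move to the next chunk. The sketch below is a generic reduction of that loop under stated assumptions: `resolveInChunks`, its options object, and the `demo` data are hypothetical, and the registry semaphore and stats counters of the real helpers are left out.

```javascript
// Hypothetical generic form of the preResolve*Batch loop; not the package's API.
async function resolveInChunks(items, resolveOne, { chunkSize = 50, queue = null } = {}) {
  const counts = { resolved: 0, alreadyResolved: 0, failed: 0 };
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    // Resolve one chunk in parallel; a failure never rejects the whole batch.
    await Promise.all(chunk.map(async (item) => {
      if (item.tarballUrl) { counts.alreadyResolved++; return; }
      try {
        const url = await resolveOne(item);
        if (url) { item.tarballUrl = url; counts.resolved++; }
        else counts.failed++;
      } catch {
        counts.failed++; // tarballUrl stays null so a later lazy lookup can retry
      }
    }));
    // Push the finished chunk before starting the next one: a crash between
    // chunks loses only in-flight work, and original item order is preserved.
    if (queue) for (const item of chunk) queue.push(item);
  }
  return counts;
}

async function demo() {
  const items = [
    { name: 'a', tarballUrl: null },
    { name: 'b', tarballUrl: 'https://example.invalid/b.tgz' },
    { name: 'c', tarballUrl: null },
    { name: 'd', tarballUrl: null },
    { name: 'e', tarballUrl: null },
  ];
  const queue = [];
  const counts = await resolveInChunks(items, async (item) => {
    if (item.name === 'c') throw new Error('registry timeout');
    if (item.name === 'd') return null; // resolved, but no usable tarball
    return `https://example.invalid/${item.name}.tgz`;
  }, { chunkSize: 2, queue });
  return { counts, order: queue.map(i => i.name), items };
}
```

Every item reaches the queue whether or not resolution succeeded, which is the property the diff's comments describe as zero scan loss: unresolved items simply arrive with `tarballUrl` still `null`.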
@@ -481,6 +667,10 @@ async function pollNpmChanges(state, scanQueue, stats) {
|
|
|
481
667
|
stats.npmPublishEventsSeen = (stats.npmPublishEventsSeen || 0) + data.results.length;
|
|
482
668
|
|
|
483
669
|
let queued = 0;
|
|
670
|
+
// Collect items into a local batch so we can pre-resolve tarball URLs in
|
|
671
|
+
// parallel before pushing to scanQueue. Items reach workers with metadata
|
|
672
|
+
// already attached → workers skip the per-scan registry round-trip.
|
|
673
|
+
const newItems = [];
|
|
484
674
|
for (const change of data.results) {
|
|
485
675
|
// Skip deleted packages
|
|
486
676
|
if (change.deleted) continue;
|
|
@@ -547,11 +737,10 @@ async function pollNpmChanges(state, scanQueue, stats) {
|
|
|
547
737
|
// Layer 3: Evaluate if this package should be cached
|
|
548
738
|
const cacheTrigger = evaluateCacheTrigger(name, docMeta, change.doc || null);
|
|
549
739
|
|
|
550
|
-
//
|
|
551
|
-
//
|
|
552
|
-
//
|
|
553
|
-
|
|
554
|
-
scanQueue.push({
|
|
740
|
+
// Post-May 2025: change.doc is always null, so docMeta is null and tarballUrl
|
|
741
|
+
// starts as null. preResolveNpmBatch below fills tarballUrl + metadata via
|
|
742
|
+
// a parallel registry fetch so workers do not pay the round-trip per scan.
|
|
743
|
+
newItems.push({
|
|
555
744
|
name,
|
|
556
745
|
version: docMeta ? docMeta.version : '',
|
|
557
746
|
ecosystem: 'npm',
|
|
@@ -564,6 +753,11 @@ async function pollNpmChanges(state, scanQueue, stats) {
|
|
|
564
753
|
queued++;
|
|
565
754
|
}
|
|
566
755
|
|
|
756
|
+
// Parallel pre-resolution, pushed chunk by chunk for crash resilience.
|
|
757
|
+
// Failures leave tarballUrl=null so the existing lazy-resolution path in
|
|
758
|
+
// resolveTarballAndScan() picks up the slack — zero scan loss.
|
|
759
|
+
await preResolveNpmBatch(newItems, stats, scanQueue);
|
|
760
|
+
|
|
567
761
|
// Update seq in memory only — disk persistence is handled by daemon.js
|
|
568
762
|
// after both queue and seq are saved atomically (prevents data loss on crash).
|
|
569
763
|
if (data.last_seq != null) {
|
|
@@ -623,6 +817,7 @@ async function pollNpmRss(state, scanQueue, stats) {
|
|
|
623
817
|
// falls back to RSS.
|
|
624
818
|
stats.npmPublishEventsSeen = (stats.npmPublishEventsSeen || 0) + newPackages.length;
|
|
625
819
|
|
|
820
|
+
const newItems = [];
|
|
626
821
|
for (const name of newPackages) {
|
|
627
822
|
if (name === SELF_PACKAGE_NAME) {
|
|
628
823
|
console.log(`[MONITOR] SKIPPED (self): ${name}`);
|
|
@@ -666,15 +861,18 @@ async function pollNpmRss(state, scanQueue, stats) {
|
|
|
666
861
|
}
|
|
667
862
|
}
|
|
668
863
|
|
|
669
|
-
|
|
670
|
-
scanQueue.push({
|
|
864
|
+
newItems.push({
|
|
671
865
|
name,
|
|
672
866
|
version: '',
|
|
673
867
|
ecosystem: 'npm',
|
|
674
|
-
tarballUrl: null // resolved
|
|
868
|
+
tarballUrl: null // pre-resolved below; lazy fallback preserved on failure
|
|
675
869
|
});
|
|
676
870
|
}
|
|
677
871
|
|
|
872
|
+
// Parallel pre-resolution with per-chunk push → crash-resilient and saves
|
|
873
|
+
// the worker's per-scan registry round-trip when it succeeds.
|
|
874
|
+
await preResolveNpmBatch(newItems, stats, scanQueue);
|
|
875
|
+
|
|
678
876
|
// Remember the most recent package (first in RSS)
|
|
679
877
|
if (packages.length > 0) {
|
|
680
878
|
state.npmLastPackage = packages[0];
|
|
@@ -901,6 +1099,7 @@ async function pollPyPIChangelog(state, scanQueue, stats) {
|
|
|
901
1099
|
const seen = new Set();
|
|
902
1100
|
let queued = 0;
|
|
903
1101
|
let maxSerial = lastSerial;
|
|
1102
|
+
const newItems = [];
|
|
904
1103
|
|
|
905
1104
|
for (const ev of events) {
|
|
906
1105
|
if (ev.serial > maxSerial) maxSerial = ev.serial;
|
|
@@ -932,16 +1131,20 @@ async function pollPyPIChangelog(state, scanQueue, stats) {
|
|
|
932
1131
|
}
|
|
933
1132
|
} catch { /* IOC load failure is non-fatal */ }
|
|
934
1133
|
|
|
935
|
-
|
|
1134
|
+
newItems.push({
|
|
936
1135
|
name: ev.name,
|
|
937
1136
|
version: ev.version,
|
|
938
1137
|
ecosystem: 'pypi',
|
|
939
|
-
tarballUrl: null, // resolved
|
|
1138
|
+
tarballUrl: null, // pre-resolved below; lazy fallback preserved
|
|
940
1139
|
isIOCMatch: isKnownIOC
|
|
941
1140
|
});
|
|
942
1141
|
queued++;
|
|
943
1142
|
}
|
|
944
1143
|
|
|
1144
|
+
// Parallel pre-resolution with per-chunk push to scanQueue. Failures keep
|
|
1145
|
+
// tarballUrl=null so resolveTarballAndScan() falls back to lazy lookup.
|
|
1146
|
+
await preResolvePyPIBatch(newItems, stats, scanQueue);
|
|
1147
|
+
|
|
945
1148
|
// Persist the serial both in memory and on disk before returning.
|
|
946
1149
|
// daemon.js also flushes state.json after the queue is saved, but writing the
|
|
947
1150
|
// dedicated serial file here means a crash between the two flush points costs
|
|
@@ -996,17 +1199,22 @@ async function pollPyPIRss(state, scanQueue) {
|
|
|
996
1199
|
}
|
|
997
1200
|
}
|
|
998
1201
|
|
|
1202
|
+
const newItems = [];
|
|
999
1203
|
for (const name of newPackages) {
|
|
1000
1204
|
console.log(`[MONITOR] New pypi (rss): ${name}`);
|
|
1001
|
-
|
|
1002
|
-
scanQueue.push({
|
|
1205
|
+
newItems.push({
|
|
1003
1206
|
name,
|
|
1004
1207
|
version: '',
|
|
1005
1208
|
ecosystem: 'pypi',
|
|
1006
|
-
tarballUrl: null // resolved
|
|
1209
|
+
tarballUrl: null // pre-resolved below; lazy fallback preserved
|
|
1007
1210
|
});
|
|
1008
1211
|
}
|
|
1009
1212
|
|
|
1213
|
+
// pollPyPIRss does not have a stats arg today; pass {} so the helper still
|
|
1214
|
+
// runs but per-poll counters are dropped. The PRE-RESOLVE log line gives
|
|
1215
|
+
// operational visibility regardless. scanQueue is passed for per-chunk push.
|
|
1216
|
+
await preResolvePyPIBatch(newItems, {}, scanQueue);
|
|
1217
|
+
|
|
1010
1218
|
// Remember the most recent package (first in RSS)
|
|
1011
1219
|
if (packages.length > 0) {
|
|
1012
1220
|
state.pypiLastPackage = packages[0];
|
|
@@ -1119,6 +1327,8 @@ module.exports = {
|
|
|
1119
1327
|
getNpmTarballUrl,
|
|
1120
1328
|
getPyPITarballUrl,
|
|
1121
1329
|
getNpmLatestTarball,
|
|
1330
|
+
preResolveNpmBatch,
|
|
1331
|
+
preResolvePyPIBatch,
|
|
1122
1332
|
|
|
1123
1333
|
// RSS parsing
|
|
1124
1334
|
parseNpmRss,
|
package/src/monitor/queue.js
CHANGED
|
@@ -73,6 +73,7 @@ const {
|
|
|
73
73
|
buildCanaryExfiltrationWebhookEmbed,
|
|
74
74
|
getWebhookUrl,
|
|
75
75
|
computeReputationFactor,
|
|
76
|
+
triageRisk,
|
|
76
77
|
computeRiskLevel,
|
|
77
78
|
sendDailyReport,
|
|
78
79
|
alertedPackageRules,
|
|
@@ -127,6 +128,22 @@ const LARGE_PACKAGE_SIZE = 10 * 1024 * 1024; // 10MB
|
|
|
127
128
|
const FIRST_PUBLISH_SANDBOX_MAX_QUEUE = parseInt(process.env.MUADDIB_FIRST_PUBLISH_SANDBOX_MAX_QUEUE, 10) || 10;
|
|
128
129
|
const FIRST_PUBLISH_SANDBOX_ENABLED = process.env.MUADDIB_FIRST_PUBLISH_SANDBOX !== '0';
|
|
129
130
|
|
|
131
|
+
// Stage 3 — sandbox gate. Static-score threshold below which T1b/T2 packages
|
|
132
|
+
// are NOT sandboxed (static result alone is authoritative). Tightens the prior
|
|
133
|
+
// "T1b sandbox if score >= 25 or queue < 20" to remove low-signal sandbox runs
|
|
134
|
+
// that consume slots without producing actionable findings (the dominant cost
|
|
135
|
+
// in the queue-saturation diagnostic). Validated by axon-enterprise@1.0.0
|
|
136
|
+
// (static 52, sandbox confirmed 100) — gate >= 40 still catches it.
|
|
137
|
+
// T1a (high-confidence malice) bypasses this gate; it's mandatory.
|
|
138
|
+
// Override via env var to widen the gate (lower threshold) for a short
|
|
139
|
+
// rollback window without redeploying. Clamped to [0, 100].
|
|
140
|
+
function computeSandboxScoreThreshold(envValue) {
|
|
141
|
+
const parsed = parseInt(envValue, 10);
|
|
142
|
+
const value = Number.isFinite(parsed) ? parsed : 40;
|
|
143
|
+
return Math.max(0, Math.min(100, value));
|
|
144
|
+
}
|
|
145
|
+
const SANDBOX_SCORE_THRESHOLD = computeSandboxScoreThreshold(process.env.MUADDIB_SANDBOX_SCORE_THRESHOLD);
|
|
146
|
+
|
|
130
147
|
// --- Bundled tooling false-positive filter ---
|
|
131
148
|
|
|
132
149
|
const KNOWN_BUNDLED_FILES = ['yarn.js', 'webpack.js', 'terser.js', 'esbuild.js', 'polyfills.js'];
|
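The fallback and clamping behaviour of `computeSandboxScoreThreshold` can be checked in isolation. The sketch below copies the function from the hunk above and feeds it a few inputs; the sample values are assumptions chosen for illustration, not settings taken from any deployment.

```javascript
// Copied from the hunk above: parse the env value, fall back to 40
// when it is not a number, then clamp the result to [0, 100].
function computeSandboxScoreThreshold(envValue) {
  const parsed = parseInt(envValue, 10);
  const value = Number.isFinite(parsed) ? parsed : 40;
  return Math.max(0, Math.min(100, value));
}

// Illustrative inputs (assumed for this sketch only).
const samples = [undefined, '', 'abc', '25', '0', '150', '-5'];
for (const sample of samples) {
  console.log(String(sample), '->', computeSandboxScoreThreshold(sample));
}
```

Unset, empty, and non-numeric values all fall back to 40, while out-of-range values are pulled to the nearest bound. An explicit `0` is kept as 0, which would let every T1b/T2 score pass the gate, assuming scores are never negative.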
|
@@ -444,7 +461,11 @@ async function scanPackage(name, version, ecosystem, tarballUrl, registryMeta, s
|
|
|
444
461
|
version,
|
|
445
462
|
ecosystem,
|
|
446
463
|
monitorMode: true,
|
|
447
|
-
trustedDepDiff: true
|
|
464
|
+
trustedDepDiff: true,
|
|
465
|
+
// Stage 2: set by processQueueItem when MUADDIB_TRIAGE_MODE=enforce.
|
|
466
|
+
// Defaults to 'full' so any CLI/test caller that bypasses triage gets
|
|
467
|
+
// the full 20-scanner pipeline (unchanged behaviour).
|
|
468
|
+
scanMode: (meta && meta.scanMode) || 'full'
|
|
448
469
|
};
|
|
449
470
|
result = await runScanInWorker(extractedDir, STATIC_SCAN_TIMEOUT_MS, scanContext);
|
|
450
471
|
} catch (staticErr) {
|
|
@@ -733,14 +754,16 @@ async function scanPackage(name, version, ecosystem, tarballUrl, registryMeta, s
|
|
|
733
754
|
}
|
|
734
755
|
|
|
735
756
|
// T1a: mandatory sandbox (HC malice types, TIER1_TYPES non-LOW, lifecycle + intent compound)
|
|
736
|
-
// T1b: conditional sandbox
|
|
737
|
-
//
|
|
738
|
-
//
|
|
757
|
+
// T1b: conditional sandbox — gated by SANDBOX_SCORE_THRESHOLD (Stage 3).
|
|
758
|
+
// Previously gated at >= 25 OR queue < 20; tightened to >= 40 by
|
|
759
|
+
// default because the 25-39 band produced no decisive sandbox
|
|
760
|
+
// findings in 4 months of prod data (axon-enterprise was at 52).
|
|
761
|
+
// T2: conditional sandbox — same score gate AND queue < 50.
|
|
739
762
|
let sandboxResult = null;
|
|
740
763
|
const shouldSandbox = !skipSandboxLargePackage && isSandboxEnabled() && sandboxAvailable && (
|
|
741
764
|
tier === '1a' ||
|
|
742
|
-
(tier === '1b' &&
|
|
743
|
-
(tier === 2 && scanQueue.length < 50)
|
|
765
|
+
(tier === '1b' && riskScore >= SANDBOX_SCORE_THRESHOLD) ||
|
|
766
|
+
(tier === 2 && riskScore >= SANDBOX_SCORE_THRESHOLD && scanQueue.length < 50)
|
|
744
767
|
);
|
|
745
768
|
|
|
746
769
|
if (shouldSandbox) {
|
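The new `shouldSandbox` condition can be read as a pure predicate. The following is a minimal sketch under stated assumptions: the function name `shouldSandboxSketch` and its parameter object are hypothetical (in `queue.js` these are local variables and helper calls), and the default threshold of 40 stands in for `SANDBOX_SCORE_THRESHOLD`.

```javascript
// Hypothetical pure-function restatement of the shouldSandbox expression.
// The parameter names are assumptions made for this sketch.
function shouldSandboxSketch({
  tier,
  riskScore,
  queueLength,
  threshold = 40,          // stands in for SANDBOX_SCORE_THRESHOLD
  sandboxEnabled = true,   // stands in for isSandboxEnabled()
  sandboxAvailable = true,
  skipLargePackage = false // stands in for skipSandboxLargePackage
}) {
  if (skipLargePackage || !sandboxEnabled || !sandboxAvailable) return false;
  return tier === '1a' ||
    (tier === '1b' && riskScore >= threshold) ||
    (tier === 2 && riskScore >= threshold && queueLength < 50);
}

// T1a is mandatory regardless of score.
console.log(shouldSandboxSketch({ tier: '1a', riskScore: 10, queueLength: 500 }));
// Score 52 is the axon-enterprise figure quoted in the comments; tier 1b is assumed here.
console.log(shouldSandboxSketch({ tier: '1b', riskScore: 52, queueLength: 500 }));
// The 25-39 band no longer qualifies under the default gate.
console.log(shouldSandboxSketch({ tier: '1b', riskScore: 30, queueLength: 5 }));
// T2 additionally requires a queue shorter than 50.
console.log(shouldSandboxSketch({ tier: 2, riskScore: 60, queueLength: 50 }));
```

Only the boolean expression is modelled; the deferred and gated branches that follow in `scanPackage` depend on surrounding code not shown in this hunk.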
|
@@ -808,8 +831,12 @@ async function scanPackage(name, version, ecosystem, tarballUrl, registryMeta, s
|
|
|
808
831
|
} catch (err) {
|
|
809
832
|
console.error(`[MONITOR] SANDBOX error for ${name}@${version}: ${err.message}`);
|
|
810
833
|
}
|
|
811
|
-
} else if (tier === '1b' && sandboxAvailable) {
|
|
812
|
-
|
|
834
|
+
} else if (tier === '1b' && sandboxAvailable && riskScore >= SANDBOX_SCORE_THRESHOLD) {
|
|
835
|
+
// Stage 3 — defer only when the score crosses the gate. Below the
|
|
836
|
+
// threshold, sandbox is skipped entirely (static result is final).
|
|
837
|
+
// This stops the deferred-queue from filling with low-score items
|
|
838
|
+
// that would never produce decisive sandbox findings.
|
|
839
|
+
console.log(`[MONITOR] SANDBOX DEFERRED (T1b, score=${riskScore}, queue ${scanQueue.length}): ${name}@${version}`);
|
|
813
840
|
enqueueDeferred({
|
|
814
841
|
name, version, ecosystem, tier, riskScore, tarballUrl,
|
|
815
842
|
enqueuedAt: Date.now(),
|
|
@@ -818,10 +845,14 @@ async function scanPackage(name, version, ecosystem, tarballUrl, registryMeta, s
|
|
|
818
845
|
retries: 0
|
|
819
846
|
});
|
|
820
847
|
stats.sandboxDeferred = (stats.sandboxDeferred || 0) + 1;
|
|
848
|
+
} else if (tier === '1b' && sandboxAvailable) {
|
|
849
|
+
// Below SANDBOX_SCORE_THRESHOLD — no sandbox, no defer.
|
|
850
|
+
console.log(`[MONITOR] SANDBOX GATED (T1b, score=${riskScore} < ${SANDBOX_SCORE_THRESHOLD}): ${name}@${version}`);
|
|
851
|
+
stats.sandboxGated = (stats.sandboxGated || 0) + 1;
|
|
821
852
|
} else if (tier === '1b') {
|
|
822
853
|
console.log(`[MONITOR] SANDBOX SKIPPED (T1b, no Docker): ${name}@${version}`);
|
|
823
|
-
} else if (tier === 2 && sandboxAvailable) {
|
|
824
|
-
console.log(`[MONITOR] SANDBOX DEFERRED (T2, queue ${scanQueue.length}
|
|
854
|
+
} else if (tier === 2 && sandboxAvailable && riskScore >= SANDBOX_SCORE_THRESHOLD) {
|
|
855
|
+
console.log(`[MONITOR] SANDBOX DEFERRED (T2, score=${riskScore}, queue ${scanQueue.length}): ${name}@${version}`);
|
|
825
856
|
enqueueDeferred({
|
|
826
857
|
name, version, ecosystem, tier, riskScore, tarballUrl,
|
|
827
858
|
enqueuedAt: Date.now(),
|
|
@@ -830,6 +861,11 @@ async function scanPackage(name, version, ecosystem, tarballUrl, registryMeta, s
|
|
|
830
861
|
retries: 0
|
|
831
862
|
});
|
|
832
863
|
stats.sandboxDeferred = (stats.sandboxDeferred || 0) + 1;
|
|
864
|
+
} else if (tier === 2 && sandboxAvailable) {
|
|
865
|
+
// Below SANDBOX_SCORE_THRESHOLD — T2 was already passive; staying
|
|
866
|
+
// static-only matches the existing T3 behaviour.
|
|
867
|
+
console.log(`[MONITOR] SANDBOX GATED (T2, score=${riskScore} < ${SANDBOX_SCORE_THRESHOLD}): ${name}@${version}`);
|
|
868
|
+
stats.sandboxGated = (stats.sandboxGated || 0) + 1;
|
|
833
869
|
} else if (tier === 2) {
|
|
834
870
|
console.log(`[MONITOR] SANDBOX SKIPPED (T2, no Docker): ${name}@${version}`);
|
|
835
871
|
}
|
|
@@ -1114,65 +1150,78 @@ async function processQueue(scanQueue, stats, dailyAlerts, recentlyScanned, down
|
|
|
1114
1150
|
async function resolveTarballAndScan(item, stats, dailyAlerts, recentlyScanned, downloadsCache, scanQueue, sandboxAvailable, signal) {
|
|
1115
1151
|
if (signal && signal.aborted) return;
|
|
1116
1152
|
|
|
1117
|
-
if (item.ecosystem === 'npm'
|
|
1153
|
+
if (item.ecosystem === 'npm') {
|
|
1154
|
+
// Pre-resolve at ingestion (ingestion.js:preResolveNpmBatch) attaches
|
|
1155
|
+
// _npmInfo when it succeeds. Lazy path runs only when pre-resolve was
|
|
1156
|
+
// skipped or failed — in which case _npmInfo is absent and tarballUrl is
|
|
1157
|
+
// null. Either way, ATO / burst-extras / fast-track logic below runs on
|
|
1158
|
+
// whichever npmInfo we have, preserving full behavior.
|
|
1159
|
+
let npmInfo = item._npmInfo || null;
|
|
1118
1160
|
try {
|
|
1119
|
-
|
|
1120
|
-
|
|
1121
|
-
|
|
1122
|
-
|
|
1123
|
-
|
|
1124
|
-
|
|
1125
|
-
|
|
1126
|
-
|
|
1127
|
-
|
|
1128
|
-
|
|
1129
|
-
// ATO signature: most-recently-published version differs from current
|
|
1130
|
-
// dist-tags.latest. Pattern observed in TeamPCP / @antv 2026-05-19:
|
|
1131
|
-
// attacker publishes 1-2 versions per package but does NOT bump the latest
|
|
1132
|
-
// tag. semver resolution on `npm install <pkg>@^x.y` still pulls the
|
|
1133
|
-
// malicious version. The mismatch is a strong ATO signal — legitimate
|
|
1134
|
-
// maintainers almost always move latest when publishing.
|
|
1135
|
-
if (npmInfo.latestTagVersion && npmInfo.version && npmInfo.version !== npmInfo.latestTagVersion) {
|
|
1136
|
-
item.atoSignal = true;
|
|
1137
|
-
console.log(`[MONITOR] ATO SIGNAL: ${item.name}@${item.version} published but dist-tags.latest=${npmInfo.latestTagVersion}`);
|
|
1161
|
+
if (!item.tarballUrl) {
|
|
1162
|
+
npmInfo = await getNpmLatestTarball(item.name);
|
|
1163
|
+
if (!npmInfo.tarball) {
|
|
1164
|
+
console.log(`[MONITOR] SKIP: ${item.name} — no tarball URL found on npm`);
|
|
1165
|
+
return;
|
|
1166
|
+
}
|
|
1167
|
+
item.tarballUrl = npmInfo.tarball;
|
|
1168
|
+
if (npmInfo.version) item.version = npmInfo.version;
|
|
1169
|
+
if (npmInfo.unpackedSize) item.unpackedSize = npmInfo.unpackedSize;
|
|
1170
|
+
if (npmInfo.scripts) item.registryScripts = npmInfo.scripts;
|
|
1138
1171
|
}
|
|
1139
1172
|
|
|
1140
|
-
|
|
1141
|
-
|
|
1142
|
-
|
|
1143
|
-
|
|
1144
|
-
|
|
1145
|
-
|
|
1146
|
-
|
|
1147
|
-
|
|
1148
|
-
|
|
1149
|
-
|
|
1150
|
-
|
|
1151
|
-
scanQueue.push({
|
|
1152
|
-
name: item.name,
|
|
1153
|
-
version: recent.version,
|
|
1154
|
-
ecosystem: 'npm',
|
|
1155
|
-
tarballUrl: recent.tarball,
|
|
1156
|
-
unpackedSize: recent.unpackedSize || 0,
|
|
1157
|
-
registryScripts: recent.scripts || null,
|
|
1158
|
-
atoSignal: item.atoSignal === true,
|
|
1159
|
-
isATOBurstExtra: true,
|
|
1160
|
-
});
|
|
1161
|
-
}
|
|
1173
|
+
if (npmInfo) {
|
|
1174
|
+
// ATO signature: most-recently-published version differs from current
|
|
1175
|
+
// dist-tags.latest. Pattern observed in TeamPCP / @antv 2026-05-19:
|
|
1176
|
+
// attacker publishes 1-2 versions per package but does NOT bump the latest
|
|
1177
|
+
// tag. semver resolution on `npm install <pkg>@^x.y` still pulls the
|
|
1178
|
+
// malicious version. The mismatch is a strong ATO signal — legitimate
|
|
1179
|
+
// maintainers almost always move latest when publishing.
|
|
1180
|
+
if (npmInfo.latestTagVersion && item.version && item.version !== npmInfo.latestTagVersion) {
|
|
1181
|
+
item.atoSignal = true;
|
|
1182
|
+
console.log(`[MONITOR] ATO SIGNAL: ${item.name}@${item.version} published but dist-tags.latest=${npmInfo.latestTagVersion}`);
|
|
1183
|
+
}
|
|
1162
1184
|
|
|
1163
|
-
|
|
1164
|
-
|
|
1165
|
-
|
|
1166
|
-
|
|
1167
|
-
|
|
1168
|
-
|
|
1169
|
-
|
|
1170
|
-
|
|
1171
|
-
|
|
1172
|
-
|
|
1173
|
-
|
|
1174
|
-
|
|
1185
|
+
// Burst-publish coverage: enqueue extra versions published in the same
|
|
1186
|
+
// recent window. Single change event in the CouchDB feed can correspond
|
|
1187
|
+
// to multiple version publishes when the attacker fires several in a
|
|
1188
|
+
// burst (TeamPCP averaged ~2 versions per package). Without this we'd
|
|
1189
|
+
// only scan whichever version happened to be the most recent at resolution
|
|
1190
|
+
// time, racing the publish stream.
|
|
1191
|
+
const recents = Array.isArray(npmInfo.recentVersions) ? npmInfo.recentVersions : [];
|
|
1192
|
+
for (const recent of recents) {
|
|
1193
|
+
if (!recent || !recent.tarball || !recent.version) continue;
|
|
1194
|
+
const dedupeKey = `${item.name}@${recent.version}`;
|
|
1195
|
+
if (recentlyScanned.has(dedupeKey)) continue;
|
|
1196
|
+
scanQueue.push({
|
|
1197
|
+
name: item.name,
|
|
1198
|
+
version: recent.version,
|
|
1199
|
+
ecosystem: 'npm',
|
|
1200
|
+
tarballUrl: recent.tarball,
|
|
1201
|
+
unpackedSize: recent.unpackedSize || 0,
|
|
1202
|
+
registryScripts: recent.scripts || null,
|
|
1203
|
+
atoSignal: item.atoSignal === true,
|
|
1204
|
+
isATOBurstExtra: true,
|
|
1205
|
+
});
|
|
1206
|
+
}
|
|
1207
|
+
|
|
1208
|
+
// Fast-track decision: large packages (>15MB) with no lifecycle scripts and no IOC match.
|
|
1209
|
+
// Fast-track packages get: quick static scan (package.json + shell only), no AST,
|
|
1210
|
+
// no sandbox, no LLM, no archiving. Exits in ~2-3s instead of 30-300s.
|
|
1211
|
+
// ATO-signalled packages bypass fast-track regardless of size — we want
|
|
1212
|
+
// the full pipeline (AST + sandbox) on anything that smells like an ATO.
|
|
1213
|
+
const FAST_TRACK_SIZE_BYTES = 15 * 1024 * 1024;
|
|
1214
|
+
if (!item.isIOCMatch && !item.atoSignal && (item.unpackedSize || 0) > FAST_TRACK_SIZE_BYTES) {
|
|
1215
|
+
const scripts = item.registryScripts || {};
|
|
1216
|
+
if (!scripts.preinstall && !scripts.postinstall && !scripts.install) {
|
|
1217
|
+
item.fastTrack = true;
|
|
1218
|
+
}
|
|
1175
1219
|
}
|
|
1220
|
+
|
|
1221
|
+
// Free the packument-derived metadata once the per-item decisions are
|
|
1222
|
+
// made — keeps queue items lean (a 28k-item queue × full packument JSON
|
|
1223
|
+
// would be tens of MB of useless heap).
|
|
1224
|
+
if (item._npmInfo) delete item._npmInfo;
|
|
1176
1225
|
}
|
|
1177
1226
|
} catch (err) {
|
|
1178
1227
|
console.error(`[MONITOR] ERROR resolving npm tarball for ${item.name}: ${err.message}`);
|
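The ATO check and the fast-track election in the hunk above are both small predicates over the queue item. The sketch below restates them as standalone functions; the names `hasAtoSignal` and `isFastTrack` and the sample items are hypothetical, introduced only to make the conditions easy to exercise.

```javascript
const FAST_TRACK_SIZE_BYTES = 15 * 1024 * 1024;

// Restates the ATO condition: the version being scanned differs from
// dist-tags.latest. Returns false when either value is missing.
function hasAtoSignal(item, npmInfo) {
  return Boolean(
    npmInfo && npmInfo.latestTagVersion && item.version &&
    item.version !== npmInfo.latestTagVersion
  );
}

// Restates the fast-track election: large package, no IOC match,
// no ATO signal, and no install-time lifecycle scripts.
function isFastTrack(item) {
  if (item.isIOCMatch || item.atoSignal) return false;
  if ((item.unpackedSize || 0) <= FAST_TRACK_SIZE_BYTES) return false;
  const scripts = item.registryScripts || {};
  return !scripts.preinstall && !scripts.postinstall && !scripts.install;
}

// Hypothetical sample: a 20 MB package whose new version did not move `latest`.
const item = { name: 'example-pkg', version: '1.2.4', unpackedSize: 20 * 1024 * 1024 };
item.atoSignal = hasAtoSignal(item, { latestTagVersion: '1.2.3' });
console.log(item.atoSignal, isFastTrack(item));
```

Because the ATO flag is set before the fast-track check, a large package with a `latest` mismatch stays on the full pipeline, which matches the ordering in the hunk.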
|
@@ -1265,11 +1314,52 @@ async function resolveTarballAndScan(item, stats, dailyAlerts, recentlyScanned,
|
|
|
1265
1314
|
// Abort check: if timeout fired during temporal checks, skip the expensive scan
|
|
1266
1315
|
if (signal && signal.aborted) return;
|
|
1267
1316
|
|
|
1317
|
+
// Stage 2 — Pass A triage. Decides whether the static scan runs all 20
|
|
1318
|
+
// scanners or a quick_scan subset. Defaults to full when:
|
|
1319
|
+
// - env MUADDIB_TRIAGE_MODE !== 'enforce' (off | shadow | unset)
|
|
1320
|
+
// - the item is fastTrack-elected (already a more aggressive subset)
|
|
1321
|
+
// - any suspect signal flips triageRisk to 'full'
|
|
1322
|
+
// Shadow mode computes + logs the decision but still runs full — safe way
|
|
1323
|
+
// to observe classification share before flipping enforce.
|
|
1324
|
+
const triageMode = (process.env.MUADDIB_TRIAGE_MODE || 'off').toLowerCase();
|
|
1325
|
+
let effectiveScanMode = 'full';
|
|
1326
|
+
if (triageMode !== 'off' && !item.fastTrack) {
|
|
1327
|
+
let triageMeta = null;
|
|
1328
|
+
if (item.ecosystem === 'npm') {
|
|
1329
|
+
// Stage 2.1 — Stage 1 pre-resolve already fetched the packument and
|
|
1330
|
+
// (Stage 2.1) computed age_days + version_count, plus parallel-fetched
|
|
1331
|
+
// weekly_downloads. Read those directly to skip the second
|
|
1332
|
+
// registry round-trip via getPackageMetadata. Fallback to the lazy
|
|
1333
|
+
// metadata fetch only when _npmInfo is absent (lazy-resolve path).
|
|
1334
|
+
if (item._npmInfo) {
|
|
1335
|
+
triageMeta = {
|
|
1336
|
+
age_days: item._npmInfo.age_days,
|
|
1337
|
+
version_count: item._npmInfo.version_count,
|
|
1338
|
+
weekly_downloads: item._npmInfo.weekly_downloads,
|
|
1339
|
+
};
|
|
1340
|
+
} else {
|
|
1341
|
+
try {
|
|
1342
|
+
const { getPackageMetadata } = require('../scanner/npm-registry.js');
|
|
1343
|
+
triageMeta = await getPackageMetadata(item.name);
|
|
1344
|
+
} catch { /* metadata unavailable → triageRisk will see null and pick 'full' */ }
|
|
1345
|
+
}
|
|
1346
|
+
} else if (item.ecosystem === 'pypi') {
|
|
1347
|
+
triageMeta = item._pypiInfo || null;
|
|
1348
|
+
}
|
|
1349
|
+
const triage = triageRisk(item, triageMeta);
|
|
1350
|
+
item.scanMode = triage.mode;
|
|
1351
|
+
stats.triageQuick = (stats.triageQuick || 0) + (triage.mode === 'quick' ? 1 : 0);
|
|
1352
|
+
stats.triageFull = (stats.triageFull || 0) + (triage.mode === 'full' ? 1 : 0);
|
|
1353
|
+
console.log(`[TRIAGE] ${item.name}@${item.version || '?'}: mode=${triage.mode} reasons=[${triage.reasons.join(',') || 'none'}]`);
|
|
1354
|
+
if (triageMode === 'enforce') effectiveScanMode = triage.mode;
|
|
1355
|
+
}
|
|
1356
|
+
|
|
1268
1357
|
const scanResult = await scanPackage(item.name, item.version, item.ecosystem, item.tarballUrl, {
|
|
1269
1358
|
unpackedSize: item.unpackedSize || 0,
|
|
1270
1359
|
registryScripts: item.registryScripts || null,
|
|
1271
1360
|
_cacheTrigger: item._cacheTrigger || null,
|
|
1272
|
-
fastTrack: item.fastTrack || false
|
|
1361
|
+
fastTrack: item.fastTrack || false,
|
|
1362
|
+
scanMode: effectiveScanMode
|
|
1273
1363
|
}, stats, dailyAlerts, recentlyScanned, downloadsCache, scanQueue, sandboxAvailable);
|
|
1274
1364
|
const sandboxResult = scanResult && scanResult.sandboxResult;
|
|
1275
1365
|
const staticClean = scanResult && scanResult.staticClean;
|
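The interaction between `MUADDIB_TRIAGE_MODE`, `fastTrack`, and the triage decision reduces to a three-input function. The helper below is a hypothetical condensation (the name `resolveScanMode` does not appear in the package) that returns the value the hunk assigns to `effectiveScanMode`.

```javascript
// Hypothetical helper condensing the effectiveScanMode selection.
// envMode: raw value of MUADDIB_TRIAGE_MODE (may be undefined)
// fastTrack: whether the item was fast-track elected
// triageDecision: 'full' or 'quick', as returned in triage.mode
function resolveScanMode(envMode, fastTrack, triageDecision) {
  const triageMode = (envMode || 'off').toLowerCase();
  // Triage is skipped entirely when off or when the item is fast-tracked.
  if (triageMode === 'off' || fastTrack) return 'full';
  // Shadow mode (and any other value) computes the decision but still runs full.
  return triageMode === 'enforce' ? triageDecision : 'full';
}

console.log(resolveScanMode(undefined, false, 'quick')); // unset env
console.log(resolveScanMode('shadow', false, 'quick'));  // observe only
console.log(resolveScanMode('ENFORCE', false, 'quick')); // case-insensitive
console.log(resolveScanMode('enforce', true, 'quick'));  // fast-track wins
```

Only `enforce` ever yields `quick`, so a mistyped mode value degrades to the full pipeline rather than to a reduced scan.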
|
@@ -1367,6 +1457,8 @@ module.exports = {
|
|
|
1367
1457
|
LARGE_PACKAGE_SIZE,
|
|
1368
1458
|
FIRST_PUBLISH_SANDBOX_MAX_QUEUE,
|
|
1369
1459
|
FIRST_PUBLISH_SANDBOX_ENABLED,
|
|
1460
|
+
SANDBOX_SCORE_THRESHOLD,
|
|
1461
|
+
computeSandboxScoreThreshold,
|
|
1370
1462
|
KNOWN_BUNDLED_FILES,
|
|
1371
1463
|
KNOWN_BUNDLED_PATHS,
|
|
1372
1464
|
ML_EXCLUDED_DIRS,
|
package/src/monitor/webhook.js
CHANGED
|
@@ -304,6 +304,72 @@ function computeReputationFactor(metadata) {
|
|
|
304
304
|
return Math.max(0.10, Math.min(1.5, factor));
|
|
305
305
|
}
|
|
306
306
|
|
|
307
|
+
/**
|
|
308
|
+
* True if the package declares an install-time lifecycle script that executes
|
|
309
|
+
* code on `npm install`. These hooks are the principal vehicle for malicious
|
|
310
|
+
* payloads (preinstall / postinstall / install). PyPI's setup.py equivalent is
|
|
311
|
+
* handled separately via `meta.has_setup_py` in triageRisk.
|
|
312
|
+
*
|
|
313
|
+
* Reads from both `item.registryScripts` (set by changes-stream docMeta when
|
|
314
|
+
* available) and `item._npmInfo.scripts` (set by Stage 1's preResolveNpmBatch).
|
|
315
|
+
*
|
|
316
|
+
* @param {Object} item - queue item
|
|
317
|
+
* @returns {boolean}
|
|
318
|
+
*/
|
|
319
|
+
function hasDangerousLifecycle(item) {
|
|
320
|
+
if (!item) return false;
|
|
321
|
+
const direct = item.registryScripts;
|
|
322
|
+
if (direct && (direct.preinstall || direct.postinstall || direct.install)) return true;
|
|
323
|
+
const stashed = item._npmInfo && item._npmInfo.scripts;
|
|
324
|
+
if (stashed && (stashed.preinstall || stashed.postinstall || stashed.install)) return true;
|
|
325
|
+
return false;
|
|
326
|
+
}
|
|
327
|
+
|
|
328
|
+
/**
|
|
329
|
+
* Pass A triage: choose between full pipeline (20 scanners) and quick_scan
|
|
330
|
+
* subset for a queued package. Default is `quick`; any suspect signal flips
|
|
331
|
+
* to `full`. Used by the monitor only — CLI scans default to full elsewhere.
|
|
332
|
+
*
|
|
333
|
+
* Tiers (any reason → full):
|
|
334
|
+
* T0 IOC match / ATO signal / install-time lifecycle → known or high-prob threat
|
|
335
|
+
* T1 No registry metadata available → cannot establish trust, default safe
|
|
336
|
+
* T2 (npm) computeReputationFactor(meta) >= 1.0 → composite signal of new /
|
|
337
|
+
* low-download / few-versions package, subsumes individual checks
|
|
338
|
+
* T3 (PyPI) direct age < 30d or version_count < 5 → PyPI has no download
|
|
339
|
+
* stats, so we cannot reuse the npm composite; use the direct fields the
|
|
340
|
+
* PyPI JSON API exposes.
|
|
341
|
+
*
|
|
342
|
+
* Returning the reasons list (not just the mode) makes shadow-mode logs
|
|
343
|
+
* actionable for tuning.
|
|
344
|
+
*
|
|
345
|
+
* @param {Object} item - queue item
|
|
346
|
+
* @param {Object|null} meta - registry metadata {age_days, version_count, weekly_downloads, has_setup_py?}
|
|
347
|
+
* @returns {{mode: 'full'|'quick', reasons: string[]}}
|
|
348
|
+
*/
|
|
349
|
+
function triageRisk(item, meta) {
|
|
350
|
+
const reasons = [];
|
|
351
|
+
const ecosystem = (item && item.ecosystem) || null;
|
|
352
|
+
|
|
353
|
+
if (item && item.isIOCMatch) reasons.push('ioc_match');
|
|
354
|
+
if (item && item.atoSignal) reasons.push('ato_signal');
|
|
355
|
+
if (hasDangerousLifecycle(item)) reasons.push('lifecycle_scripts');
|
|
356
|
+
|
|
357
|
+
if (!meta) {
|
|
358
|
+
reasons.push('no_metadata');
|
|
359
|
+
} else if (ecosystem === 'npm') {
|
|
360
|
+
const factor = computeReputationFactor(meta);
|
|
361
|
+
if (factor >= 1.0) reasons.push(`reputation_factor=${factor.toFixed(2)}`);
|
|
362
|
+
} else if (ecosystem === 'pypi') {
|
|
363
|
+
// PyPI has no weekly_downloads source today, so we cannot reuse
|
|
364
|
+
// computeReputationFactor as-is. Use direct signals instead.
|
|
365
|
+
if ((meta.age_days || 0) < 30) reasons.push('pypi_age<30d');
|
|
366
|
+
if ((meta.version_count || 0) < 5) reasons.push('pypi_version_count<5');
|
|
367
|
+
if (meta.has_setup_py === true) reasons.push('pypi_setup_py');
|
|
368
|
+
}
|
|
369
|
+
|
|
370
|
+
return { mode: reasons.length ? 'full' : 'quick', reasons };
|
|
371
|
+
}
|
|
372
|
+
|
|
307
373
|
/**
|
|
308
374
|
* Persist a CRITICAL/HIGH alert to logs/alerts/YYYY-MM-DD-HH-mm-ss-<package>.json
|
|
309
375
|
* Same payload as webhook — enables offline FPR/TPR trend analysis.
|
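To see how the triage reasons accumulate, the sketch below reproduces `hasDangerousLifecycle` and the body of `triageRisk` from the hunk above with one deliberate change: the body of `computeReputationFactor` is not part of this hunk, so the sketch takes an injected `reputationFactor` function instead. That parameter, the name `triageRiskSketch`, and the sample items are assumptions made for illustration.

```javascript
function hasDangerousLifecycle(item) {
  if (!item) return false;
  const direct = item.registryScripts;
  if (direct && (direct.preinstall || direct.postinstall || direct.install)) return true;
  const stashed = item._npmInfo && item._npmInfo.scripts;
  if (stashed && (stashed.preinstall || stashed.postinstall || stashed.install)) return true;
  return false;
}

// reputationFactor is a hypothetical stand-in for computeReputationFactor.
function triageRiskSketch(item, meta, reputationFactor = () => 0.5) {
  const reasons = [];
  const ecosystem = (item && item.ecosystem) || null;

  if (item && item.isIOCMatch) reasons.push('ioc_match');
  if (item && item.atoSignal) reasons.push('ato_signal');
  if (hasDangerousLifecycle(item)) reasons.push('lifecycle_scripts');

  if (!meta) {
    reasons.push('no_metadata');
  } else if (ecosystem === 'npm') {
    const factor = reputationFactor(meta);
    if (factor >= 1.0) reasons.push(`reputation_factor=${factor.toFixed(2)}`);
  } else if (ecosystem === 'pypi') {
    if ((meta.age_days || 0) < 30) reasons.push('pypi_age<30d');
    if ((meta.version_count || 0) < 5) reasons.push('pypi_version_count<5');
    if (meta.has_setup_py === true) reasons.push('pypi_setup_py');
  }
  return { mode: reasons.length ? 'full' : 'quick', reasons };
}

// Hypothetical samples.
console.log(triageRiskSketch({ ecosystem: 'pypi' }, { age_days: 400, version_count: 12 }));
console.log(triageRiskSketch({ ecosystem: 'pypi' }, {}));
console.log(triageRiskSketch({ ecosystem: 'npm' }, { age_days: 2 }, () => 1.2));
```

One consequence of the `|| 0` defaults is visible in the second call: a PyPI metadata object with missing `age_days` or `version_count` is treated as brand new and routed to the full pipeline, which errs on the safe side.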
|
@@ -1237,6 +1303,8 @@ module.exports = {
|
|
|
1237
1303
|
computeRiskLevel,
|
|
1238
1304
|
computeRiskScore,
|
|
1239
1305
|
computeReputationFactor,
|
|
1306
|
+
hasDangerousLifecycle,
|
|
1307
|
+
triageRisk,
|
|
1240
1308
|
persistAlert,
|
|
1241
1309
|
persistDailyReport,
|
|
1242
1310
|
computeAlertPriority,
|
package/src/pipeline/executor.js
CHANGED
|
@@ -227,41 +227,80 @@ async function execute(targetPath, options, pythonDeps, warnings) {
|
|
|
227
227
|
'scanPythonAST'
|
|
228
228
|
];
|
|
229
229
|
|
|
230
|
+
// Stage 2 quick_scan subset (monitor-only, set via options.scanMode='quick'
|
|
231
|
+
// by queue.js when MUADDIB_TRIAGE_MODE=enforce). The subset keeps the heavy
|
|
232
|
+
// detectors that anchor TPR on the 96-sample GT (analyzeAST covers 70/96,
|
|
233
|
+
// analyzeDataFlow covers 31/96 — non-negotiable), the cheap high-signal
|
|
234
|
+
// lifecycle/IOC scanners, and the Python detectors (PyPI samples need them;
|
|
235
|
+
// npm exit immediately on a depth-1 readdir, so the cost is negligible).
|
|
236
|
+
// Excluded: scanAntiForensic (45s timeout, never the unique trigger on GT),
|
|
237
|
+
// scanHashes (cheap but GT samples are rebuilt — hashes drift), scanAIConfig,
|
|
238
|
+
// scanStubPackage, scanMonorepo, scanTrustedDepDiff (opt-in registry diff),
|
|
239
|
+
// checkPyPITyposquatting (subsumed by scanTyposquatting for npm; PyPI
|
|
240
|
+
// typosquats already get full via triage signals). CLI mode and shadow mode
|
|
241
|
+
// never set scanMode so the default branch runs all 20 scanners — fully
|
|
242
|
+
// backwards-compatible.
|
|
243
|
+
const QUICK_SCAN_ALLOWLIST = new Set([
|
|
244
|
+
'scanPackageJson',
|
|
245
|
+
'scanShellScripts',
|
|
246
|
+
'analyzeAST',
|
|
247
|
+
'detectObfuscation',
|
|
248
|
+
'scanDependencies',
|
|
249
|
+
'analyzeDataFlow',
|
|
250
|
+
'scanTyposquatting',
|
|
251
|
+
'scanGitHubActions',
|
|
252
|
+
'matchPythonIOCs',
|
|
253
|
+
'scanEntropy',
|
|
254
|
+
'scanIocStrings',
|
|
255
|
+
'scanPythonSource',
|
|
256
|
+
'scanPythonAST',
|
|
257
|
+
'scanAIConfig'
|
|
258
|
+
]);
|
|
259
|
+
const isQuick = options.scanMode === 'quick';
|
|
260
|
+
function ifEnabled(name, fn) {
|
|
261
|
+
if (isQuick && !QUICK_SCAN_ALLOWLIST.has(name)) return Promise.resolve([]);
|
|
262
|
+
return fn();
|
|
263
|
+
}
|
|
264
|
+
if (isQuick) {
|
|
265
|
+
const skipped = SCANNER_NAMES.filter(n => !QUICK_SCAN_ALLOWLIST.has(n));
|
|
266
|
+
debugLog(`[EXECUTOR] scanMode=quick — skipping ${skipped.length} scanners: ${skipped.join(', ')}`);
|
|
267
|
+
}
|
|
268
|
+
|
|
230
269
|
const settledResults = await Promise.allSettled([
|
|
231
|
-
yieldThen(() => scanPackageJson(targetPath)),
|
|
232
|
-
yieldThen(() => scanShellScripts(targetPath)),
|
|
233
|
-
withTimeout(() => analyzeAST(targetPath, { deobfuscate: deobfuscateFn }), 'analyzeAST'),
|
|
234
|
-
yieldThen(() => detectObfuscation(targetPath)),
|
|
235
|
-
yieldThen(() => scanDependencies(targetPath)),
|
|
236
|
-
yieldThen(() => scanHashes(targetPath)),
|
|
237
|
-
withTimeout(() => analyzeDataFlow(targetPath, { deobfuscate: deobfuscateFn }), 'analyzeDataFlow'),
|
|
238
|
-
yieldThen(() => scanTyposquatting(targetPath)),
|
|
239
|
-
yieldThen(() => scanGitHubActions(targetPath)),
|
|
240
|
-
yieldThen(() => matchPythonIOCs(pythonDeps, targetPath)),
|
|
241
|
-
yieldThen(() => checkPyPITyposquatting(pythonDeps, targetPath)),
|
|
242
|
-
withTimeout(() => scanEntropy(targetPath, { entropyThreshold: options.entropyThreshold || undefined }), 'scanEntropy'),
|
|
243
|
-
yieldThen(() => scanAIConfig(targetPath)),
|
|
244
|
-
yieldThen(() => scanIocStrings(targetPath)),
|
|
245
|
-
withTimeout(() => scanAntiForensic(targetPath), 'scanAntiForensic'),
|
|
246
|
-
yieldThen(() => scanStubPackage(targetPath)),
|
|
247
|
-
yieldThen(() => scanMonorepo(targetPath)),
|
|
270
|
+
ifEnabled('scanPackageJson', () => yieldThen(() => scanPackageJson(targetPath))),
|
|
271
|
+
ifEnabled('scanShellScripts', () => yieldThen(() => scanShellScripts(targetPath))),
|
|
272
|
+
ifEnabled('analyzeAST', () => withTimeout(() => analyzeAST(targetPath, { deobfuscate: deobfuscateFn }), 'analyzeAST')),
|
|
273
|
+
ifEnabled('detectObfuscation', () => yieldThen(() => detectObfuscation(targetPath))),
|
|
274
|
+
ifEnabled('scanDependencies', () => yieldThen(() => scanDependencies(targetPath))),
|
|
275
|
+
ifEnabled('scanHashes', () => yieldThen(() => scanHashes(targetPath))),
|
|
276
|
+
ifEnabled('analyzeDataFlow', () => withTimeout(() => analyzeDataFlow(targetPath, { deobfuscate: deobfuscateFn }), 'analyzeDataFlow')),
|
|
277
|
+
ifEnabled('scanTyposquatting', () => yieldThen(() => scanTyposquatting(targetPath))),
|
|
278
|
+
ifEnabled('scanGitHubActions', () => yieldThen(() => scanGitHubActions(targetPath))),
|
|
279
|
+
ifEnabled('matchPythonIOCs', () => yieldThen(() => matchPythonIOCs(pythonDeps, targetPath))),
|
|
280
|
+
ifEnabled('checkPyPITyposquatting', () => yieldThen(() => checkPyPITyposquatting(pythonDeps, targetPath))),
|
|
281
|
+
ifEnabled('scanEntropy', () => withTimeout(() => scanEntropy(targetPath, { entropyThreshold: options.entropyThreshold || undefined }), 'scanEntropy')),
|
|
282
|
+
ifEnabled('scanAIConfig', () => yieldThen(() => scanAIConfig(targetPath))),
|
|
283
|
+
ifEnabled('scanIocStrings', () => yieldThen(() => scanIocStrings(targetPath))),
|
|
284
|
+
ifEnabled('scanAntiForensic', () => withTimeout(() => scanAntiForensic(targetPath), 'scanAntiForensic')),
|
|
285
|
+
ifEnabled('scanStubPackage', () => yieldThen(() => scanStubPackage(targetPath))),
|
|
286
|
+
ifEnabled('scanMonorepo', () => yieldThen(() => scanMonorepo(targetPath))),
|
|
248
287
|
// Opt-in scanner — short-circuits to [] unless options.trustedDepDiff or
|
|
249
288
|
// options.monitorMode is set. CLI runs without flags pay no cost (no I/O).
|
|
250
289
|
// Wrapped in withTimeout as defense in depth: scanner has its own 10s + 5s × N
|
|
251
290
|
// internal timeouts, but a registry slowdown with many added deps could exceed
|
|
252
291
|
// the static-scan budget without this cap.
|
|
253
|
-
withTimeout(() => scanTrustedDepDiff(targetPath, options), 'scanTrustedDepDiff'),
|
|
292
|
+
ifEnabled('scanTrustedDepDiff', () => withTimeout(() => scanTrustedDepDiff(targetPath, options), 'scanTrustedDepDiff')),
|
|
254
293
|
// PYSRC-001..008 (v2.11.25, TrapDoor PyPI gap). Detect import-time RCE
|
|
255
294
|
// in __init__.py / setup.py / top-level .py files. Runs always — not gated
|
|
256
295
|
// on detectPythonProject() because an attacker can ship a malicious __init__.py
|
|
257
296
|
// without a requirements.txt. Walker is cheap (just a depth-1 readdir).
|
|
258
|
-
yieldThen(() => scanPythonSource(targetPath)),
|
|
297
|
+
ifEnabled('scanPythonSource', () => yieldThen(() => scanPythonSource(targetPath))),
|
|
259
298
|
// PYAST-001..008 (v2.11.42+, npm/PyPI parity Phase 1). Full Python CST
|
|
260
299
|
// analysis via tree-sitter-python WASM. Scope-aware module-level detection
|
|
261
300
|
// of cmdclass override, exec, subprocess shell=True, pickle.loads,
|
|
262
301
|
// __import__ dangerous, entry_points. Parser init happens at pre-analysis
|
|
263
302
|
// stage above; this call is sync from the caller's POV.
|
|
264
|
-
yieldThen(() => scanPythonAST(targetPath))
|
|
303
|
+
ifEnabled('scanPythonAST', () => yieldThen(() => scanPythonAST(targetPath)))
|
|
265
304
|
]);
|
|
266
305
|
|
|
267
306
|
// Extract results: use empty array for rejected scanners, log errors
|
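The `ifEnabled` wrapper gates each scanner by name while leaving the `Promise.allSettled` array shape unchanged. The sketch below shows that pattern with a shortened allowlist and stand-in scanners; `makeIfEnabled`, `runScanners`, and the fake scanner factory are hypothetical names, and the two-entry allowlist is an assumption made to keep the example small.

```javascript
// Shortened allowlist for the sketch; the real set is listed in the hunk above.
const QUICK_SCAN_ALLOWLIST = new Set(['scanPackageJson', 'analyzeAST']);

// Returns an ifEnabled function bound to a scan mode.
function makeIfEnabled(scanMode) {
  const isQuick = scanMode === 'quick';
  return function ifEnabled(name, fn) {
    // Skipped scanners resolve to an empty finding list without calling fn.
    if (isQuick && !QUICK_SCAN_ALLOWLIST.has(name)) return Promise.resolve([]);
    return fn();
  };
}

// Hypothetical driver with stand-in scanners that record when they run.
async function runScanners(scanMode) {
  const ifEnabled = makeIfEnabled(scanMode);
  const calls = [];
  const fake = (name) => async () => { calls.push(name); return [{ type: name }]; };
  const settled = await Promise.allSettled([
    ifEnabled('scanPackageJson', fake('scanPackageJson')),
    ifEnabled('analyzeAST', fake('analyzeAST')),
    ifEnabled('scanAntiForensic', fake('scanAntiForensic'))
  ]);
  const findings = settled.map((r) => (r.status === 'fulfilled' ? r.value : []));
  return { calls, findings };
}

runScanners('quick').then(({ calls, findings }) => {
  console.log(calls, findings.map((f) => f.length));
});
```

Because a skipped scanner still occupies its slot with an empty array, any downstream code that reads results by position should keep working in quick mode. Passing a thunk rather than a promise is what prevents the skipped scanner from starting at all.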
package/src/response/playbooks.js
CHANGED
|
@@ -86,6 +86,15 @@ const PLAYBOOKS = {
|
|
|
86
86
|
detached_process:
|
|
87
87
|
'spawn/fork with {detached: true} detected. The child process survives the end of npm install and runs the payload in the background. Check running processes: ps aux | grep node. Kill the suspicious process.',
|
|
88
88
|
|
|
89
|
+
linux_fingerprint_exec:
|
|
90
|
+
'execSync/spawn of a Linux reconnaissance command (id, uname, lsb_release, hostname, whoami). On its own, this may be legitimate telemetry. Combined with a network send, it is fingerprinting for C2 grouping: check the context (compound recon_exfil_direct_ip if a public IP literal is present in the same file).',
|
|
91
|
+
|
|
92
|
+
direct_ip_exfil:
|
|
93
|
+
'C2 endpoint hardcoded as a public IPv4 literal (bypasses DNS resolution). Check the file that contains the IP: if combined with linux_fingerprint_exec or credential_regex_harvest, it is very likely an attacker C2. Geolocate the IP and cross-check it against threat intel.',
|
|
94
|
+
|
|
95
|
+
recon_exfil_direct_ip:
|
|
96
|
+
'CRITICAL: Linux system fingerprint (id/uname/lsb_release/hostname/whoami) + exfil to a public IPv4 literal in the same file. Targeted C2 grouping pattern (marginfi campaign, May 2026; design-system-coopeuch). Isolate the machine, block the IP at the firewall, and capture outbound traffic for forensics.',
|
|
97
|
+
|
|
89
98
|
known_malicious_package:
|
|
90
99
|
'CRITICAL: Remove immediately. rm -rf node_modules && npm cache clean --force && npm install',
|
|
91
100
|
|
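The `direct_ip_exfil` playbook entry above hinges on what counts as a public IPv4 literal. The rule description for MUADDIB-AST-094 in `rules/index.js` names the skipped ranges (127/8, 169.254/16, 10/8, 172.16/12, 192.168/16) and states that RFC 5737 documentation addresses are still flagged. The following is a minimal sketch of such a classifier built only from that list; the function name `isFlaggedIPv4` is hypothetical and the detector's actual implementation is not shown in this section, so it may differ.

```javascript
// Hypothetical classifier based only on the ranges named in the rule text.
// Returns true when a dotted-quad literal falls outside every skipped range.
function isFlaggedIPv4(ip) {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(ip);
  if (!match) return false; // not a dotted-quad literal
  const octets = match.slice(1).map(Number);
  if (octets.some((o) => o > 255)) return false; // not a valid IPv4 address
  const [a, b] = octets;
  if (a === 127) return false;                       // 127/8 localhost
  if (a === 169 && b === 254) return false;          // 169.254/16 link-local, incl. IMDS
  if (a === 10) return false;                        // 10/8 private
  if (a === 172 && b >= 16 && b <= 31) return false; // 172.16/12 private
  if (a === 192 && b === 168) return false;          // 192.168/16 private
  return true; // everything else, including RFC 5737 documentation ranges
}

for (const ip of ['127.0.0.1', '169.254.169.254', '172.31.0.9', '172.32.0.9', '192.0.2.10']) {
  console.log(ip, isFlaggedIPv4(ip));
}
```

The sketch treats any range not named in the rule text as flagged, which is an assumption; other reserved ranges may need handling that this section does not describe.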
package/src/rules/index.js
CHANGED
|
@@ -783,6 +783,19 @@ const RULES = {
|
|
|
783
783
|
references: ['https://attack.mitre.org/techniques/T1195/002/'],
|
|
784
784
|
mitre: 'T1195.002'
|
|
785
785
|
},
|
|
786
|
+
recon_exfil_direct_ip: {
|
|
787
|
+
id: 'MUADDIB-COMPOUND-016',
|
|
788
|
+
name: 'Linux Fingerprint + Direct-IP Exfil',
|
|
789
|
+
severity: 'CRITICAL',
|
|
790
|
+
confidence: 'high',
|
|
791
|
+
domain: 'malware',
|
|
792
|
+
description: 'execSync(id|uname|lsb_release|hostname|whoami) + http/https to a public IPv4 literal in the same file: device fingerprinting for targeted C2 grouping. Pattern observed in the marginfi campaign (May 2026) and the design-system-coopeuch reconstruction. Track D: closes the gap surfaced by GT-095.',
|
|
793
|
+
references: [
|
|
794
|
+
'https://attack.mitre.org/techniques/T1082/',
|
|
795
|
+
'https://attack.mitre.org/techniques/T1041/'
|
|
796
|
+
],
|
|
797
|
+
mitre: 'T1082'
|
|
798
|
+
},
|
|
786
799
|
|
|
787
800
|
// Package.json script patterns
|
|
788
801
|
curl_pipe_sh: {
|
|
@@ -1113,6 +1126,33 @@ const RULES = {
|
|
|
1113
1126
|
],
|
|
1114
1127
|
mitre: 'T1564'
|
|
1115
1128
|
},
|
|
1129
|
+
linux_fingerprint_exec: {
|
|
1130
|
+
id: 'MUADDIB-AST-093',
|
|
1131
|
+
name: 'Linux System Reconnaissance Exec',
|
|
1132
|
+
severity: 'HIGH',
|
|
1133
|
+
confidence: 'high',
|
|
1134
|
+
domain: 'malware',
|
|
1135
|
+
description: 'execSync/exec/spawn of a Linux reconnaissance command (id, uname, lsb_release, hostname, whoami). Pattern observed in direct-IP-exfil MALWARE (marginfi cluster, design-system-coopeuch) that collects a device fingerprint before C2 exfil. HIGH on its own (telemetry SDKs may legitimately call hostname); escalates to CRITICAL as a compound with direct_ip_exfil in the same file.',
|
|
1136
|
+
references: [
|
|
1137
|
+
'https://attack.mitre.org/techniques/T1082/',
|
|
1138
|
+
'https://attack.mitre.org/techniques/T1592/'
|
|
1139
|
+
],
|
|
1140
|
+
mitre: 'T1082'
|
|
1141
|
+
},
|
|
1142
|
+
direct_ip_exfil: {
|
|
1143
|
+
id: 'MUADDIB-AST-094',
|
|
1144
|
+
name: 'Direct IP Exfiltration Endpoint',
|
|
1145
|
+
severity: 'HIGH',
|
|
1146
|
+
confidence: 'high',
|
|
1147
|
+
domain: 'malware',
|
|
1148
|
+
description: 'Public IPv4 literal used as a C2 endpoint (URL http://1.2.3.4:port/path or bare IP in a host:/hostname: option). Bypassing DNS resolution is a targeted-attack pattern. Skipped ranges: 127/8 (localhost), 169.254/16 (link-local incl. IMDS), 10/8 + 172.16/12 + 192.168/16 (RFC 1918 private). RFC 5737 documentation ranges are flagged (no legitimate runtime use).',
|
|
1149
|
+
references: [
|
|
1150
|
+
'https://attack.mitre.org/techniques/T1071/001/',
|
|
1151
|
+
'https://attack.mitre.org/techniques/T1041/',
|
|
1152
|
+
'https://datatracker.ietf.org/doc/html/rfc5737'
|
|
1153
|
+
],
|
|
1154
|
+
mitre: 'T1041'
|
|
1155
|
+
},
|
|
1116
1156
|
dangerous_call_function: {
|
|
1117
1157
|
id: 'MUADDIB-AST-005',
|
|
1118
1158
|
name: 'new Function() Constructor',
|
|
package/src/scanner/ast-detectors/handle-call-expression.js
CHANGED
|
@@ -349,6 +349,21 @@ function handleCallExpression(node, ctx) {
|
|
|
349
349
|
file: ctx.relFile
|
|
350
350
|
});
|
|
351
351
|
}
|
|
352
|
+
|
|
353
|
+
// AST-NNN: linux_fingerprint_exec (Track D, v2.11.48+) — recon command
|
|
354
|
+
// pattern observed on direct-IP-exfil malware (marginfi cluster, GT-095
|
|
355
|
+
// design-system-coopeuch). HIGH alone (telemetry SDKs may legitimately
|
|
356
|
+
// call hostname); CRITICAL when compounded with direct_ip_exfil in the
|
|
357
|
+
// same file (`recon_exfil_direct_ip` in SCORING_COMPOUNDS).
|
|
358
|
+
if (/^\s*(id|uname|lsb_release|hostname|whoami)(\s|$)/.test(cmdStr)) {
|
|
359
|
+
const firstTok = cmdStr.trim().split(/\s+/)[0];
|
|
360
|
+
ctx.threats.push({
|
|
361
|
+
type: 'linux_fingerprint_exec',
|
|
362
|
+
severity: 'HIGH',
|
|
363
|
+
message: `${execName || memberExec}("${cmdStr.slice(0, 60)}") — Linux system reconnaissance (${firstTok}) used for device fingerprinting / C2 grouping.`,
|
|
364
|
+
file: ctx.relFile
|
|
365
|
+
});
|
|
366
|
+
}
|
|
352
367
|
}
|
|
353
368
|
}
|
|
354
369
|
|
|
@@ -424,7 +439,7 @@ function handleCallExpression(node, ctx) {
|
|
|
424
439
|
}
|
|
425
440
|
|
|
426
441
|
// Detect spawn/execFile of shell processes
|
|
427
|
-
if ((callName === 'spawn' || callName === 'execFile') && node.arguments.length >= 1) {
|
|
442
|
+
if ((callName === 'spawn' || callName === 'execFile' || callName === 'spawnSync' || callName === 'execFileSync') && node.arguments.length >= 1) {
|
|
428
443
|
const shellArg = node.arguments[0];
|
|
429
444
|
if (shellArg.type === 'Literal' && typeof shellArg.value === 'string') {
|
|
430
445
|
const shellBin = shellArg.value.toLowerCase();
|
|
@@ -436,6 +451,16 @@ function handleCallExpression(node, ctx) {
|
|
|
436
451
|
file: ctx.relFile
|
|
437
452
|
});
|
|
438
453
|
}
|
|
454
|
+
// AST-NNN: linux_fingerprint_exec (Track D, v2.11.48+) — spawn form,
|
|
455
|
+
// first arg is the bare command (e.g. `spawn('uname', ['-a'])`).
|
|
456
|
+
if (['id', 'uname', 'lsb_release', 'hostname', 'whoami'].includes(shellBin)) {
|
|
457
|
+
ctx.threats.push({
|
|
458
|
+
type: 'linux_fingerprint_exec',
|
|
459
|
+
severity: 'HIGH',
|
|
460
|
+
message: `${callName}('${shellArg.value}', ...) — Linux system reconnaissance (${shellBin}) used for device fingerprinting / C2 grouping.`,
|
|
461
|
+
file: ctx.relFile
|
|
462
|
+
});
|
|
463
|
+
}
|
|
439
464
|
}
|
|
440
465
|
// Also check when shell is computed via os.platform() ternary
|
|
441
466
|
if (shellArg.type === 'ConditionalExpression') {
|
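The two `linux_fingerprint_exec` checks added in this file cover a shell-string form (`execSync('uname -a')`) and a bare-binary form (`spawn('uname', ['-a'])`). The following is a minimal standalone sketch of that matching logic, reusing the regex and command list from the hunks above. The function names `reconFromShellString` and `reconFromSpawnBinary` are hypothetical and do not exist in the package.

```javascript
// Illustrative sketch only: helper names are assumptions, not the scanner's API.
// The command list and regex are copied from the hunks above.
const RECON_COMMANDS = ['id', 'uname', 'lsb_release', 'hostname', 'whoami'];
const RECON_SHELL_RE = /^\s*(id|uname|lsb_release|hostname|whoami)(\s|$)/;

// Shell-string form: the command must be the first token of the string.
function reconFromShellString(cmdStr) {
  if (typeof cmdStr !== 'string') return null;
  const match = RECON_SHELL_RE.exec(cmdStr);
  return match ? match[1] : null;
}

// Bare-binary form: the first argument must equal a recon command exactly
// (compared case-insensitively, as the hunk lowercases the literal first).
function reconFromSpawnBinary(bin) {
  if (typeof bin !== 'string') return null;
  const lowered = bin.toLowerCase();
  return RECON_COMMANDS.includes(lowered) ? lowered : null;
}
```

Under this sketch, `'identify photo.png'` does not match because `id` must be followed by whitespace or end of string, and an absolute path such as `'/usr/bin/id'` matches neither form.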
|
package/src/scanner/ast-detectors/handle-literal.js
CHANGED
|
@@ -73,6 +73,43 @@ function handleLiteral(node, ctx) {
|
|
|
73
73
|
}
|
|
74
74
|
}
|
|
75
75
|
|
|
76
|
+
// AST-NNN: direct_ip_exfil (Track D, v2.11.48+) — IPv4 literal used as
|
|
77
|
+
// C2 endpoint (URL form `http://1.2.3.4:port/path` OR bare IP literal
|
|
78
|
+
// outside the safe ranges). Pattern observed on marginfi cluster
|
|
79
|
+
// (72.62.71.201), design-system-coopeuch GT-095 (direct IP exfil, no
|
|
80
|
+
// OAST cover), and similar manual-review MALWARE. HIGH alone — combined
|
|
81
|
+
// with linux_fingerprint_exec in the same file, escalates to CRITICAL
|
|
82
|
+
// via `recon_exfil_direct_ip` compound.
|
|
83
|
+
//
|
|
84
|
+
// Safe ranges (skipped, no fire):
|
|
85
|
+
// 0.0.0.0 bind-all / server listen address (fastify/express default)
|
|
86
|
+
// 127.0.0.0/8 localhost
|
|
87
|
+
// 169.254.0.0/16 link-local (incl. cloud IMDS — separate rules cover abuse)
|
|
88
|
+
// 10.0.0.0/8 RFC 1918 private
|
|
89
|
+
// 172.16.0.0/12 RFC 1918 private
|
|
90
|
+
// 192.168.0.0/16 RFC 1918 private
|
|
91
|
+
// 255.255.255.255 broadcast
|
|
92
|
+
// RFC 5737 documentation ranges (192.0.2.x, 198.51.100.x, 203.0.113.x)
|
|
93
|
+
// are intentionally flagged — no legitimate runtime use, lets our GT
|
|
94
|
+
// reconstruction fixtures exercise the rule.
|
|
95
|
+
const IP_SAFE_RE = /^(0\.0\.0\.0$|127\.|10\.|192\.168\.|169\.254\.|172\.(1[6-9]|2[0-9]|3[01])\.|255\.255\.255\.255$)/;
|
|
96
|
+
const urlIpMatch = node.value.match(/^https?:\/\/((?:\d{1,3}\.){3}\d{1,3})(?::\d+)?(?:\/|$)/);
|
|
97
|
+
const bareIpMatch = node.value.match(/^((?:\d{1,3}\.){3}\d{1,3})$/);
|
|
98
|
+
const candidateIp = (urlIpMatch && urlIpMatch[1]) || (bareIpMatch && bareIpMatch[1]) || null;
|
|
99
|
+
if (candidateIp && !IP_SAFE_RE.test(candidateIp)) {
|
|
100
|
+
// Validate each octet ≤ 255 to avoid matching '999.999.999.999' style noise
|
|
101
|
+
const octets = candidateIp.split('.').map(n => parseInt(n, 10));
|
|
102
|
+
if (octets.every(o => o >= 0 && o <= 255)) {
|
|
103
|
+
const form = urlIpMatch ? 'URL' : 'bare IPv4 literal';
|
|
104
|
+
ctx.threats.push({
|
|
105
|
+
type: 'direct_ip_exfil',
|
|
106
|
+
severity: 'HIGH',
|
|
107
|
+
message: `Hardcoded ${form} ${candidateIp} — direct-IP exfil endpoint (no DNS, no OAST cover). Classic C2 / dep-confusion pattern.`,
|
|
108
|
+
file: ctx.relFile
|
|
109
|
+
});
|
|
110
|
+
}
|
|
111
|
+
}
|
|
112
|
+
|
|
76
113
|
// Ollama LLM local: polymorphic engine indicator (PhantomRaven Wave 4)
|
|
77
114
|
// Port 11434 is Ollama's default port. Legitimate packages don't call local LLMs.
|
|
78
115
|
if (/(?:localhost|127\.0\.0\.1):11434/.test(node.value)) {
|
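The `direct_ip_exfil` check above can be read as a three-step filter: extract an IPv4 candidate from a URL or bare literal, drop it if it falls in a skipped range, then validate the octets. The following is a minimal standalone sketch of that filter using the same three regexes as the hunk. The function name `classifyIpLiteral` and its return shape are hypothetical, chosen for illustration.

```javascript
// Illustrative sketch only: classifyIpLiteral is an assumed name, not part of the package.
// The regexes are copied from the handle-literal.js hunk above.
const IP_SAFE_RE = /^(0\.0\.0\.0$|127\.|10\.|192\.168\.|169\.254\.|172\.(1[6-9]|2[0-9]|3[01])\.|255\.255\.255\.255$)/;
const URL_IP_RE = /^https?:\/\/((?:\d{1,3}\.){3}\d{1,3})(?::\d+)?(?:\/|$)/;
const BARE_IP_RE = /^((?:\d{1,3}\.){3}\d{1,3})$/;

// Returns { ip, form } for a literal that would be flagged, or null otherwise.
function classifyIpLiteral(value) {
  if (typeof value !== 'string') return null;
  const urlMatch = value.match(URL_IP_RE);
  const bareMatch = value.match(BARE_IP_RE);
  const ip = (urlMatch && urlMatch[1]) || (bareMatch && bareMatch[1]) || null;
  // Skip loopback, private, link-local, bind-all and broadcast addresses.
  if (!ip || IP_SAFE_RE.test(ip)) return null;
  // Reject candidates such as 999.1.1.1 whose octets exceed 255.
  const octets = ip.split('.').map((n) => parseInt(n, 10));
  if (!octets.every((o) => o >= 0 && o <= 255)) return null;
  return { ip, form: urlMatch ? 'URL' : 'bare IPv4 literal' };
}
```

In this sketch `172.32.0.1` is reported because only `172.16` through `172.31` are private, and the RFC 5737 documentation address `203.0.113.7` is reported as well, consistent with the comment in the hunk.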
package/src/scoring.js
CHANGED
|
@@ -654,6 +654,20 @@ const SCORING_COMPOUNDS = [
|
|
|
654
654
|
fileFrom: 'function_constructor_require',
|
|
655
655
|
sameFile: true
|
|
656
656
|
},
|
|
657
|
+
// Track D (v2.11.48+) — recon_exfil_direct_ip. Closes GT-095 gap
|
|
658
|
+
// (design-system-coopeuch reconstruction scoring 3 alone, MALWARE per
|
|
659
|
+
// in-house review). Pattern: execSync(id|uname|lsb_release|hostname|whoami)
|
|
660
|
+
// + http(s) call to a direct IPv4 literal (no DNS, no OAST). Same file
|
|
661
|
+
// gates this to attacker-targeted device fingerprinting; legit telemetry
|
|
662
|
+
// SDKs talk to named endpoints and never co-occur with bare-IP exfil.
|
|
663
|
+
{
|
|
664
|
+
type: 'recon_exfil_direct_ip',
|
|
665
|
+
requires: ['linux_fingerprint_exec', 'direct_ip_exfil'],
|
|
666
|
+
severity: 'CRITICAL',
|
|
667
|
+
message: 'Linux system fingerprint (id/uname/lsb_release/hostname/whoami) + direct-IP exfil in same file — targeted device fingerprinting for C2 grouping (scoring compound).',
|
|
668
|
+
fileFrom: 'direct_ip_exfil',
|
|
669
|
+
sameFile: true
|
|
670
|
+
},
|
|
657
671
|
];
|
|
658
672
|
|
|
659
673
|
// v2.11.11: Extract static require/import targets from a JS file (1 level).
|
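The diff shows only the `recon_exfil_direct_ip` entry, not the code that evaluates `SCORING_COMPOUNDS`. The following sketch shows one plausible reading of the `requires`, `fileFrom` and `sameFile` fields: every required threat type must be present, and when `sameFile` is set they must all occur in the file of a `fileFrom` threat. The function `applyCompound` is hypothetical, and the actual evaluation logic in `scoring.js` may differ.

```javascript
// Illustrative sketch only: applyCompound is an assumed helper, not the package's implementation.
// The compound entry mirrors the fields added in the hunk above (message shortened).
const RECON_COMPOUND = {
  type: 'recon_exfil_direct_ip',
  requires: ['linux_fingerprint_exec', 'direct_ip_exfil'],
  severity: 'CRITICAL',
  message: 'Linux system fingerprint + direct-IP exfil in same file',
  fileFrom: 'direct_ip_exfil',
  sameFile: true
};

// Returns a synthesized compound threat, or null when the rule does not fire.
function applyCompound(threats, compound) {
  // Group threats by type for quick lookup.
  const byType = new Map();
  for (const threat of threats) {
    if (!byType.has(threat.type)) byType.set(threat.type, []);
    byType.get(threat.type).push(threat);
  }
  // Every required type must appear at least once.
  if (!compound.requires.every((type) => byType.has(type))) return null;
  // Each fileFrom threat is a candidate anchor for the compound's file.
  for (const anchor of byType.get(compound.fileFrom) || []) {
    const coLocated = !compound.sameFile || compound.requires.every((type) =>
      byType.get(type).some((threat) => threat.file === anchor.file));
    if (coLocated) {
      return {
        type: compound.type,
        severity: compound.severity,
        message: compound.message,
        file: anchor.file
      };
    }
  }
  return null;
}
```

With this reading, a `hostname` call in `lib/telemetry.js` and a hardcoded IP in `test/fixture.js` stay two separate HIGH findings, while both in `install.js` yield one CRITICAL compound.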