@fanboynz/network-scanner 3.1.0 → 3.2.0
- package/CHANGELOG.md +76 -0
- package/CLAUDE.md +2 -1
- package/README.md +33 -5
- package/eslint.config.mjs +13 -1
- package/lib/browserhealth.js +28 -94
- package/lib/dns.js +238 -0
- package/lib/domain-cache.js +14 -127
- package/lib/fingerprint.js +220 -97
- package/lib/fingerprint.md +94 -0
- package/lib/ghost-cursor.js +29 -11
- package/lib/interaction.js +4 -0
- package/lib/nettools.js +154 -51
- package/lib/output.js +24 -13
- package/lib/proxy.js +6 -2
- package/lib/redirect.js +4 -1
- package/lib/smart-cache.js +9 -1
- package/lib/socks-relay.js +14 -9
- package/lib/validate_rules.js +16 -1
- package/nwss.1 +76 -15
- package/nwss.js +389 -113
- package/package.json +1 -1
@@ -0,0 +1,94 @@
+# `lib/fingerprint.js` — Fingerprint Spoofing Coverage
+
+Bot-detection evasion for the scanner's headless Chromium. The goal is to make a
+scanned page see a coherent, real-Chrome **Stable** desktop profile rather than a
+headless/automation signature — and, just as important, to keep every spoofed
+value **internally consistent** (JS ↔ HTTP, claimed-value ↔ observable reality)
+so a detector cross-checking two surfaces can't catch a mismatch.
+
+## How it works
+
+Spoofing is applied per page, before navigation, by `applyAllFingerprintSpoofing(page, siteConfig, …)`, which runs three stages:
+
+| Stage | Gate (siteConfig) | What it covers |
+|---|---|---|
+| `applyUserAgentSpoofing` | **`userAgent`** (defaults to `"chrome"`) | Browser identity, automation/headless tells, and the bulk of the navigator/JS-API suite |
+| `applyBraveSpoofing` | Brave-mode only | Brave-specific surfaces |
+| `applyFingerprintProtection` | **`fingerprint_protection`** (`true` \| `"random"`) | Hardware fingerprint *values* (canvas/WebGL/audio noise, screen, memory) + CDP timezone. `"random"` seeds them per-domain (stable per site, varies across sites) |
+
+HTTP **Client Hints** request headers are set separately in `nwss.js` (gated on a `chrome` userAgent). Identity is pinned to **Stable Chrome** via two constants in `fingerprint.js` (`CHROME_BUILD`, `CHROME_GREASE_BRAND`) + the major in `USER_AGENT_COLLECTIONS` — see `feedback_chrome_spoof_version_bump`.
+
+**Gate legend:** `UA` = runs with `userAgent` set (on by default) · `FP` = runs with `fingerprint_protection` · `HTTP` = request header set in nwss.js.
+
+## Browser identity
+
+| Surface | Mitigation | Gate |
+|---|---|---|
+| `navigator.userAgent` / `appVersion` | Pinned to Stable Chrome 148 desktop UA | UA |
+| `navigator.userAgentData` (brands, platform, mobile) | Spoofed; brand order + GREASE string match real Chrome of the major exactly | UA |
+| `getHighEntropyValues()` | Full set: architecture, bitness, model, **wow64**, platformVersion, **uaFullVersion**, fullVersionList, **formFactors** — build from `CHROME_BUILD`, consistent with HTTP | UA |
+| `navigator.platform` / `vendor` / `productSub` / `vendorSub` | Spoofed UA-consistent (`Win32`, `Google Inc.`, `20030107`, `""`) | UA |
+| `Sec-CH-UA`, `-Platform`, `-Platform-Version`, `-Mobile`, `-Arch`, `-Bitness`, `-WoW64`, `-Model`, `-Full-Version`, `-Full-Version-List`, `-Form-Factors` | Set to match the JS values (same brand order/grease/build) | HTTP |
+
+## Automation & headless tells
+
+| Surface | Mitigation | Gate |
+|---|---|---|
+| `navigator.webdriver` | Forced `false` (launch flag + JS) | UA |
+| `cdc_…` / `$cdc_…` / selenium / phantom props | Removed | UA |
+| `window.chrome` + `chrome.runtime` | Provided / simulated | UA |
+| `<html webdriver>` attribute | Stripped | UA |
+| `navigator.plugins` / `mimeTypes` | Native 5-PDF set preserved (matches real Chrome) | UA |
+| `navigator.bluetooth` | Stub added (`getAvailability()→false`) — real Chrome always exposes it | UA |
+| `navigator.share` / `canShare` | Stubs added (Web Share; absent in headless) | UA |
+| `speechSynthesis.getVoices()` | Claimed-OS voice set (Windows → Microsoft + Google, 22 voices) | UA |
+| `Notification.permission` / `permissions.query` | `default` / consistent results | UA |
+| `navigator.userActivation` / `getInstalledRelatedApps` / `document.hasStorageAccess` | Stubs (present in real Chrome) | UA |
+
+## Hardware & rendering
+
+| Surface | Mitigation | Gate |
+|---|---|---|
+| WebGL `UNMASKED_VENDOR/RENDERER` | Spoofed GPU from an OS-appropriate pool (per-domain seeded) | UA + FP |
+| Canvas (`toDataURL`/`getImageData`) | Per-canvas noise (WeakMap-cached) | UA + FP |
+| AudioContext / `AudioBuffer` | `getChannelData`/`copyFromChannel` intercepted to defeat audio fingerprint | UA + FP |
+| Fonts (`measureText`/offset probes) | Normalized font metrics | UA |
+| `screen.*` (width/height/avail/colorDepth) | Spoofed (1920×1080, colorDepth 24) | UA + FP |
+| `navigator.hardwareConcurrency` | Spoofed down to 4–8 (hides datacenter core count; no HTTP counterpart) | FP |
+| `navigator.deviceMemory` (JS) + `Sec-CH-Device-Memory` (HTTP) | Both pinned to **8** (hides 32 GB host; JS = HTTP, gated together on FP) | FP / HTTP |
+| `PerformanceNavigationTiming` | Jittered to defeat timing fingerprint | UA |
+
+## Sensors, locale & network
+
+| Surface | Mitigation | Gate |
+|---|---|---|
+| Battery Status API | Plugged-in default (`charging:true, level:1, dischargingTime:Infinity`) — blends with the majority | UA |
+| `navigator.connection` (rtt/downlink/effectiveType) | **Native** (left untouched when present) — truthful to the real network so it survives a timing cross-check | — |
+| `navigator.languages` / `language` | `["en-US","en"]` / `en-US` | UA |
+| **Timezone** (`Date`, `Intl`, `getTimezoneOffset`) | CDP `emulateTimezone()` — makes all three consistent + DST-correct (replaced broken JS overrides) | FP |
+| `matchMedia` hover/pointer/color-scheme | Desktop-consistent (`hover`, `fine` pointer) | UA |
+| `maxTouchPoints` | UA-consistent (`0` on desktop) | UA |
+| WebRTC ICE candidates | All candidates stripped → no STUN public-IP leak past the proxy | UA |
+| `mediaDevices.enumerateDevices` | Plausible device set | UA |
+
+## Anti-introspection
+
+| Surface | Mitigation | Gate |
+|---|---|---|
+| `Function.prototype.toString` | Every overridden function masked to `function X() { [native code] }` (bulk + per-instance) | UA |
+| `Error.stack` / `prepareStackTrace` | Sanitized so injected frames don't leak | UA |
+| Console error noise from spoofs | Suppressed | UA |
+
+## Known limitations (not fixable at the browser layer)
+
+| Vector | Why it's out of scope | Mitigation |
+|---|---|---|
+| **IP reputation** | A datacenter IP is the single biggest tell; no JS/header spoof touches it | Residential **proxy/VPN** (`lib/proxy.js`, `lib/wireguard_vpn.js`, `lib/openvpn_vpn.js`) |
+| **TLS (JA3/JA4) + HTTP/2 fingerprint** | Negotiated below the JS layer | Puppeteer's Chromium already presents a genuine Chrome stack; a MITM proxy can alter it |
+| **Timezone vs exit-IP geolocation** | Timezone is now internally consistent, but the *chosen* zone should match the proxy's country | Per-proxy geo config (not yet wired) |
+| **Behavioural / mouse dynamics** | Statistical, not a property | `interact` / `ghost-cursor` config (`lib/interaction.js`) |
+
+## Verification
+
+- **`scripts/test-stealth.js`** — automated smoke test against sannysoft / creepjs / browserleaks. Run before/after a spoof change and diff.
+- **Manual reference diff** — launch with the spoof applied and compare each surface against a real Chrome of the pinned major (the coverage above was validated field-for-field against a live Chrome 148 desktop). The unspoofed deviations are deliberate: `hardwareConcurrency`/`deviceMemory` downscaled to hide the host, and `connection` left native.
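The `"random"` mode described in fingerprint.md ("stable per site, varies across sites") can be pictured with a small sketch. This is not the package's seeding code, which is not shown in this diff. It assumes an FNV-1a string hash feeding a mulberry32 PRNG, and every name in it (`hashDomain`, `mulberry32`, `seededProfile`, `canvasNoiseSeed`) is hypothetical.

```javascript
// Hypothetical sketch of per-domain seeding; not taken from lib/fingerprint.js.

// FNV-1a 32-bit hash: turns a domain string into a stable numeric seed.
function hashDomain(domain) {
  let h = 0x811c9dc5;
  for (let i = 0; i < domain.length; i++) {
    h ^= domain.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: a small deterministic PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Same domain in, same spoofed values out, on every call.
function seededProfile(domain) {
  const rand = mulberry32(hashDomain(domain));
  const cores = [4, 6, 8]; // assumed pool inside the documented 4-8 range
  return {
    hardwareConcurrency: cores[Math.floor(rand() * cores.length)],
    canvasNoiseSeed: Math.floor(rand() * 0xffffffff),
  };
}
```

Deriving every value from one seeded generator means a site revisited later sees the same profile, while unrelated sites are unlikely to share one.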
package/lib/ghost-cursor.js
CHANGED
@@ -15,6 +15,11 @@
 // npm install ghost-cursor (optional dependency)
 
 const { formatLogMessage, messageColors } = require('./colorize');
+// humanClick gives the coordinate-click path the same press realism as the
+// built-in content clicks (hover dwell + mousedown/hold/mouseup, optional
+// hand-tremor + mouseup drift) instead of a 0ms page.mouse.click. One-way
+// require — interaction.js does not depend on ghost-cursor, so no cycle.
+const { humanClick } = require('./interaction');
 const GHOST_CURSOR_TAG = messageColors.processing('[ghost-cursor]');
 
 let ghostCursorModule = null;
@@ -56,7 +61,7 @@ function createGhostCursor(page, options = {}) {
   const cursor = ghostCursorModule.createCursor(page, { x: startX, y: startY });
 
   if (forceDebug) {
-    console.log(formatLogMessage('debug',
+    console.log(formatLogMessage('debug', `${GHOST_CURSOR_TAG} Cursor instance created`));
   }
 
   return cursor;
@@ -98,7 +103,7 @@ async function ghostMove(cursor, toX, toY, options = {}) {
   const moveOpts = {};
   if (moveSpeed !== undefined) moveOpts.moveSpeed = moveSpeed;
   if (moveDelay > 0) moveOpts.moveDelay = moveDelay;
-
+  moveOpts.randomizeMoveDelay = randomizeMoveDelay; // always defined (defaults to true)
   if (overshootThreshold !== undefined) moveOpts.overshootThreshold = overshootThreshold;
 
   await cursor.moveTo({ x: toX, y: toY }, moveOpts);
@@ -126,6 +131,8 @@ async function ghostMove(cursor, toX, toY, options = {}) {
  * @param {number} options.waitForClick - Delay (ms) between mousedown/mouseup (default: auto)
  * @param {number} options.moveDelay - Delay (ms) after moving to target
  * @param {number} options.paddingPercentage - Click point within element (0=edge, 100=center)
+ * @param {import('puppeteer').Page} options.page - Page for coordinate clicks (falls back to cursor.page)
+ * @param {boolean} options.realistic - Coordinate clicks: emit hand-tremor + mouseup drift (default: false)
  * @param {boolean} options.forceDebug - Enable debug logging
  * @returns {Promise<boolean>} true if click succeeded
  */
@@ -137,6 +144,8 @@ async function ghostClick(cursor, target, options = {}) {
     waitForClick,
     moveDelay,
     paddingPercentage,
+    page,
+    realistic = false,
     forceDebug
   } = options;
 
@@ -149,16 +158,25 @@ async function ghostClick(cursor, target, options = {}) {
   if (typeof target === 'string') {
     await cursor.click(target, clickOpts);
   } else {
-    //
+    // Coordinate click: ghost-cursor's bezier moveTo brings the cursor to the
+    // point, then humanClick does the realistic press (hover dwell, mousedown
+    // → hold → mouseup, plus hand-tremor + down≠up drift when realistic). This
+    // replaces a 0ms page.mouse.click, so the ghost path gets the same click
+    // realism as built-in content clicks.
     await cursor.moveTo(target);
-    //
-
-
-
-
-
-
+    // Prefer the caller-supplied page; fall back to the cursor's own page
+    // (ghost-cursor exposes it as cursor.page) so we don't depend on internals.
+    // Return false (not silent success) if there's no usable page — otherwise
+    // the "Clicked" log + return true below would lie about a click that
+    // never fired.
+    const clickPage = page || cursor.page;
+    if (!clickPage || typeof clickPage.mouse?.down !== 'function') {
+      if (forceDebug) {
+        console.log(formatLogMessage('debug', `${GHOST_CURSOR_TAG} Coordinate click skipped: no usable page`));
+      }
+      return false;
     }
+    await humanClick(clickPage, target.x, target.y, { realistic, forceDebug });
   }
 
   if (forceDebug) {
@@ -189,7 +207,7 @@ async function ghostRandomMove(cursor, options = {}) {
   try {
     await cursor.randomMove();
     if (options.forceDebug) {
-      console.log(formatLogMessage('debug',
+      console.log(formatLogMessage('debug', `${GHOST_CURSOR_TAG} Random movement performed`));
     }
     return true;
   } catch (err) {
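The coordinate-click path in `ghostClick` combines a page fallback with a timed press. The sketch below illustrates that shape only. `timedPress` is a hypothetical stand-in for `humanClick`, whose body is not part of this diff, and the dwell and hold durations are assumed values. The only Puppeteer calls used are `page.mouse.move`, `page.mouse.down`, and `page.mouse.up`.

```javascript
// Hypothetical stand-in for humanClick: hover, press, hold, release.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function timedPress(page, x, y, { dwellMs = 40, holdMs = 60 } = {}) {
  await page.mouse.move(x, y); // arrive at the point
  await sleep(dwellMs);        // brief hover before pressing
  await page.mouse.down();
  await sleep(holdMs);         // a press that is not 0 ms long
  await page.mouse.up();
}

// Mirrors the guard in the diff: prefer the caller's page, fall back to
// cursor.page, and report false instead of pretending a click happened.
async function coordinateClick(cursor, target, { page } = {}) {
  const clickPage = page || cursor.page;
  if (!clickPage || typeof clickPage.mouse?.down !== 'function') {
    return false;
  }
  await timedPress(clickPage, target.x, target.y);
  return true;
}
```

Returning `false` on a missing page keeps the caller's success log honest, which is the reason the diff gives for the guard.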
package/lib/interaction.js
CHANGED
@@ -1333,5 +1333,9 @@ module.exports = {
   simulateScrolling,
   interactWithElements,
   performContentClicks,
+  // Realistic timed click (hover dwell + mousedown/hold/mouseup, optional
+  // hand-tremor + mouseup drift). Reused by lib/ghost-cursor.js so the ghost
+  // coordinate click gets the same press realism as built-in content clicks.
+  humanClick,
   generateRandomCoordinates
 };
package/lib/nettools.js
CHANGED
@@ -124,7 +124,6 @@ function loadDiskCache(filePath, cache, ttl, maxSize) {
     // Surface the event so the user knows they lost their warm cache;
     // previously this was a silent reset, which made "why did my dns
     // cache stop helping?" hard to diagnose.
-    // eslint-disable-next-line no-console
     console.warn(`${messageColors.highlight('[dns-cache]')} ${path.basename(filePath)} was unreadable (${err.message}); starting fresh`);
     try { fs.unlinkSync(filePath); } catch {}
   }
@@ -256,6 +255,38 @@ function getDnsCacheStats() {
 // Disk cache is opt-in via --dns-cache flag
 let diskCacheEnabled = false;
 
+// Optional dig resolver(s), set from --dns. When non-empty, dig queries
+// `@<one of these>` (round-robin) instead of the system resolver — so dig uses
+// the same reliable servers as the pre-check rather than a flaky /etc/resolv.conf
+// (the cause of `dig: Command timeout` drops on Cloudflare-fronted ad domains).
+let digResolvers = [];
+let digResolverCursor = 0;
+// dig's `@server` wants a bare IP; strip any `ipv4:port` / `[ipv6]:port` form.
+function digServerFromSpec(spec) {
+  const s = String(spec);
+  const br = s.match(/^\[([0-9a-fA-F:]+)\]/);
+  if (br) return br[1];
+  const v4p = s.match(/^(\d{1,3}(?:\.\d{1,3}){3}):\d+$/);
+  if (v4p) return v4p[1];
+  return s;
+}
+function setDigResolvers(servers) {
+  digResolvers = (Array.isArray(servers) ? servers : []).filter(Boolean).map(digServerFromSpec);
+}
+// Ordered `@server` attempt list for ONE dig lookup: starts at the round-robin
+// cursor (advanced once per lookup, preserving the old fairness) then falls
+// through the remaining resolvers as failover. Returns [null] when no --dns
+// resolvers are configured — a single attempt via the system resolver.
+function digServerAttemptList() {
+  if (digResolvers.length === 0) return [null];
+  const start = digResolverCursor++ % digResolvers.length;
+  const list = [];
+  for (let i = 0; i < digResolvers.length; i++) {
+    list.push('@' + digResolvers[(start + i) % digResolvers.length]);
+  }
+  return list;
+}
+
 /**
  * Enable persistent disk caching for dig/whois results.
  * Call this when --dns-cache flag is set. Idempotent — repeated calls
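To see what the resolver helpers in this hunk produce, here is a standalone copy of the same logic. The only change is that the module-level state is wrapped in a `createResolverRotation` factory, a hypothetical name used so the example runs outside `nettools.js`. The resolver addresses are illustrative.

```javascript
// Same parsing as digServerFromSpec in the diff: strip "[ipv6]:port" and
// "ipv4:port" down to the bare address that dig's @server expects.
function digServerFromSpec(spec) {
  const s = String(spec);
  const br = s.match(/^\[([0-9a-fA-F:]+)\]/);
  if (br) return br[1];
  const v4p = s.match(/^(\d{1,3}(?:\.\d{1,3}){3}):\d+$/);
  if (v4p) return v4p[1];
  return s;
}

// Hypothetical factory holding the state that nettools.js keeps at module level.
function createResolverRotation(servers) {
  const resolvers = (Array.isArray(servers) ? servers : [])
    .filter(Boolean)
    .map(digServerFromSpec);
  let cursor = 0;
  return function attemptList() {
    if (resolvers.length === 0) return [null]; // system resolver, single attempt
    const start = cursor++ % resolvers.length; // advance once per lookup
    const list = [];
    for (let i = 0; i < resolvers.length; i++) {
      list.push('@' + resolvers[(start + i) % resolvers.length]);
    }
    return list;
  };
}

const next = createResolverRotation(['1.1.1.1:53', '[2606:4700:4700::1111]:53', '9.9.9.9']);
console.log(next()); // [ '@1.1.1.1', '@2606:4700:4700::1111', '@9.9.9.9' ]
console.log(next()); // [ '@2606:4700:4700::1111', '@9.9.9.9', '@1.1.1.1' ]
```

Each lookup starts one position later than the previous one, so load is spread across resolvers while every lookup can still fall through the full list.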
@@ -293,7 +324,6 @@ function enableDiskCache() {
   // Debug log only if anything was actually warmed; silent on fresh
   // installs / empty disk caches.
   if (digWarm > 0 || whoisWarm > 0) {
-    // eslint-disable-next-line no-console
     console.log(`${messageColors.highlight('[dns-cache]')} Warmed resolved-hostnames index from disk: ${digWarm} dig + ${whoisWarm} whois entries`);
   }
 
@@ -994,50 +1024,103 @@ async function whoisLookupWithRetry(domain = '', timeout = 10000, whoisServer =
  * @returns {Promise<Object>} Object with success status and output/error
  */
 async function digLookup(domain = '', recordType = 'A', timeout = 5000) {
-
-
-
+  // Clean domain (defensive — callers usually pass an already-clean digDomain).
+  const cleanDomain = domain.replace(/^https?:\/\//, '').replace(/\/.*$/, '').replace(/:\d+$/, '');
+
+  // dig argv-injection guard. dig parses @/-/+ -leading tokens as options
+  // (`@host` redirects the query to an arbitrary server, `-f path` reads a
+  // file as a query batch) and has no `--` end-of-options marker like whois.
+  // Reject anything not hostname-shaped before shelling out — success:false so
+  // it's treated as no-match and not cached. (Charset blocks @ + / space etc;
+  // the leading-`-` check blocks `-f` and friends, since `-` is valid mid-host.)
+  if (!cleanDomain || /[^a-zA-Z0-9._-]/.test(cleanDomain) || cleanDomain.startsWith('-')) {
+    return { success: false, error: `invalid domain shape: ${cleanDomain}`, domain: cleanDomain, recordType };
+  }
+
+  // Resolver failover: try the round-robin resolver first, then fall through
+  // the remaining --dns resolvers on timeout / no-reply / REFUSED / SERVFAIL —
+  // the same resilience the whois path already has via whoisLookupWithRetry,
+  // and the DNS pre-check has via its rotation. Capped at 3 attempts (matches
+  // whois maxRetries default) so a host that's dead on every resolver can't
+  // burn the whole nettools budget.
+  const attempts = digServerAttemptList();
+  // Only do JS-level failover when --dns gave us pinned resolvers. Without it,
+  // attempts is [null]: a SINGLE system-resolver invocation that keeps dig's
+  // native resolv.conf rotation + retries (forcing +tries=1 there would strip
+  // that built-in resilience — the whole point is to be MORE resilient).
+  const usingResolvers = attempts[0] !== null;
+  const maxAttempts = usingResolvers ? Math.min(3, attempts.length) : 1;
+  // Pinned-resolver attempts use +time=2 +tries=1 (the JS loop owns failover)
+  // under a 4s SIGTERM ceiling. The system-resolver path keeps the full budget
+  // and dig's own retry behaviour, matching the pre-failover semantics exactly.
+  const perAttemptTimeout = usingResolvers ? Math.min(timeout, 4000) : timeout;
+
+  let lastError = 'no resolver attempts made';
+
+  for (let i = 0; i < maxAttempts; i++) {
+    const digServerArg = attempts[i];
+    // With a pinned resolver: one fast try (+time=2 +tries=1), then the JS loop
+    // moves to the next resolver. Without --dns: bare `dig name type` so dig
+    // applies its native resolv.conf rotation. execFile (no shell) => args
+    // can't be injected.
+    const digArgs = digServerArg
+      ? [digServerArg, '+time=2', '+tries=1', cleanDomain, recordType]
+      : [cleanDomain, recordType];
+    const resolverLabel = digServerArg ? digServerArg.slice(1) : 'system resolver';
+
+    try {
+      const { stdout: fullOutput } = await execFileWithTimeout('dig', digArgs, perAttemptTimeout);
+
+      // Judge success by RCODE, not by stderr. dig exits 0 for ANY server
+      // response, so non-zero exit (timeout / no-reply) already rejected above.
+      // REFUSED/SERVFAIL are resolver-SIDE failures another resolver may not
+      // share — fail over instead of accepting an answerless response (the
+      // EREFUSED-storm case). NOERROR/NXDOMAIN are definitive => accept.
+      const statusMatch = fullOutput.match(/status:\s*([A-Z]+)/i);
+      const rcode = statusMatch ? statusMatch[1].toUpperCase() : 'NOERROR';
+      if (rcode === 'REFUSED' || rcode === 'SERVFAIL') {
+        lastError = `dig ${rcode} from ${resolverLabel}`;
+        continue; // try next resolver in the failover list
+      }
+
+      // Non-empty stderr is intentionally NOT treated as failure here: dig
+      // prints `;; communications error ... timed out` warnings to stderr while
+      // still returning a valid ANSWER SECTION and exit 0. The old code failed
+      // the whole lookup on any stderr, discarding good answers — the exact
+      // missed-match pattern under flaky resolvers.
+      const answerMatch = fullOutput.match(/;; ANSWER SECTION:\n([\s\S]*?)(?:\n;;|\n*$)/);
+      let shortOutput = '';
+      if (answerMatch) {
+        shortOutput = answerMatch[1]
+          .split('\n')
+          .map(line => line.split(/\s+/).pop())
+          .filter(Boolean)
+          .join('\n');
+      }
 
-    // Single dig command — full output contains everything including short
-    // answers. execFile (no shell) so cleanDomain / recordType can contain
-    // any chars without injection risk.
-    const { stdout: fullOutput, stderr } = await execFileWithTimeout('dig', [cleanDomain, recordType], timeout);
-
-    if (stderr && stderr.trim()) {
       return {
-        success:
-
+        success: true,
+        output: fullOutput,
+        shortOutput,
         domain: cleanDomain,
-        recordType
+        recordType,
+        resolver: resolverLabel
       };
+    } catch (error) {
+      // Timeout or non-zero exit (e.g. dig exit 9 = no reply from this server).
+      // Record and fall through to the next resolver.
+      lastError = error.message;
     }
-
-    // Extract short output from ANSWER SECTION of full dig output
-    const answerMatch = fullOutput.match(/;; ANSWER SECTION:\n([\s\S]*?)(?:\n;;|\n*$)/);
-    let shortOutput = '';
-    if (answerMatch) {
-      shortOutput = answerMatch[1]
-        .split('\n')
-        .map(line => line.split(/\s+/).pop())
-        .filter(Boolean)
-        .join('\n');
-    }
-
-    return {
-      success: true,
-      output: fullOutput,
-      shortOutput,
-      domain: cleanDomain,
-      recordType
-    };
-  } catch (error) {
-    return {
-      success: false,
-      error: error.message,
-      domain: domain,
-      recordType
-    };
   }
+
+  // Every attempt timed out / was refused. success:false so the handler does
+  // NOT cache it (transient — caching would poison the domain for the TTL).
+  return {
+    success: false,
+    error: lastError,
+    domain: cleanDomain,
+    recordType
+  };
 }
 
 /**
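Two decisions in the new `digLookup` are pure string checks and can be tried in isolation: the hostname-shape guard and the RCODE-based failover test. The sketch below lifts both regexes from the diff. The function names `isHostnameShaped` and `classifyDigOutput` are hypothetical wrappers, and the sample header strings are illustrative rather than captured dig output.

```javascript
// Guard from the diff: reject anything dig could parse as an option
// (@server, -f file, +flag) before it reaches the argv.
function isHostnameShaped(domain) {
  return Boolean(domain)
    && !/[^a-zA-Z0-9._-]/.test(domain)
    && !domain.startsWith('-');
}

// RCODE check from the diff: REFUSED/SERVFAIL are resolver-side failures
// worth retrying elsewhere; NOERROR/NXDOMAIN are definitive answers.
function classifyDigOutput(fullOutput) {
  const statusMatch = fullOutput.match(/status:\s*([A-Z]+)/i);
  const rcode = statusMatch ? statusMatch[1].toUpperCase() : 'NOERROR';
  return { rcode, retry: rcode === 'REFUSED' || rcode === 'SERVFAIL' };
}

console.log(isHostnameShaped('ads.example.com')); // true
console.log(isHostnameShaped('@8.8.8.8'));        // false
console.log(isHostnameShaped('-f'));              // false

console.log(classifyDigOutput(';; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 1'));
// { rcode: 'SERVFAIL', retry: true }
console.log(classifyDigOutput(';; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 2'));
// { rcode: 'NXDOMAIN', retry: false }
```

Keeping NXDOMAIN on the "accept" side matters for the cache change later in this diff, where an NXDOMAIN response is treated as a real answer and still cached.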
@@ -1170,15 +1253,20 @@ function createNetToolsHandler(config) {
     // Determine which domain will be used for dig lookup
     const digDomain = digSubdomain && originalDomain ? originalDomain : domain;
 
-    // For whois: use root domain only (whois data is consistent for entire
-
-
+    // For whois: use root domain only (whois data is consistent for entire
+    // domain). Only compute it when whois is actually configured — getRootDomain
+    // does a domain parse, so on a dig-only config (no whois/whois-or) this skips
+    // a parse + string build on every single request. whoisRootDomain is only
+    // ever read inside the whois branch, so the `domain` fallback is never used.
+    const wantWhois = hasWhois || hasWhoisOr;
+    const whoisRootDomain = wantWhois ? (getRootDomain ? getRootDomain(`http://${domain}`) : domain) : domain;
+
     // Check if we need to perform any lookups with appropriate deduplication
     // Whois: root domain + config (whois data same for sub.example.com and example.com)
-    const whoisDedupeKey = `${whoisRootDomain}:${whoisConfigKey}
+    const whoisDedupeKey = wantWhois ? `${whoisRootDomain}:${whoisConfigKey}` : '';
     // Dig: specific subdomain + config (DNS records can differ between subdomains)
     const digDedupeKey = `${digDomain}:${digConfigKey}`;
-    const needsWhoisLookup =
+    const needsWhoisLookup = wantWhois && !processedWhoisDomains.has(whoisDedupeKey);
     const needsDigLookup = (hasDig || hasDigOr) && !processedDigDomains.has(digDedupeKey);
 
     // Claim the dedupe keys NOW, synchronously, before executeNetToolsLookup
@@ -1606,11 +1694,20 @@ function createNetToolsHandler(config) {
             // backwards-compat additive: old code reading new cache
             // ignores it; new code reading old cache (no field) falls
             // back to lazy on-hit population in the cache-hit branch.
-
-
-
-
-
+            //
+            // Only cache a SUCCESSFUL dig. A timeout/error (success:false) is
+            // transient — caching it would poison the domain for the full
+            // cache TTL (20h when persisted via --dns-cache), so a host that
+            // resolves fine on the next attempt keeps getting dropped. (An
+            // NXDOMAIN is success:true with NXDOMAIN in the body — a real
+            // answer — so it's correctly still cached.)
+            if (digResult.success) {
+              globalDigResultCache.set(digCacheKey, {
+                result: digResult,
+                timestamp: now,
+                hostname: digDomain
+              });
+            }
             dnsCacheStats.digMisses++;
             pushFreshSample(dnsCacheStats.freshDig, `${digDomain} (${digRecordType})`);
             // Index hostname IF dig actually proved resolution -- NXDOMAIN
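The "only cache a successful dig" rule can be shown with a plain `Map`. This is a minimal sketch, assuming a simple timestamp-plus-TTL cache. `rememberDigResult`, `cachedDigResult`, and the local `digCache` are hypothetical names and do not reproduce `globalDigResultCache` or its disk persistence.

```javascript
// Hypothetical in-memory cache standing in for globalDigResultCache.
const digCache = new Map();
const TWENTY_HOURS_MS = 20 * 60 * 60 * 1000; // the TTL the diff comment mentions

// Store only definitive results. A timeout or resolver error is transient,
// so caching it would keep dropping a host that resolves fine on retry.
function rememberDigResult(cacheKey, digResult, now = Date.now()) {
  if (!digResult.success) return false;
  digCache.set(cacheKey, { result: digResult, timestamp: now });
  return true;
}

// Return the cached result, or null when absent or older than the TTL.
function cachedDigResult(cacheKey, ttlMs = TWENTY_HOURS_MS, now = Date.now()) {
  const entry = digCache.get(cacheKey);
  if (!entry || now - entry.timestamp > ttlMs) return null;
  return entry.result;
}
```

With this shape, a failed lookup leaves the key empty, so the next request for the same domain triggers a fresh dig instead of replaying the failure for up to 20 hours.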
@@ -1662,7 +1759,7 @@ function createNetToolsHandler(config) {
             if (hasDig) logToConsoleAndFile(`${messageColors.highlight('[dig-and]')} Terms checked: ${digTerms.join(' AND ')}, matched: ${digMatched}`);
             if (hasDigOr) logToConsoleAndFile(`${messageColors.highlight('[dig-or]')} Terms checked: ${digOrTerms.join(' OR ')}, matched: ${digOrMatched}`);
           }
-          logToConsoleAndFile(`${messageColors.highlight('[dig]')} Lookup completed for ${digDomain}, dig-and: ${digMatched}, dig-or: ${digOrMatched}`);
+          logToConsoleAndFile(`${messageColors.highlight('[dig]')} Lookup completed for ${digDomain}${digResult.resolver ? ` via ${digResult.resolver}` : ''}, dig-and: ${digMatched}, dig-or: ${digOrMatched}`);
           if (siteConfig.verbose === 1) {
             if (hasDig) logToConsoleAndFile(`${messageColors.highlight('[dig]')} AND terms: ${digTerms.join(', ')}`);
             if (hasDigOr) logToConsoleAndFile(`${messageColors.highlight('[dig]')} OR terms: ${digOrTerms.join(', ')}`);
@@ -1813,6 +1910,12 @@ module.exports = {
   validateDigAvailability,
   enableDiskCache,
   getDnsCacheStats,
+  // Route dig through the --dns resolver(s) instead of the system resolver.
+  setDigResolvers,
+  // Generic disk-cache primitives (atomic write, TTL/size-bounded) — reused by
+  // nwss.js to persist the DNS pre-check negative cache under --dns-cache.
+  loadDiskCache,
+  saveDiskCache,
   // Resolved-hostnames index for the DNS pre-check optimization.
   // nwss.js's per-task pre-check consults this BEFORE calling resolve4
   // so hosts already proven live by dig or whois (within their 20h
package/lib/output.js
CHANGED
|
@@ -133,32 +133,43 @@ function formatDomain(domain, options = {}) {
|
|
|
133
133
|
if (!domain || domain.length <= 6 || !domain.includes('.')) {
|
|
134
134
|
return null;
|
|
135
135
|
}
|
|
136
|
-
|
|
137
|
-
//
+
+  // Path-prefix rules (from output_regex) are stored as "host/path/" — they
+  // contain a '/'. Only adblock can express a path; every domain-only format
+  // (dnsmasq/unbound/pihole/hosts/privoxy/plain) falls back to the bare host
+  // (everything before the first '/') so output stays valid in all formats.
+  const slash = domain.indexOf('/');
+  const isPathRule = slash !== -1;
+  const host = isPathRule ? domain.slice(0, slash) : domain;
+
+  // If plain is true, always return just the host regardless of other options
   if (plain) {
-    return
+    return host;
   }
-
+
   // Apply specific format based on output mode
   if (pihole) {
     // Escape dots for regex and use Pi-hole format: (^|\.)domain\.com$
-    const escapedDomain =
+    const escapedDomain = host.replace(/\./g, '\\.');
     return `(^|\\.)${escapedDomain}$`;
   } else if (privoxy) {
-    return `{ +block } .${
+    return `{ +block } .${host}`;
   } else if (dnsmasq) {
-    return `local=/${
+    return `local=/${host}/`;
   } else if (dnsmasqOld) {
-    return `server=/${
+    return `server=/${host}/`;
   } else if (unbound) {
-    return `local-zone: "${
+    return `local-zone: "${host}." always_null`;
   } else if (localhostIP) {
-    return `${localhostIP} ${
+    return `${localhostIP} ${host}`;
   } else if (adblockRules && resourceType) {
-    //
-
+    // Adblock with resource-type modifier. A path rule self-anchors via its
+    // trailing '/', so it takes no '^' separator; a domain rule needs '^'.
+    return isPathRule ? `||${domain}${resourceType}` : `||${domain}^${resourceType}`;
   } else {
-
+    // Default adblock: ||host^ for a domain, ||host/path/ for a path rule
+    // (the path already anchors, so no trailing '^').
+    return isPathRule ? `||${domain}` : `||${domain}^`;
   }
 }
 
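The host fallback for path rules in the hunk above can be tried out in isolation. The following is a minimal sketch, not the package's actual function: the name `formatRule` and the shape of the `opts` object are assumptions made for illustration, and only a few of the output formats are covered.

```javascript
// Hypothetical stand-alone version of the path-vs-domain formatting logic.
// "domain" may be a bare host ("example.com") or a path rule ("example.com/ads/").
function formatRule(domain, opts = {}) {
  const slash = domain.indexOf('/');
  const isPathRule = slash !== -1;
  // Domain-only formats cannot express a path, so they use the bare host.
  const host = isPathRule ? domain.slice(0, slash) : domain;

  if (opts.plain) return host;
  if (opts.pihole) return `(^|\\.)${host.replace(/\./g, '\\.')}$`;
  if (opts.dnsmasq) return `local=/${host}/`;
  if (opts.unbound) return `local-zone: "${host}." always_null`;

  // Adblock: a path rule is already anchored by its trailing '/',
  // while a domain rule needs the '^' separator.
  return isPathRule ? `||${domain}` : `||${domain}^`;
}
```

With this sketch, a path rule such as `example.com/ads/` keeps its path only in the adblock form and collapses to `example.com` in every domain-only format.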
package/lib/proxy.js
CHANGED
@@ -253,8 +253,12 @@ function getProxyArgs(siteConfig, forceDebug = false) {
     console.warn(formatLogMessage('proxy', `proxy_remote_dns ignored: SOCKS4 cannot do proxy-side DNS resolution (use SOCKS5)`));
   }
 
-  // Bypass list: domains that skip the proxy
-
+  // Bypass list: domains that skip the proxy. Accept either an array (the
+  // documented form) or a single string — a bare "localhost" used to throw
+  // `bypass.join is not a function` here, in the browser-launch path. Same
+  // string-or-array tolerance as the dig/whois siteConfig fields.
+  const rawBypass = siteConfig.proxy_bypass || siteConfig.socks5_bypass || [];
+  const bypass = Array.isArray(rawBypass) ? rawBypass : [rawBypass];
   if (bypass.length > 0) {
     args.push(`--proxy-bypass-list=${bypass.join(';')}`);
   }
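The string-or-array tolerance added in this hunk can be shown on its own. This is a minimal sketch under assumptions: `buildBypassArg` is a hypothetical helper name, and it returns the flag string (or `null`) instead of pushing into an `args` array as the real function does.

```javascript
// Hypothetical helper: normalise a bypass setting that may be a string or an array.
function buildBypassArg(siteConfig) {
  const raw = siteConfig.proxy_bypass || siteConfig.socks5_bypass || [];
  // A bare string has no .join(), so wrap it before joining.
  const list = Array.isArray(raw) ? raw : [raw];
  return list.length > 0 ? `--proxy-bypass-list=${list.join(';')}` : null;
}
```

Under these assumptions, `{ proxy_bypass: 'localhost' }` and `{ proxy_bypass: ['localhost'] }` produce the same flag, and a config with neither field produces no flag at all.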
package/lib/redirect.js
CHANGED
@@ -19,7 +19,10 @@ async function navigateWithRedirectHandling(page, currentUrl, siteConfig, gotoOp
   let httpStatus = null;
   let cfRay = null;
   const jsRedirectTimeout = siteConfig.js_redirect_timeout || 5000; // Wait 5s for JS redirects
-
+  // Use a number check, not || , so max_redirects: 0 (follow none) isn't
+  // swallowed as falsy and silently bumped to 10. Only absent/negative/non-number defaults.
+  const maxRedirects = (typeof siteConfig.max_redirects === 'number' && siteConfig.max_redirects >= 0)
+    ? siteConfig.max_redirects : 10;
   const detectJSPatterns = siteConfig.detect_js_patterns !== false; // Default to true
 
   // Monitor frame navigations to detect redirects
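The difference between a `||` default and the explicit number check is easy to miss, so here is a small side-by-side sketch. Both function names (`naiveMaxRedirects`, `resolveMaxRedirects`) are hypothetical and exist only for this comparison.

```javascript
// Falsy-based default: 0 is falsy, so "follow no redirects" becomes 10.
function naiveMaxRedirects(siteConfig) {
  return siteConfig.max_redirects || 10;
}

// Type-and-range check: only a missing, negative, or non-number value falls back.
function resolveMaxRedirects(siteConfig, fallback = 10) {
  const value = siteConfig.max_redirects;
  return (typeof value === 'number' && value >= 0) ? value : fallback;
}
```

Note that `NaN` also falls back to the default here, because `NaN >= 0` is false even though `typeof NaN` is `'number'`.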
package/lib/smart-cache.js
CHANGED
@@ -93,10 +93,16 @@ class SmartCache {
       this._setupAutoSave();
     }
 
-    // Set up memory monitoring
+    // Set up memory monitoring. unref'd so this always-on housekeeping timer
+    // can never hold the event loop open past scan completion — destroy()
+    // clears it promptly on the normal path, but unref guarantees a clean
+    // exit on any path that skips destroy() (e.g. an unhandled throw before
+    // nwss reaches its cleanup). Matches the unref convention applied to
+    // every other Node-side timer in the codebase.
     this.memoryCheckInterval = setInterval(() => {
       this._checkMemoryPressure();
     }, this.options.memoryCheckInterval);
+    if (typeof this.memoryCheckInterval.unref === 'function') this.memoryCheckInterval.unref();
   }
 
   /**
@@ -1137,9 +1143,11 @@ class SmartCache {
    * @private
    */
   _setupAutoSave() {
+    // unref'd for the same reason as memoryCheckInterval — never block exit.
     this.autoSaveInterval = setInterval(() => {
       this.savePersistentCache();
     }, this.options.autoSaveInterval);
+    if (typeof this.autoSaveInterval.unref === 'function') this.autoSaveInterval.unref();
   }
 
   /**
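The unref pattern used in both hunks can be reduced to a few lines. This is a sketch only: `startHousekeeping` is a hypothetical helper name, not something the package exports.

```javascript
// Hypothetical helper: start a repeating housekeeping task that never keeps
// the Node.js process alive by itself.
function startHousekeeping(task, intervalMs) {
  const timer = setInterval(task, intervalMs);
  // In Node.js, setInterval returns a Timeout object; unref() tells the event
  // loop not to wait for it. The typeof guard keeps the code safe in
  // environments where timers are plain numbers.
  if (typeof timer.unref === 'function') timer.unref();
  return timer;
}
```

A caller should still `clearInterval` the returned timer during normal cleanup; unref only covers exit paths that skip that cleanup.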
package/lib/socks-relay.js
CHANGED
@@ -227,13 +227,11 @@ function handleClient(client, upstream, forceDebug, relay) {
 
       upstreamSock = info.socket;
       // Safety net: if cleanup() ran while we were awaiting the upstream
-      // connect
-      //
-      //
-      //
-      //
-      // the 'connecting' transition this is currently unreachable, but
-      // cheap to keep as defense-in-depth against future code paths.
+      // connect, settled is true and cleanup's settled guard would
+      // short-circuit a future call, orphaning this freshly-connected
+      // upstream socket — so destroy it here directly. Reachable when the
+      // client emits 'error' or 'close' during the await (both wired to
+      // cleanup at handler setup), e.g. Chromium disconnects mid-connect.
       if (settled) {
         try { upstreamSock.destroy(); } catch (_) {}
         return;
@@ -250,8 +248,8 @@ function handleClient(client, upstream, forceDebug, relay) {
       try { upstreamSock.setKeepAlive(true, 60000); } catch (_) {}
       upstreamSock.on('error', cleanup);
       upstreamSock.on('close', cleanup);
-      client
-
+      // client 'error' and 'close' are wired once at handler setup (bottom
+      // of handleClient) and cover all phases — not re-attached here.
 
       // SOCKS5 success (BND.ADDR 0.0.0.0:0 — Chromium ignores it for CONNECT)
       client.write(Buffer.from([0x05, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0]));
@@ -273,6 +271,13 @@ function handleClient(client, upstream, forceDebug, relay) {
 
   client.on('data', onData);
   client.on('error', cleanup);
+  // Attach 'close' HERE (not after piping starts) so it covers the whole
+  // lifetime, including the up-to-20s upstream-connect await. A client that
+  // disconnects cleanly mid-connect now sets settled=true, letting the
+  // post-connect `if (settled)` net destroy the freshly-opened upstream
+  // socket instead of piping into a dead client; and a close mid-handshake
+  // clears the watchdog immediately rather than leaving it to fire later.
+  client.on('close', cleanup);
 }
 
 // SOCKS5 failure reply (valid only before piping starts).
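The race these comments describe (the client closing while the upstream connect is still pending) can be reproduced with a small simulation. This is a simplified sketch, not the relay's real code: `relayOnce`, the `connectUpstream` callback, and the returned state strings are all hypothetical, and the upstream is any object with a `destroy()` method.

```javascript
// Hypothetical, simplified model of the settled guard around an awaited connect.
async function relayOnce(client, connectUpstream) {
  let settled = false;
  const cleanup = () => { settled = true; };

  // Wired before the await so a disconnect during the connect is observed.
  client.on('error', cleanup);
  client.on('close', cleanup);

  const upstream = await connectUpstream();

  // If the client went away while we were connecting, nothing else will ever
  // close this new upstream socket, so destroy it here.
  if (settled) {
    try { upstream.destroy(); } catch (_) {}
    return 'destroyed';
  }
  return 'piping';
}
```

If the `'close'` listener were attached only after the await, a clean disconnect during the connect would leave `settled` false and the sketch would report `'piping'` toward a client that is already gone.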