npm - @fanboynz/network-scanner - Versions diffs - 3.1.0 → 3.2.0 - Mend

@fanboynz/network-scanner 3.1.0 → 3.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/lib/fingerprint.md ADDED

@@ -0,0 +1,94 @@
+# `lib/fingerprint.js` — Fingerprint Spoofing Coverage
+Bot-detection evasion for the scanner's headless Chromium. The goal is to make a
+scanned page see a coherent, real-Chrome **Stable** desktop profile rather than a
+headless/automation signature — and, just as important, to keep every spoofed
+value **internally consistent** (JS ↔ HTTP, claimed-value ↔ observable reality)
+so a detector cross-checking two surfaces can't catch a mismatch.
+## How it works
+Spoofing is applied per page, before navigation, by `applyAllFingerprintSpoofing(page, siteConfig, …)`, which runs three stages:
+| Stage | Gate (siteConfig) | What it covers |
+|---|---|---|
+| `applyUserAgentSpoofing` | **`userAgent`** (defaults to `"chrome"`) | Browser identity, automation/headless tells, and the bulk of the navigator/JS-API suite |
+| `applyBraveSpoofing` | Brave-mode only | Brave-specific surfaces |
+| `applyFingerprintProtection` | **`fingerprint_protection`** (`true` \| `"random"`) | Hardware fingerprint *values* (canvas/WebGL/audio noise, screen, memory) + CDP timezone. `"random"` seeds them per-domain (stable per site, varies across sites) |
+HTTP **Client Hints** request headers are set separately in `nwss.js` (gated on a `chrome` userAgent). Identity is pinned to **Stable Chrome** via two constants in `fingerprint.js` (`CHROME_BUILD`, `CHROME_GREASE_BRAND`) + the major in `USER_AGENT_COLLECTIONS` — see `feedback_chrome_spoof_version_bump`.
+**Gate legend:** `UA` = runs with `userAgent` set (on by default) · `FP` = runs with `fingerprint_protection` · `HTTP` = request header set in nwss.js.
+## Browser identity
+| Surface | Mitigation | Gate |
+|---|---|---|
+| `navigator.userAgent` / `appVersion` | Pinned to Stable Chrome 148 desktop UA | UA |
+| `navigator.userAgentData` (brands, platform, mobile) | Spoofed; brand order + GREASE string match real Chrome of the major exactly | UA |
+| `getHighEntropyValues()` | Full set: architecture, bitness, model, **wow64**, platformVersion, **uaFullVersion**, fullVersionList, **formFactors** — build from `CHROME_BUILD`, consistent with HTTP | UA |
+| `navigator.platform` / `vendor` / `productSub` / `vendorSub` | Spoofed UA-consistent (`Win32`, `Google Inc.`, `20030107`, `""`) | UA |
+| `Sec-CH-UA`, `-Platform`, `-Platform-Version`, `-Mobile`, `-Arch`, `-Bitness`, `-WoW64`, `-Model`, `-Full-Version`, `-Full-Version-List`, `-Form-Factors` | Set to match the JS values (same brand order/grease/build) | HTTP |
+## Automation & headless tells
+| Surface | Mitigation | Gate |
+|---|---|---|
+| `navigator.webdriver` | Forced `false` (launch flag + JS) | UA |
+| `cdc_…` / `$cdc_…` / selenium / phantom props | Removed | UA |
+| `window.chrome` + `chrome.runtime` | Provided / simulated | UA |
+| `<html webdriver>` attribute | Stripped | UA |
+| `navigator.plugins` / `mimeTypes` | Native 5-PDF set preserved (matches real Chrome) | UA |
+| `navigator.bluetooth` | Stub added (`getAvailability()→false`) — real Chrome always exposes it | UA |
+| `navigator.share` / `canShare` | Stubs added (Web Share; absent in headless) | UA |
+| `speechSynthesis.getVoices()` | Claimed-OS voice set (Windows → Microsoft + Google, 22 voices) | UA |
+| `Notification.permission` / `permissions.query` | `default` / consistent results | UA |
+| `navigator.userActivation` / `getInstalledRelatedApps` / `document.hasStorageAccess` | Stubs (present in real Chrome) | UA |
+## Hardware & rendering
+| Surface | Mitigation | Gate |
+|---|---|---|
+| WebGL `UNMASKED_VENDOR/RENDERER` | Spoofed GPU from an OS-appropriate pool (per-domain seeded) | UA + FP |
+| Canvas (`toDataURL`/`getImageData`) | Per-canvas noise (WeakMap-cached) | UA + FP |
+| AudioContext / `AudioBuffer` | `getChannelData`/`copyFromChannel` intercepted to defeat audio fingerprint | UA + FP |
+| Fonts (`measureText`/offset probes) | Normalized font metrics | UA |
+| `screen.*` (width/height/avail/colorDepth) | Spoofed (1920×1080, colorDepth 24) | UA + FP |
+| `navigator.hardwareConcurrency` | Spoofed down to 4–8 (hides datacenter core count; no HTTP counterpart) | FP |
+| `navigator.deviceMemory` (JS) + `Sec-CH-Device-Memory` (HTTP) | Both pinned to **8** (hides 32 GB host; JS = HTTP, gated together on FP) | FP / HTTP |
+| `PerformanceNavigationTiming` | Jittered to defeat timing fingerprint | UA |
+## Sensors, locale & network
+| Surface | Mitigation | Gate |
+|---|---|---|
+| Battery Status API | Plugged-in default (`charging:true, level:1, dischargingTime:Infinity`) — blends with the majority | UA |
+| `navigator.connection` (rtt/downlink/effectiveType) | **Native** (left untouched when present) — truthful to the real network so it survives a timing cross-check | — |
+| `navigator.languages` / `language` | `["en-US","en"]` / `en-US` | UA |
+| **Timezone** (`Date`, `Intl`, `getTimezoneOffset`) | CDP `emulateTimezone()` — makes all three consistent + DST-correct (replaced broken JS overrides) | FP |
+| `matchMedia` hover/pointer/color-scheme | Desktop-consistent (`hover`, `fine` pointer) | UA |
+| `maxTouchPoints` | UA-consistent (`0` on desktop) | UA |
+| WebRTC ICE candidates | All candidates stripped → no STUN public-IP leak past the proxy | UA |
+| `mediaDevices.enumerateDevices` | Plausible device set | UA |
+## Anti-introspection
+| Surface | Mitigation | Gate |
+|---|---|---|
+| `Function.prototype.toString` | Every overridden function masked to `function X() { [native code] }` (bulk + per-instance) | UA |
+| `Error.stack` / `prepareStackTrace` | Sanitized so injected frames don't leak | UA |
+| Console error noise from spoofs | Suppressed | UA |
+## Known limitations (not fixable at the browser layer)
+| Vector | Why it's out of scope | Mitigation |
+|---|---|---|
+| **IP reputation** | A datacenter IP is the single biggest tell; no JS/header spoof touches it | Residential **proxy/VPN** (`lib/proxy.js`, `lib/wireguard_vpn.js`, `lib/openvpn_vpn.js`) |
+| **TLS (JA3/JA4) + HTTP/2 fingerprint** | Negotiated below the JS layer | Puppeteer's Chromium already presents a genuine Chrome stack; a MITM proxy can alter it |
+| **Timezone vs exit-IP geolocation** | Timezone is now internally consistent, but the *chosen* zone should match the proxy's country | Per-proxy geo config (not yet wired) |
+| **Behavioural / mouse dynamics** | Statistical, not a property | `interact` / `ghost-cursor` config (`lib/interaction.js`) |
+## Verification
+- **`scripts/test-stealth.js`**: automated smoke test against sannysoft / creepjs / browserleaks. Run it before and after a spoof change and diff the results.
+- **Manual reference diff**: launch with the spoof applied and compare each surface against a real Chrome of the pinned major (the coverage above was validated field-for-field against a live Chrome 148 desktop). The remaining deviations from that reference are deliberate: `hardwareConcurrency`/`deviceMemory` are downscaled to hide the host, and `connection` is left native.
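
The `"random"` mode in the stage table is described as stable per site but varying across sites. A minimal sketch of how such per-domain seeding could work, assuming an FNV-1a hash feeding a mulberry32 generator. The package's actual seeding is not shown in this diff, and `seededProfileFor`, `GPU_POOL`, and `CORE_CHOICES` are hypothetical names with placeholder values:

```javascript
// Hypothetical sketch of per-domain seeding; not the package's implementation.

// FNV-1a: hash a string to an unsigned 32-bit integer.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// mulberry32: small deterministic PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Placeholder pools (assumptions, not real GPU strings from the package).
const GPU_POOL = ['placeholder-gpu-a', 'placeholder-gpu-b', 'placeholder-gpu-c'];
const CORE_CHOICES = [4, 6, 8];

// Same domain -> same seed -> same picks; a different domain usually differs.
function seededProfileFor(domain) {
  const rand = mulberry32(fnv1a(domain));
  const pick = (list) => list[Math.floor(rand() * list.length)];
  return { gpu: pick(GPU_POOL), hardwareConcurrency: pick(CORE_CHOICES) };
}

console.log(seededProfileFor('example.com'));
console.log(seededProfileFor('example.com')); // identical to the line above
```

Because the seed depends only on the domain string, repeated visits to one site would see one consistent profile, which is the property the table describes.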

package/lib/ghost-cursor.js CHANGED

@@ -15,6 +15,11 @@
 //   npm install ghost-cursor            (optional dependency)
 const { formatLogMessage, messageColors } = require('./colorize');
+// humanClick gives the coordinate-click path the same press realism as the
+// built-in content clicks (hover dwell + mousedown/hold/mouseup, optional
+// hand-tremor + mouseup drift) instead of a 0ms page.mouse.click. One-way
+// require — interaction.js does not depend on ghost-cursor, so no cycle.
+const { humanClick } = require('./interaction');
 const GHOST_CURSOR_TAG = messageColors.processing('[ghost-cursor]');
 let ghostCursorModule = null;
@@ -56,7 +61,7 @@ function createGhostCursor(page, options = {}) {
     const cursor = ghostCursorModule.createCursor(page, { x: startX, y: startY });
     if (forceDebug) {
-      console.log(formatLogMessage('debug', '[ghost-cursor] Cursor instance created'));
+      console.log(formatLogMessage('debug', `${GHOST_CURSOR_TAG} Cursor instance created`));
     }
     return cursor;
@@ -98,7 +103,7 @@ async function ghostMove(cursor, toX, toY, options = {}) {
     const moveOpts = {};
     if (moveSpeed !== undefined) moveOpts.moveSpeed = moveSpeed;
     if (moveDelay > 0) moveOpts.moveDelay = moveDelay;
-    if (randomizeMoveDelay !== undefined) moveOpts.randomizeMoveDelay = randomizeMoveDelay;
+    moveOpts.randomizeMoveDelay = randomizeMoveDelay; // always defined (defaults to true)
     if (overshootThreshold !== undefined) moveOpts.overshootThreshold = overshootThreshold;
     await cursor.moveTo({ x: toX, y: toY }, moveOpts);
@@ -126,6 +131,8 @@ async function ghostMove(cursor, toX, toY, options = {}) {
  * @param {number} options.waitForClick - Delay (ms) between mousedown/mouseup (default: auto)
  * @param {number} options.moveDelay - Delay (ms) after moving to target
  * @param {number} options.paddingPercentage - Click point within element (0=edge, 100=center)
+ * @param {import('puppeteer').Page} options.page - Page for coordinate clicks (falls back to cursor.page)
+ * @param {boolean} options.realistic - Coordinate clicks: emit hand-tremor + mouseup drift (default: false)
  * @param {boolean} options.forceDebug - Enable debug logging
  * @returns {Promise<boolean>} true if click succeeded
  */
@@ -137,6 +144,8 @@ async function ghostClick(cursor, target, options = {}) {
     waitForClick,
     moveDelay,
     paddingPercentage,
+    page,
+    realistic = false,
     forceDebug
   } = options;
@@ -149,16 +158,25 @@ async function ghostClick(cursor, target, options = {}) {
     if (typeof target === 'string') {
       await cursor.click(target, clickOpts);
     } else {
-      // For coordinate clicks, move first then use page click
+      // Coordinate click: ghost-cursor's bezier moveTo brings the cursor to the
+      // point, then humanClick does the realistic press (hover dwell, mousedown
+      // → hold → mouseup, plus hand-tremor + down≠up drift when realistic). This
+      // replaces a 0ms page.mouse.click, so the ghost path gets the same click
+      // realism as built-in content clicks.
       await cursor.moveTo(target);
-      // Small hesitation before clicking
-      if (hesitate > 0) {
-        await new Promise(resolve => setTimeout(resolve, hesitate));
-      }
-      const page = cursor._page || cursor.page;
-      if (page && typeof page.mouse?.click === 'function') {
-        await page.mouse.click(target.x, target.y);
+      // Prefer the caller-supplied page; fall back to the cursor's own page
+      // (ghost-cursor exposes it as cursor.page) so we don't depend on internals.
+      // Return false (not silent success) if there's no usable page — otherwise
+      // the "Clicked" log + return true below would lie about a click that
+      // never fired.
+      const clickPage = page || cursor.page;
+      if (!clickPage || typeof clickPage.mouse?.down !== 'function') {
+        if (forceDebug) {
+          console.log(formatLogMessage('debug', `${GHOST_CURSOR_TAG} Coordinate click skipped: no usable page`));
+        }
+        return false;
       }
+      await humanClick(clickPage, target.x, target.y, { realistic, forceDebug });
     }
     if (forceDebug) {
@@ -189,7 +207,7 @@ async function ghostRandomMove(cursor, options = {}) {
   try {
     await cursor.randomMove();
     if (options.forceDebug) {
-      console.log(formatLogMessage('debug', '[ghost-cursor] Random movement performed'));
+      console.log(formatLogMessage('debug', `${GHOST_CURSOR_TAG} Random movement performed`));
     }
     return true;
   } catch (err) {

package/lib/interaction.js CHANGED

@@ -1333,5 +1333,9 @@ module.exports = {
   simulateScrolling,
   interactWithElements,
   performContentClicks,
+  // Realistic timed click (hover dwell + mousedown/hold/mouseup, optional
+  // hand-tremor + mouseup drift). Reused by lib/ghost-cursor.js so the ghost
+  // coordinate click gets the same press realism as built-in content clicks.
+  humanClick,
   generateRandomCoordinates
 };

package/lib/nettools.js CHANGED

@@ -124,7 +124,6 @@ function loadDiskCache(filePath, cache, ttl, maxSize) {
     // Surface the event so the user knows they lost their warm cache;
     // previously this was a silent reset, which made "why did my dns
     // cache stop helping?" hard to diagnose.
-    // eslint-disable-next-line no-console
     console.warn(`${messageColors.highlight('[dns-cache]')} ${path.basename(filePath)} was unreadable (${err.message}); starting fresh`);
     try { fs.unlinkSync(filePath); } catch {}
   }
@@ -256,6 +255,38 @@ function getDnsCacheStats() {
 // Disk cache is opt-in via --dns-cache flag
 let diskCacheEnabled = false;
+// Optional dig resolver(s), set from --dns. When non-empty, dig queries
+// `@<one of these>` (round-robin) instead of the system resolver — so dig uses
+// the same reliable servers as the pre-check rather than a flaky /etc/resolv.conf
+// (the cause of `dig: Command timeout` drops on Cloudflare-fronted ad domains).
+let digResolvers = [];
+let digResolverCursor = 0;
+// dig's `@server` wants a bare IP; strip any `ipv4:port` / `[ipv6]:port` form.
+function digServerFromSpec(spec) {
+  const s = String(spec);
+  const br = s.match(/^\[([0-9a-fA-F:]+)\]/);
+  if (br) return br[1];
+  const v4p = s.match(/^(\d{1,3}(?:\.\d{1,3}){3}):\d+$/);
+  if (v4p) return v4p[1];
+  return s;
+}
+function setDigResolvers(servers) {
+  digResolvers = (Array.isArray(servers) ? servers : []).filter(Boolean).map(digServerFromSpec);
+}
+// Ordered `@server` attempt list for ONE dig lookup: starts at the round-robin
+// cursor (advanced once per lookup, preserving the old fairness) then falls
+// through the remaining resolvers as failover. Returns [null] when no --dns
+// resolvers are configured — a single attempt via the system resolver.
+function digServerAttemptList() {
+  if (digResolvers.length === 0) return [null];
+  const start = digResolverCursor++ % digResolvers.length;
+  const list = [];
+  for (let i = 0; i < digResolvers.length; i++) {
+    list.push('@' + digResolvers[(start + i) % digResolvers.length]);
+  }
+  return list;
+}
 /**
  * Enable persistent disk caching for dig/whois results.
  * Call this when --dns-cache flag is set. Idempotent — repeated calls
@@ -293,7 +324,6 @@ function enableDiskCache() {
   // Debug log only if anything was actually warmed; silent on fresh
   // installs / empty disk caches.
   if (digWarm > 0 || whoisWarm > 0) {
-    // eslint-disable-next-line no-console
     console.log(`${messageColors.highlight('[dns-cache]')} Warmed resolved-hostnames index from disk: ${digWarm} dig + ${whoisWarm} whois entries`);
   }
@@ -994,50 +1024,103 @@ async function whoisLookupWithRetry(domain = '', timeout = 10000, whoisServer =
  * @returns {Promise<Object>} Object with success status and output/error
  */
 async function digLookup(domain = '', recordType = 'A', timeout = 5000) {
-  try {
-    // Clean domain
-    const cleanDomain = domain.replace(/^https?:\/\//, '').replace(/\/.*$/, '').replace(/:\d+$/, '');
+  // Clean domain (defensive — callers usually pass an already-clean digDomain).
+  const cleanDomain = domain.replace(/^https?:\/\//, '').replace(/\/.*$/, '').replace(/:\d+$/, '');
+  // dig argv-injection guard. dig parses tokens with a leading @, - or + as options
+  // (`@host` redirects the query to an arbitrary server, `-f path` reads a
+  // file as a query batch) and has no `--` end-of-options marker like whois.
+  // Reject anything not hostname-shaped before shelling out — success:false so
+  // it's treated as no-match and not cached. (Charset blocks @ + / space etc;
+  // the leading-`-` check blocks `-f` and friends, since `-` is valid mid-host.)
+  if (!cleanDomain || /[^a-zA-Z0-9._-]/.test(cleanDomain) || cleanDomain.startsWith('-')) {
+    return { success: false, error: `invalid domain shape: ${cleanDomain}`, domain: cleanDomain, recordType };
+  }
+  // Resolver failover: try the round-robin resolver first, then fall through
+  // the remaining --dns resolvers on timeout / no-reply / REFUSED / SERVFAIL —
+  // the same resilience the whois path already has via whoisLookupWithRetry,
+  // and the DNS pre-check has via its rotation. Capped at 3 attempts (matches
+  // whois maxRetries default) so a host that's dead on every resolver can't
+  // burn the whole nettools budget.
+  const attempts = digServerAttemptList();
+  // Only do JS-level failover when --dns gave us pinned resolvers. Without it,
+  // attempts is [null]: a SINGLE system-resolver invocation that keeps dig's
+  // native resolv.conf rotation + retries (forcing +tries=1 there would strip
+  // that built-in resilience — the whole point is to be MORE resilient).
+  const usingResolvers = attempts[0] !== null;
+  const maxAttempts = usingResolvers ? Math.min(3, attempts.length) : 1;
+  // Pinned-resolver attempts use +time=2 +tries=1 (the JS loop owns failover)
+  // under a 4s SIGTERM ceiling. The system-resolver path keeps the full budget
+  // and dig's own retry behaviour, matching the pre-failover semantics exactly.
+  const perAttemptTimeout = usingResolvers ? Math.min(timeout, 4000) : timeout;
+  let lastError = 'no resolver attempts made';
+  for (let i = 0; i < maxAttempts; i++) {
+    const digServerArg = attempts[i];
+    // With a pinned resolver: one fast try (+time=2 +tries=1), then the JS loop
+    // moves to the next resolver. Without --dns: bare `dig name type` so dig
+    // applies its native resolv.conf rotation. execFile (no shell) => args
+    // can't be injected.
+    const digArgs = digServerArg
+      ? [digServerArg, '+time=2', '+tries=1', cleanDomain, recordType]
+      : [cleanDomain, recordType];
+    const resolverLabel = digServerArg ? digServerArg.slice(1) : 'system resolver';
+    try {
+      const { stdout: fullOutput } = await execFileWithTimeout('dig', digArgs, perAttemptTimeout);
+      // Judge success by RCODE, not by stderr. dig exits 0 for ANY server
+      // response, so a non-zero exit (timeout / no-reply) has already thrown to the catch below.
+      // REFUSED/SERVFAIL are resolver-SIDE failures another resolver may not
+      // share — fail over instead of accepting an answerless response (the
+      // EREFUSED-storm case). NOERROR/NXDOMAIN are definitive => accept.
+      const statusMatch = fullOutput.match(/status:\s*([A-Z]+)/i);
+      const rcode = statusMatch ? statusMatch[1].toUpperCase() : 'NOERROR';
+      if (rcode === 'REFUSED' || rcode === 'SERVFAIL') {
+        lastError = `dig ${rcode} from ${resolverLabel}`;
+        continue; // try next resolver in the failover list
+      }
+      // Non-empty stderr is intentionally NOT treated as failure here: dig
+      // prints `;; communications error ... timed out` warnings to stderr while
+      // still returning a valid ANSWER SECTION and exit 0. The old code failed
+      // the whole lookup on any stderr, discarding good answers — the exact
+      // missed-match pattern under flaky resolvers.
+      const answerMatch = fullOutput.match(/;; ANSWER SECTION:\n([\s\S]*?)(?:\n;;|\n*$)/);
+      let shortOutput = '';
+      if (answerMatch) {
+        shortOutput = answerMatch[1]
+          .split('\n')
+          .map(line => line.split(/\s+/).pop())
+          .filter(Boolean)
+          .join('\n');
+      }
-    // Single dig command — full output contains everything including short
-    // answers. execFile (no shell) so cleanDomain / recordType can contain
-    // any chars without injection risk.
-    const { stdout: fullOutput, stderr } = await execFileWithTimeout('dig', [cleanDomain, recordType], timeout);
-    if (stderr && stderr.trim()) {
       return {
-        success: false,
-        error: stderr.trim(),
+        success: true,
+        output: fullOutput,
+        shortOutput,
         domain: cleanDomain,
-        recordType
+        recordType,
+        resolver: resolverLabel
       };
+    } catch (error) {
+      // Timeout or non-zero exit (e.g. dig exit 9 = no reply from this server).
+      // Record and fall through to the next resolver.
+      lastError = error.message;
     }
-    // Extract short output from ANSWER SECTION of full dig output
-    const answerMatch = fullOutput.match(/;; ANSWER SECTION:\n([\s\S]*?)(?:\n;;|\n*$)/);
-    let shortOutput = '';
-    if (answerMatch) {
-      shortOutput = answerMatch[1]
-        .split('\n')
-        .map(line => line.split(/\s+/).pop())
-        .filter(Boolean)
-        .join('\n');
-    }
-    return {
-      success: true,
-      output: fullOutput,
-      shortOutput,
-      domain: cleanDomain,
-      recordType
-    };
-  } catch (error) {
-    return {
-      success: false,
-      error: error.message,
-      domain: domain,
-      recordType
-    };
   }
+  // Every attempt timed out / was refused. success:false so the handler does
+  // NOT cache it (transient — caching would poison the domain for the TTL).
+  return {
+    success: false,
+    error: lastError,
+    domain: cleanDomain,
+    recordType
+  };
 }
 /**
@@ -1170,15 +1253,20 @@ function createNetToolsHandler(config) {
     // Determine which domain will be used for dig lookup
     const digDomain = digSubdomain && originalDomain ? originalDomain : domain;
-    // For whois: use root domain only (whois data is consistent for entire domain)
-    const whoisRootDomain = getRootDomain ? getRootDomain(`http://${domain}`) : domain;
+    // For whois: use root domain only (whois data is consistent for entire
+    // domain). Only compute it when whois is actually configured — getRootDomain
+    // does a domain parse, so on a dig-only config (no whois/whois-or) this skips
+    // a parse + string build on every single request. whoisRootDomain is only
+    // ever read inside the whois branch, so the `domain` fallback is never used.
+    const wantWhois = hasWhois || hasWhoisOr;
+    const whoisRootDomain = wantWhois ? (getRootDomain ? getRootDomain(`http://${domain}`) : domain) : domain;
     // Check if we need to perform any lookups with appropriate deduplication
     // Whois: root domain + config (whois data same for sub.example.com and example.com)
-    const whoisDedupeKey = `${whoisRootDomain}:${whoisConfigKey}`;
+    const whoisDedupeKey = wantWhois ? `${whoisRootDomain}:${whoisConfigKey}` : '';
     // Dig: specific subdomain + config (DNS records can differ between subdomains)
     const digDedupeKey = `${digDomain}:${digConfigKey}`;
-    const needsWhoisLookup = (hasWhois || hasWhoisOr) && !processedWhoisDomains.has(whoisDedupeKey);
+    const needsWhoisLookup = wantWhois && !processedWhoisDomains.has(whoisDedupeKey);
     const needsDigLookup = (hasDig || hasDigOr) && !processedDigDomains.has(digDedupeKey);
     // Claim the dedupe keys NOW, synchronously, before executeNetToolsLookup
@@ -1606,11 +1694,20 @@ function createNetToolsHandler(config) {
               // backwards-compat additive: old code reading new cache
               // ignores it; new code reading old cache (no field) falls
               // back to lazy on-hit population in the cache-hit branch.
-              globalDigResultCache.set(digCacheKey, {
-                result: digResult,
-                timestamp: now,
-                hostname: digDomain
-              });
+              //
+              // Only cache a SUCCESSFUL dig. A timeout/error (success:false) is
+              // transient — caching it would poison the domain for the full
+              // cache TTL (20h when persisted via --dns-cache), so a host that
+              // resolves fine on the next attempt keeps getting dropped. (An
+              // NXDOMAIN is success:true with NXDOMAIN in the body — a real
+              // answer — so it's correctly still cached.)
+              if (digResult.success) {
+                globalDigResultCache.set(digCacheKey, {
+                  result: digResult,
+                  timestamp: now,
+                  hostname: digDomain
+                });
+              }
               dnsCacheStats.digMisses++;
               pushFreshSample(dnsCacheStats.freshDig, `${digDomain} (${digRecordType})`);
               // Index hostname IF dig actually proved resolution -- NXDOMAIN
@@ -1662,7 +1759,7 @@ function createNetToolsHandler(config) {
                 if (hasDig) logToConsoleAndFile(`${messageColors.highlight('[dig-and]')} Terms checked: ${digTerms.join(' AND ')}, matched: ${digMatched}`);
                 if (hasDigOr) logToConsoleAndFile(`${messageColors.highlight('[dig-or]')} Terms checked: ${digOrTerms.join(' OR ')}, matched: ${digOrMatched}`);
               }
-              logToConsoleAndFile(`${messageColors.highlight('[dig]')} Lookup completed for ${digDomain}, dig-and: ${digMatched}, dig-or: ${digOrMatched}`);
+              logToConsoleAndFile(`${messageColors.highlight('[dig]')} Lookup completed for ${digDomain}${digResult.resolver ? ` via ${digResult.resolver}` : ''}, dig-and: ${digMatched}, dig-or: ${digOrMatched}`);
               if (siteConfig.verbose === 1) {
                 if (hasDig) logToConsoleAndFile(`${messageColors.highlight('[dig]')} AND terms: ${digTerms.join(', ')}`);
                 if (hasDigOr) logToConsoleAndFile(`${messageColors.highlight('[dig]')} OR terms: ${digOrTerms.join(', ')}`);
@@ -1813,6 +1910,12 @@ module.exports = {
   validateDigAvailability,
   enableDiskCache,
   getDnsCacheStats,
+  // Route dig through the --dns resolver(s) instead of the system resolver.
+  setDigResolvers,
+  // Generic disk-cache primitives (atomic write, TTL/size-bounded) — reused by
+  // nwss.js to persist the DNS pre-check negative cache under --dns-cache.
+  loadDiskCache,
+  saveDiskCache,
   // Resolved-hostnames index for the DNS pre-check optimization.
   // nwss.js's per-task pre-check consults this BEFORE calling resolve4
   // so hosts already proven live by dig or whois (within their 20h
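
The resolver rotation and the RCODE-based success check in the `digLookup` changes above can be exercised without running `dig`. The following is a minimal sketch that re-creates the ordering and parsing logic under hypothetical names (`createAttemptLister`, `classifyDigOutput`); the sample output string is hand-written for illustration, not captured from a real query:

```javascript
// Illustrative re-creation of the attempt ordering and output parsing.
// Function names are hypothetical; the regexes are copied from the diff.

// Returns a function producing the ordered `@server` list for one lookup:
// it starts at a round-robin cursor and wraps through the rest as failover.
function createAttemptLister(resolvers) {
  let cursor = 0;
  return function nextAttemptList() {
    if (resolvers.length === 0) return [null]; // system resolver, single attempt
    const start = cursor++ % resolvers.length;
    return resolvers.map((_, i) => '@' + resolvers[(start + i) % resolvers.length]);
  };
}

// Classifies dig output by RCODE and extracts the last column of each
// ANSWER SECTION row, the same way the diff builds shortOutput.
function classifyDigOutput(fullOutput) {
  const statusMatch = fullOutput.match(/status:\s*([A-Z]+)/i);
  const rcode = statusMatch ? statusMatch[1].toUpperCase() : 'NOERROR';
  const retryable = rcode === 'REFUSED' || rcode === 'SERVFAIL';
  const answerMatch = fullOutput.match(/;; ANSWER SECTION:\n([\s\S]*?)(?:\n;;|\n*$)/);
  const shortOutput = answerMatch
    ? answerMatch[1].split('\n').map(line => line.split(/\s+/).pop()).filter(Boolean).join('\n')
    : '';
  return { rcode, retryable, shortOutput };
}

// Hand-written sample resembling dig output (192.0.2.10 is a documentation IP).
const SAMPLE_OK = [
  ';; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4242',
  ';; ANSWER SECTION:',
  'example.com.\t300\tIN\tA\t192.0.2.10',
  '',
  ';; Query time: 12 msec'
].join('\n');

const nextAttemptList = createAttemptLister(['1.1.1.1', '8.8.8.8', '9.9.9.9']);
console.log(nextAttemptList()); // first lookup starts at 1.1.1.1
console.log(nextAttemptList()); // second lookup starts at 8.8.8.8
console.log(classifyDigOutput(SAMPLE_OK));
```

Each lookup starts one resolver further along, so load is spread evenly while every lookup can still fall through to the other resolvers. A `REFUSED` or `SERVFAIL` status is marked retryable, whereas `NOERROR` and `NXDOMAIN` are treated as definitive answers.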

package/lib/output.js CHANGED

@@ -133,32 +133,43 @@ function formatDomain(domain, options = {}) {
   if (!domain || domain.length <= 6 || !domain.includes('.')) {
     return null;
   }
-  // If plain is true, always return just the domain regardless of other options
+  // Path-prefix rules (from output_regex) are stored as "host/path/" — they
+  // contain a '/'. Only adblock can express a path; every domain-only format
+  // (dnsmasq/unbound/pihole/hosts/privoxy/plain) falls back to the bare host
+  // (everything before the first '/') so output stays valid in all formats.
+  const slash = domain.indexOf('/');
+  const isPathRule = slash !== -1;
+  const host = isPathRule ? domain.slice(0, slash) : domain;
+  // If plain is true, always return just the host regardless of other options
   if (plain) {
-    return domain;
+    return host;
   }
   // Apply specific format based on output mode
   if (pihole) {
     // Escape dots for regex and use Pi-hole format: (^|\.)domain\.com$
-    const escapedDomain = domain.replace(/\./g, '\\.');
+    const escapedDomain = host.replace(/\./g, '\\.');
     return `(^|\\.)${escapedDomain}$`;
   } else if (privoxy) {
-    return `{ +block } .${domain}`;
+    return `{ +block } .${host}`;
   } else if (dnsmasq) {
-    return `local=/${domain}/`;
+    return `local=/${host}/`;
   } else if (dnsmasqOld) {
-    return `server=/${domain}/`;
+    return `server=/${host}/`;
   } else if (unbound) {
-    return `local-zone: "${domain}." always_null`;
+    return `local-zone: "${host}." always_null`;
   } else if (localhostIP) {
-    return `${localhostIP} ${domain}`;
+    return `${localhostIP} ${host}`;
   } else if (adblockRules && resourceType) {
-    // Generate adblock filter rules with resource type modifiers
-    return `||${domain}^${resourceType}`;
+    // Adblock with resource-type modifier. A path rule self-anchors via its
+    // trailing '/', so it takes no '^' separator; a domain rule needs '^'.
+    return isPathRule ? `||${domain}${resourceType}` : `||${domain}^${resourceType}`;
   } else {
-    return `||${domain}^`;
+    // Default adblock: ||host^ for a domain, ||host/path/ for a path rule
+    // (the path already anchors, so no trailing '^').
+    return isPathRule ? `||${domain}` : `||${domain}^`;
   }
 }
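
To make the path-rule handling above concrete, here is a simplified stand-in that covers only four of the output modes. `formatRule` and its string `mode` argument are hypothetical; the real `formatDomain` takes an options object and supports more formats:

```javascript
// Simplified illustration of the host/path split; not the package's formatDomain.
function formatRule(rule, mode = 'adblock') {
  const slash = rule.indexOf('/');
  const isPathRule = slash !== -1;
  // Domain-only formats can only express the host part of "host/path/".
  const host = isPathRule ? rule.slice(0, slash) : rule;
  switch (mode) {
    case 'plain':
      return host;
    case 'dnsmasq':
      return `local=/${host}/`;
    case 'pihole':
      return `(^|\\.)${host.replace(/\./g, '\\.')}$`;
    default:
      // Adblock: a path rule is anchored by its trailing '/', a domain needs '^'.
      return isPathRule ? `||${rule}` : `||${rule}^`;
  }
}

console.log(formatRule('ads.example.com/banner/'));            // ||ads.example.com/banner/
console.log(formatRule('ads.example.com'));                    // ||ads.example.com^
console.log(formatRule('ads.example.com/banner/', 'dnsmasq')); // local=/ads.example.com/
```

The trade-off is that domain-only formats block the whole host when the original rule targeted a single path, which is broader than the adblock output.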

package/lib/proxy.js CHANGED

@@ -253,8 +253,12 @@ function getProxyArgs(siteConfig, forceDebug = false) {
     console.warn(formatLogMessage('proxy', `proxy_remote_dns ignored: SOCKS4 cannot do proxy-side DNS resolution (use SOCKS5)`));
   }
-  // Bypass list: domains that skip the proxy
-  const bypass = siteConfig.proxy_bypass || siteConfig.socks5_bypass || [];
+  // Bypass list: domains that skip the proxy. Accept either an array (the
+  // documented form) or a single string — a bare "localhost" used to throw
+  // `bypass.join is not a function` here, in the browser-launch path. Same
+  // string-or-array tolerance as the dig/whois siteConfig fields.
+  const rawBypass = siteConfig.proxy_bypass || siteConfig.socks5_bypass || [];
+  const bypass = Array.isArray(rawBypass) ? rawBypass : [rawBypass];
   if (bypass.length > 0) {
     args.push(`--proxy-bypass-list=${bypass.join(';')}`);
   }
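
A small sketch of the string-or-array tolerance described in the comment above, isolated into a hypothetical helper (`buildBypassArg` is not a function in the package) so the three input shapes can be compared side by side:

```javascript
// Hypothetical helper isolating the bypass-list normalization from getProxyArgs.
function buildBypassArg(siteConfig) {
  const rawBypass = siteConfig.proxy_bypass || siteConfig.socks5_bypass || [];
  // A bare string becomes a one-element array so .join() is always available.
  const bypass = Array.isArray(rawBypass) ? rawBypass : [rawBypass];
  return bypass.length > 0 ? `--proxy-bypass-list=${bypass.join(';')}` : null;
}

console.log(buildBypassArg({ proxy_bypass: 'localhost' }));
console.log(buildBypassArg({ proxy_bypass: ['localhost', '127.0.0.1'] }));
console.log(buildBypassArg({})); // null: no flag is added
```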

package/lib/redirect.js CHANGED

@@ -19,7 +19,10 @@ async function navigateWithRedirectHandling(page, currentUrl, siteConfig, gotoOp
   let httpStatus = null;
   let cfRay = null;
   const jsRedirectTimeout = siteConfig.js_redirect_timeout || 5000; // Wait 5s for JS redirects
-  const maxRedirects = siteConfig.max_redirects || 10;
+  // Use a number check rather than `||`, so max_redirects: 0 (follow none) isn't
+  // swallowed as falsy and bumped to 10. Absent, negative, or non-number values default to 10.
+  const maxRedirects = (typeof siteConfig.max_redirects === 'number' && siteConfig.max_redirects >= 0)
+    ? siteConfig.max_redirects : 10;
   const detectJSPatterns = siteConfig.detect_js_patterns !== false; // Default to true
   // Monitor frame navigations to detect redirects
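
The difference between the old `||` default and the new number check is easiest to see with a few inputs. This sketch wraps both in hypothetical helpers (`legacyMaxRedirects`, `resolveMaxRedirects`) purely for comparison:

```javascript
// Old behaviour: any falsy value, including 0, falls through to 10.
function legacyMaxRedirects(siteConfig) {
  return siteConfig.max_redirects || 10;
}

// New behaviour: only a non-negative number is accepted; everything else is 10.
function resolveMaxRedirects(siteConfig) {
  const value = siteConfig.max_redirects;
  return (typeof value === 'number' && value >= 0) ? value : 10;
}

for (const cfg of [{ max_redirects: 0 }, { max_redirects: 3 }, { max_redirects: -1 }, {}]) {
  console.log(JSON.stringify(cfg), legacyMaxRedirects(cfg), resolveMaxRedirects(cfg));
}
```

Note that `NaN` is also rejected by the new check, because `NaN >= 0` is false even though `typeof NaN` is `'number'`.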

package/lib/smart-cache.js CHANGED

@@ -93,10 +93,16 @@ class SmartCache {
       this._setupAutoSave();
     }
-    // Set up memory monitoring
+    // Set up memory monitoring. unref'd so this always-on housekeeping timer
+    // can never hold the event loop open past scan completion — destroy()
+    // clears it promptly on the normal path, but unref guarantees a clean
+    // exit on any path that skips destroy() (e.g. an unhandled throw before
+    // nwss reaches its cleanup). Matches the unref convention applied to
+    // every other Node-side timer in the codebase.
     this.memoryCheckInterval = setInterval(() => {
       this._checkMemoryPressure();
     }, this.options.memoryCheckInterval);
+    if (typeof this.memoryCheckInterval.unref === 'function') this.memoryCheckInterval.unref();
   }
   /**
@@ -1137,9 +1143,11 @@ class SmartCache {
    * @private
    */
   _setupAutoSave() {
+    // unref'd for the same reason as memoryCheckInterval — never block exit.
     this.autoSaveInterval = setInterval(() => {
       this.savePersistentCache();
     }, this.options.autoSaveInterval);
+    if (typeof this.autoSaveInterval.unref === 'function') this.autoSaveInterval.unref();
   }
   /**
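
The effect of `unref()` on the two SmartCache intervals can be checked in isolation. A minimal sketch, assuming Node.js 11 or later for `hasRef()`; `startHousekeeping` is a hypothetical helper, not part of the package:

```javascript
// Starts a repeating task that never keeps the process alive on its own.
function startHousekeeping(task, intervalMs) {
  const timer = setInterval(task, intervalMs);
  // Guarded the same way as in the diff, in case the timer object lacks unref.
  if (typeof timer.unref === 'function') timer.unref();
  return timer;
}

const timer = startHousekeeping(() => console.log('checking memory pressure'), 60000);
console.log('keeps event loop alive:', timer.hasRef()); // false after unref()
// With no other pending work, the process exits here without clearInterval().
```

An explicit `clearInterval()` in the normal cleanup path is still worthwhile; `unref()` only covers exits that skip that cleanup.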

package/lib/socks-relay.js CHANGED

@@ -227,13 +227,11 @@ function handleClient(client, upstream, forceDebug, relay) {
         upstreamSock = info.socket;
         // Safety net: if cleanup() ran while we were awaiting the upstream
-        // connect (some path other than the handshake watchdog — e.g. a
-        // 'close' event on the client during pause), settled is true and
-        // cleanup's settled guard would short-circuit a future call,
-        // orphaning this freshly-connected upstream socket. Destroy it
-        // here directly. With Fix #1a moving the watchdog clearTimeout to
-        // the 'connecting' transition this is currently unreachable, but
-        // cheap to keep as defense-in-depth against future code paths.
+        // connect, settled is true and cleanup's settled guard would
+        // short-circuit a future call, orphaning this freshly-connected
+        // upstream socket — so destroy it here directly. Reachable when the
+        // client emits 'error' or 'close' during the await (both wired to
+        // cleanup at handler setup), e.g. Chromium disconnects mid-connect.
         if (settled) {
           try { upstreamSock.destroy(); } catch (_) {}
           return;
@@ -250,8 +248,8 @@ function handleClient(client, upstream, forceDebug, relay) {
         try { upstreamSock.setKeepAlive(true, 60000); } catch (_) {}
         upstreamSock.on('error', cleanup);
         upstreamSock.on('close', cleanup);
-        client.on('error', cleanup);
-        client.on('close', cleanup);
+        // client 'error' and 'close' are wired once at handler setup (bottom
+        // of handleClient) and cover all phases — not re-attached here.
         // SOCKS5 success (BND.ADDR 0.0.0.0:0 — Chromium ignores it for CONNECT)
         client.write(Buffer.from([0x05, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0]));
@@ -273,6 +271,13 @@ function handleClient(client, upstream, forceDebug, relay) {
   client.on('data', onData);
   client.on('error', cleanup);
+  // Attach 'close' HERE (not after piping starts) so it covers the whole
+  // lifetime, including the up-to-20s upstream-connect await. A client that
+  // disconnects cleanly mid-connect now sets settled=true, letting the
+  // post-connect `if (settled)` net destroy the freshly-opened upstream
+  // socket instead of piping into a dead client; and a close mid-handshake
+  // clears the watchdog immediately rather than leaving it to fire later.
+  client.on('close', cleanup);
 }
 // SOCKS5 failure reply (valid only before piping starts).
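
The interaction between the early `'close'` listener and the post-connect `settled` check can be modelled with plain event emitters. This is a reduced sketch with hypothetical names (`attachLifecycle`, `onUpstreamConnected`) and a fake upstream socket; it is not the package's `handleClient`:

```javascript
const { EventEmitter } = require('events');

// Wires client 'error' and 'close' once, up front, to an idempotent cleanup.
function attachLifecycle(client) {
  const state = { settled: false };
  const cleanup = () => {
    if (state.settled) return; // guard: run teardown only once
    state.settled = true;
  };
  client.on('error', cleanup);
  client.on('close', cleanup);
  return state;
}

// Called when the (possibly slow) upstream connect finally resolves.
function onUpstreamConnected(state, upstreamSock) {
  if (state.settled) {
    // Client went away during the await: do not pipe, destroy the new socket.
    upstreamSock.destroy();
    return false;
  }
  return true; // safe to start piping
}

const client = new EventEmitter();
const state = attachLifecycle(client);
client.emit('close'); // client disconnects while upstream is still connecting

const fakeUpstream = { destroyed: false, destroy() { this.destroyed = true; } };
console.log('piping started:', onUpstreamConnected(state, fakeUpstream));
console.log('upstream destroyed:', fakeUpstream.destroyed);
```

If the `'close'` listener were attached only after piping began, `settled` would still be false in this scenario and the fresh upstream socket would be piped into a dead client.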