@fanboynz/network-scanner 3.0.1 → 3.0.2

@@ -162,7 +162,10 @@ jobs:
  uses: softprops/action-gh-release@v2
  with:
  tag_name: v${{ steps.version.outputs.version }}
- name: v${{ steps.version.outputs.version }}
+ # Date suffix matches the convention used by the backfilled
+ # historical releases (v2.0.10..v2.0.66) so the Releases page
+ # shows the release date inline without a click-through.
+ name: v${{ steps.version.outputs.version }} (${{ steps.version.outputs.date }})
  body: ${{ inputs.release_notes_source == 'changelog' && steps.changelog_notes.outputs.notes || steps.manual_notes.outputs.notes }}
  draft: false
  prerelease: ${{ inputs.prerelease }}
package/CHANGELOG.md CHANGED
@@ -2,6 +2,40 @@
 
  All notable changes to the Network Scanner (nwss.js) project.
 
+ ## [3.0.2] - 2026-05-25
+
+ ### Security
+ - **Credentials redacted in `lib/proxy.js` 'Invalid proxy URL' warn** — `getProxyArgs` echoed the raw user-configured `proxyUrl` when parseProxyUrl returned null. For a URL like `socks5://user:pass@host:port` that fails parse (mistyped protocol, port out of range, etc.) this emitted the full credentials to stderr. Regex-strips the `user:pass@` segment (handles both scheme-prefixed and bare host:port forms) before logging. Same redaction policy as `getProxyInfo()` and the socks-relay logs already fixed in 3.0.1. The new port-range validation in this release expanded the trigger surface (one more parse-failure path) which made me find the leak.
+ - **`applyProxyAuth` debug log redacted** — the `Auth set for USER@host:port` debug-only log line emitted the raw username. Now `[redacted]@host:port`. Same leak class as above, third site of the same kind.
+
+ ### Added
+ - **`scripts/test-stealth.js --format=json`** (already shipped in 3.0.1, listed here only because the harness gained a real consumer via the next item) — `getRelayStats()` exposed from `lib/socks-relay.js`, returning `[{key, port, activeConnections, errors}]` per active relay (`key` with the username segment stripped for safety, IPv6-aware). Diagnostic surface for answering "is the proxy slow because the upstream is saturated or because the scan is opening too many parallel tunnels?" without enabling `forceDebug`.
+ - **`delay_uncapped: true` site-config flag** — lifts the 2s post-networkidle delay cap; honors the configured `delay` up to half the per-URL timeout. Targets sites with setTimeout-deferred lazy ad/tracker loaders (weather.com / cbssports.com class) where late requests fire well past the standard window. Default behavior unchanged (still 2s) so fast sites stay fast.
+
+ ### Fixed
+ - **Race: late-completing dig/whois validations were orphaned.** Per-URL async nettools handlers were scheduled via fire-and-forget `setImmediate(() => netToolsHandler(...))`; if the handler's full async chain (dig spawn + match check + addMatchedDomain) resolved AFTER the result snapshot ran, the addMatchedDomain call landed in a Set that was no longer referenced by any in-flight result. Most visible symptom: domains appearing in the end-of-scan "Fresh dig:" list with no corresponding rule in the output. Now tracked via `trackNetToolsHandler` (closure over per-URL `pendingNetTools[]`) and drained via `drainPendingNetTools()` with a 3s hard cap (`TIMEOUTS.NETTOOLS_DRAIN_TIMEOUT`), called BEFORE `formatRules` at all three snapshot sites (dry-run, success, partial-success/catch path). All three setImmediate call sites (popup observer, main request handler, secondary request handler) migrated.
+ - **Race: scan-exit hang up to ~100s when a dig/whois lookup hung.** Four `setTimeout`s in `lib/nettools.js` (outer exec timer, overall 65s timer, whois progressive retry delay up to ~30s, whois server-switch delay ~8s) were not `unref`'d, so a genuinely-hung lookup that survived the new 3s drain could hold the Node event loop alive for the remainder. All four now `unref`'d with defensive `typeof timer.unref === 'function'` guards; the previously-unref'd inner SIGKILL tail-timer makes 5/5 setTimeout sites in the module now safe for scan-exit. Natural-completion paths still `clearTimeout` on resolution, so this only affects the hung-process case.
+ - **`parseProxyUrl` accepted ports > 65535.** Now rejects ports outside 1-65535 at parse time, surfacing misconfiguration immediately instead of passing an invalid value to Chromium and getting an opaque downstream error.
+ - **`@version 1.1.0` JSDoc** in `lib/proxy.js` was stale (const said `1.2.0`). Aligned to 1.2.0; the const + export then went away in the export trim — see Improved.
+ - **Site-config `delay` field was a no-op.** `nwss.js` per-URL handler hardcoded `const delayMs = DEFAULT_DELAY` regardless of `siteConfig.delay`. Now reads `siteConfig.delay || DEFAULT_DELAY`. Visible only with the new `delay_uncapped: true` flag (without it, the configured value is still capped at 2s as before).
+ - **"Something went wrong when opening your profile" popup in `--keep-open` headful mode.** `--disable-sync` was conditionally dropped when `--keep-open` was set, which let Chrome's sync subsystem initialise against our temp `userDataDir` (which has no real profile), error out, and pop a modal that blocked the page until dismissed. Three-flag fix: `--disable-sync` is now always-on (was the only one of five `--keep-open`-conditional flags actually causing user-visible breakage), plus `--allow-browser-signin=false` and `AccountConsistencyMirror,AccountConsistencyDice` appended to the existing `--disable-features=` list as defence in depth across Chromium's multiple account-subsystem entry points. The other four conditional-on-keep-open flags (`--disable-component-extensions-with-background-pages`, `--disable-component-update`, `--disable-background-networking`, `--disable-extensions`) stay conditional so user-loaded extensions and live inspection still work normally.
+ - **Race: `socks-relay.ensureRelay` concurrent-init created orphan servers.** Two concurrent callers for the same upstream both passed the `_relays.get(key)` check, both created `net.Server` listeners, both raced to `_relays.set` — second overwrote first, first server was orphaned (listening forever, never closed by `closeAllRelays`). Not triggered by current usage (proxy.js's `prepareSocksRelays` uses a sequential await loop) but a latent bug for future parallel-init paths. Fix: singleflight via new `_pendingRelays` Map; second caller for an in-flight upstream rides the existing promise. Cleanup uses `.finally()` on the returned promise (not try/finally inside the IIFE) so a hypothetical sync-throw in the init body can't leave a permanent rejected entry in `_pendingRelays`. Mirrors the `pendingDigLookups`/`pendingWhoisLookups` pattern in `lib/nettools.js`.
+ - **Race: handshake watchdog firing during upstream connect orphaned the upstream socket.** `HANDSHAKE_TIMEOUT_MS = 10000` vs `SocksClient.createConnection` timeout = `20000` left a 10-second window where the watchdog could fire mid-await, destroy the client, and set `settled = true`. When the upstream connect then resolved into a fresh socket, the subsequent `cleanup()` short-circuited via the settled guard, leaving an open TCP connection to the upstream that was never destroyed — held alive until OS-level timeout or remote close. Fix: disarm the watchdog at the `phase = 'connecting'` transition (client has completed its part of the handshake; `SocksClient`'s own 20s timeout covers the upstream connect), plus a defence-in-depth `if (settled) destroy + return` after `upstreamSock = info.socket` for any other path that could call cleanup before upstreamSock registers.
+ - **Race: `closeAllRelays` didn't wait for in-flight `ensureRelay` inits.** A relay whose `listen()` completed AFTER `closeAllRelays` snapshotted `_relays` landed in `_relays` unowned by the close pass — leaked until next call or process exit. Pre-existing, more visible after `_pendingRelays` became a separate Map for the singleflight. Fix: `await Promise.allSettled(Array.from(_pendingRelays.values()))` at the head of `closeAllRelays` so the snapshot is guaranteed-complete. `allSettled` (not `all`) because rejected inits have already cleaned up their `_pendingRelays` entries via `.finally()`.
+
+ ### Improved
+ - **socks-relay handshake buffer cap** (`MAX_HANDSHAKE_BYTES = 4096`) on pre-piping growth. Prior code absorbed arbitrary bytes for the full 10s handshake watchdog window, letting a hostile/buggy local process pin memory by drip-feeding garbage. Sends a protocol-appropriate failure reply per phase before closing.
+ - **socks-relay TCP keep-alive on upstream socket** (`setKeepAlive(true, 60000)`). Catches silently-dead upstreams (NAT timeout, mobile-tower drop, proxy crash without FIN/RST) in ~12 minutes (60s idle + kernel-default 9 × 75s probes) instead of the Linux default ~2 hours. Comment is honest about the kernel-default probe math — `60000` is `TCP_KEEPIDLE` only, not the full detection time.
+ - **socks-relay auth-misconfig warn** — `ensureRelay` warns once per unique upstream when `username && !password`, since RFC 1929 auth will almost certainly fail. Surfaces the misconfiguration at relay start instead of as opaque per-request failures inside `forceDebug`-gated logs.
+ - **socks-relay `server.maxConnections = 256` cap** per relay. Sheds excess Chromium connections at the TCP-accept layer (where HTTP retry handles them cleanly) instead of letting all N tunnels open to the upstream and have the provider silently drop past-quota ones — which looks to the scan like random missed requests.
+ - **socks-relay per-relay error counter** tracked in `relayEntry.errors`, bumped on `SocksClient.createConnection` failures, surfaced via `getRelayStats()` as the `errors` field. Lets a post-scan reader see "X of N upstream connects failed" without re-running with forceDebug.
+ - **socks-relay graceful drain on `closeAllRelays`** — `DRAIN_TIMEOUT_MS = 2000` window via `Promise.race(closePromise, drainTimeout)` for in-flight tunnels to flush their last response bytes into Chromium / Puppeteer. Stragglers past 2s get force-destroyed (server.close callback then fires immediately). SIGINT mid-scan no longer amputates in-flight responses, but a hung tunnel can't block exit beyond 2s. Drain timer `unref`'d so it doesn't hold the event loop open when the close-promise wins the race.
+ - **`lib/proxy.js` exports trimmed 12 → 8** — removed `getModuleInfo`, `PROXY_MODULE_VERSION`, `SUPPORTED_PROTOCOLS`, `getConfiguredProxy` (zero external callers in each case, grep-verified). Mirrors the same trim already done in `lib/cloudflare.js`. `SUPPORTED_PROTOCOLS` and `getConfiguredProxy` stay as module-local since they're used internally.
+ - **`lib/proxy.js` code cleanup** — two `require('./socks-relay')` calls consolidated into one destructured import (with `closeAllRelays` renamed inline), `net` module require hoisted from `testProxy()` body to top of file, `applyProxyAuth` JSDoc enumerates the 5 distinct `false` return scenarios (caller treating false as "auth failed" would incorrectly retry on the SOCKS5 → relay handles it case).
+
+ ### CI
+ - **GitHub Release names now include date suffix** (`v3.0.2 (YYYY-MM-DD)`), matching the convention used by the backfilled v2.0.10 through v2.0.66 releases. Auto-applied via the already-computed `steps.version.outputs.date` in `softprops/action-gh-release`.
+
  ## [3.0.1] - 2026-05-24
 
  ### Security
package/lib/nettools.js CHANGED
@@ -418,6 +418,12 @@ function execFileWithTimeout(cmd, args, timeout = 10000) {
 
  reject(new Error(`Command timeout after ${timeout}ms: ${cmd} ${args.join(' ')}`));
  }, timeout);
+ // unref the outer timeout too — a hung dig/whois firing AFTER the
+ // per-URL drain (3s cap) already returned would otherwise hold the
+ // event loop alive for up to `timeout` (5-10s) on scan exit. The exec
+ // callback / 'error' handler still clear it via the existing
+ // clearTimeout, so this only matters for the genuinely-hung case.
+ if (typeof timer.unref === 'function') timer.unref();
 
  // Handle child process errors
  child.on('error', (err) => {
@@ -790,7 +796,17 @@ async function whoisLookupWithRetry(domain = '', timeout = 10000, whoisServer =
  console.log(formatLogMessage('debug', `${messageColors.highlight('[whois-retry]')} Adding ${actualDelay}ms progressive delay before retry ${retryCount + 1} (base: ${baseDelay}ms + extra: ${extraDelay}ms)...`));
  }
  }
- await new Promise(resolve => setTimeout(resolve, actualDelay));
+ // unref the retry-delay timer so a pending backoff (up to ~30s on
+ // late attempts) can't hold the event loop alive past scan exit
+ // when the per-URL drain has already returned. If the process is
+ // still otherwise busy, the timer fires normally; if it's the only
+ // thing left, the process exits and the now-pointless retry result
+ // never lands. Same pattern as the execFile/overall timers
+ // unref'd in 83209d4.
+ await new Promise(resolve => {
+ const t = setTimeout(resolve, actualDelay);
+ if (typeof t.unref === 'function') t.unref();
+ });
  } else if (serverIndex > 0 && retryCount === 0 && whoisDelay > 0) {
  // Add delay before trying a new server (but not the very first server)
  if (debugMode) {
@@ -800,7 +816,11 @@ async function whoisLookupWithRetry(domain = '', timeout = 10000, whoisServer =
  console.log(formatLogMessage('debug', `${messageColors.highlight('[whois-retry]')} Adding ${whoisDelay}ms delay before trying new server...`));
  }
  }
- await new Promise(resolve => setTimeout(resolve, whoisDelay));
+ // Same unref rationale as the retry-delay timer above.
+ await new Promise(resolve => {
+ const t = setTimeout(resolve, whoisDelay);
+ if (typeof t.unref === 'function') t.unref();
+ });
  } else if (debugMode && whoisDelay === 0) {
  // Log when delay is skipped due to whoisDelay being 0
  if (logFunc) {
@@ -1232,6 +1252,12 @@ function createNetToolsHandler(config) {
  })(),
  new Promise((_, reject) => {
  overallTimeoutId = setTimeout(() => reject(new Error('NetTools overall timeout')), 65000);
+ // unref so a still-pending overall timeout (handler returned via
+ // drain at 3s but the lookup is technically still in-flight) can't
+ // hold the event loop alive for the full 65s on scan exit. The
+ // finally on the inner promise still clearTimeouts on natural
+ // completion, so this only matters for the genuinely-hung case.
+ if (typeof overallTimeoutId.unref === 'function') overallTimeoutId.unref();
  })
  ]).catch(err => {
  if (forceDebug) {
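Note on the `unref` pattern used in the four `lib/nettools.js` timers above: the following is a minimal sketch of the same idea in isolation, not code from the package. The helper names `unrefTimeout` and `unrefSleep` are hypothetical; `setTimeout`, `Timeout.unref()` and `Timeout.hasRef()` are standard Node.js APIs.

```javascript
// Hypothetical helpers (not part of lib/nettools.js) illustrating the pattern.

// Schedule a callback without letting the pending timer keep Node alive.
function unrefTimeout(fn, ms) {
  const t = setTimeout(fn, ms);
  // Same defensive guard as the diff: only Node-style Timeout objects have unref().
  if (typeof t.unref === 'function') t.unref();
  return t;
}

// Promise-based delay built on the helper above.
function unrefSleep(ms) {
  return new Promise((resolve) => unrefTimeout(resolve, ms));
}

// The timer still exists, but it no longer counts as a reason to stay alive.
const pending = unrefTimeout(() => console.log('runs only if something else keeps the process alive'), 60000);
console.log(pending.hasRef()); // false
```

If the event loop has other work, an unref'd timer fires as usual. If it is the only thing left, the process exits and the callback never runs, which is the behaviour the diff comments describe for a hung lookup at scan exit.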
package/lib/proxy.js CHANGED
@@ -61,13 +61,19 @@
  * // After page creation, before page.goto()
  * await applyProxyAuth(page, siteConfig, forceDebug);
  *
- * @version 1.1.0
+ * @version 1.2.0
  */
 
+ const net = require('net');
  const { formatLogMessage } = require('./colorize');
- const { ensureRelay, getRelayPort } = require('./socks-relay');
+ const { ensureRelay, getRelayPort, closeAllRelays: closeAllSocksRelays } = require('./socks-relay');
+
+ // Note: no separate subsystem TAG here — formatLogMessage('proxy', ...)
+ // already emits the `[proxy]` prefix from the severity. socks-relay.js's
+ // pattern (`[proxy] [socks-relay] ...`) is correct THERE because its
+ // module name differs from the severity. For this file the module IS the
+ // severity, so a second '[proxy]' would be redundant double-prefix.
 
- const PROXY_MODULE_VERSION = '1.2.0';
  const SUPPORTED_PROTOCOLS = ['socks5', 'socks4', 'http', 'https'];
 
  const DEFAULT_PORTS = {
@@ -115,6 +121,10 @@ function parseProxyUrl(proxyUrl) {
  if (!host) return null;
 
  const port = parseInt(url.port, 10) || DEFAULT_PORTS[protocol] || 1080;
+ // Reject obvious typos at parse time rather than passing a >65535 port
+ // through to Chromium and getting an opaque downstream error. Port 0
+ // is technically OS-assigned but never a valid proxy target.
+ if (port < 1 || port > 65535) return null;
  // decodeURIComponent throws URIError on a literal '%' that isn't a valid
  // escape (e.g. a password containing '%'). Fall back to the raw value so
  // an otherwise-valid proxy isn't rejected as "Invalid proxy URL".
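Note on the port check above: a minimal sketch of how the fallback chain and the new range check interact, under stated assumptions. `resolvePort` and `EXAMPLE_DEFAULT_PORTS` are hypothetical names, and the default values are illustrative because the real `DEFAULT_PORTS` table is not shown in this diff.

```javascript
// Illustrative defaults only; the package's DEFAULT_PORTS values are not visible here.
const EXAMPLE_DEFAULT_PORTS = { socks5: 1080, socks4: 1080, http: 8080, https: 443 };

// Returns a usable port number, or null when the value is out of range.
function resolvePort(rawPort, protocol) {
  // Same expression shape as the diff: parsed value, then protocol default, then 1080.
  const port = parseInt(rawPort, 10) || EXAMPLE_DEFAULT_PORTS[protocol] || 1080;
  if (port < 1 || port > 65535) return null;
  return port;
}

console.log(resolvePort('1080', 'socks5'));  // 1080
console.log(resolvePort('70000', 'socks5')); // null
console.log(resolvePort('', 'http'));        // 8080 (example default)
console.log(resolvePort('0', 'socks5'));     // 1080
```

One consequence of the `||` chain in this sketch: a raw value of `'0'` parses to `0`, which is falsy, so the default is substituted before the range check runs. Whether the package's `url.port` can ever carry `'0'` depends on parsing code outside this hunk.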
@@ -186,7 +196,19 @@ function getProxyArgs(siteConfig, forceDebug = false) {
 
  const parsed = parseProxyUrl(proxyUrl);
  if (!parsed) {
- console.warn(formatLogMessage('proxy', `Invalid proxy URL: ${proxyUrl}`));
+ // Strip user:pass before echoing the URL — same redaction policy as
+ // getProxyInfo() / applyProxyAuth / socks-relay logs. Without this, a
+ // proxy URL with embedded creds (`socks5://user:pass@host:port`) that
+ // fails parse (typo in protocol, port out of range, etc.) leaks the
+ // raw creds to stderr. Regex handles both scheme-prefixed
+ // (`socks5://user:pass@`) and bare (`user:pass@`) forms — the latter
+ // because parseProxyUrl normalises bare host:port internally so the
+ // user-supplied string still reaches here unchanged.
+ const safeUrl = String(proxyUrl).replace(
+ /^([a-z0-9+]+:\/\/)?[^@\s]+@/i,
+ (_m, scheme) => `${scheme || ''}[redacted]@`
+ );
+ console.warn(formatLogMessage('proxy', `Invalid proxy URL: ${safeUrl}`));
  return [];
  }
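Note on the redaction regex above: the sketch below wraps the same regular expression in a hypothetical helper, `redactProxyUrl` (not a function in `lib/proxy.js`), to show what it does to a few sample inputs.

```javascript
// Same regular expression as the diff, wrapped in a hypothetical helper.
function redactProxyUrl(proxyUrl) {
  return String(proxyUrl).replace(
    /^([a-z0-9+]+:\/\/)?[^@\s]+@/i,
    (_m, scheme) => `${scheme || ''}[redacted]@`
  );
}

console.log(redactProxyUrl('socks5://user:pass@host:99999')); // socks5://[redacted]@host:99999
console.log(redactProxyUrl('user:pass@host:1080'));           // [redacted]@host:1080
console.log(redactProxyUrl('socks5://host:1080'));            // socks5://host:1080
```

Because `[^@\s]+` stops at the first `@`, a password that itself contains an unescaped `@` would be redacted only up to that character: `socks5://user:p@ss@host:1080` becomes `socks5://[redacted]@ss@host:1080`.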
 
@@ -249,10 +271,20 @@ function getProxyArgs(siteConfig, forceDebug = false) {
  * Applies proxy authentication to a page via Puppeteer's authenticate API.
  * Must be called BEFORE page.goto().
  *
+ * Returns `true` only on a successful HTTP/HTTPS page.authenticate() call.
+ * Returns `false` in five distinct scenarios — callers cannot use the
+ * boolean to distinguish them; treat `false` as "no further action needed
+ * from this module" rather than "auth failed":
+ * - no proxy configured
+ * - proxy has no username (anonymous)
+ * - SOCKS5 with creds -> the local relay handles upstream auth out-of-band
+ * - SOCKS4 with creds -> genuinely unsupported (warned)
+ * - page.authenticate() threw (warned)
+ *
  * @param {object} page - Puppeteer page instance
  * @param {object} siteConfig
  * @param {boolean} forceDebug
- * @returns {Promise<boolean>} True if auth was applied
+ * @returns {Promise<boolean>}
  */
  async function applyProxyAuth(page, siteConfig, forceDebug = false) {
  const proxyUrl = getConfiguredProxy(siteConfig);
@@ -283,7 +315,11 @@ async function applyProxyAuth(page, siteConfig, forceDebug = false) {
 
  const debug = forceDebug || siteConfig.proxy_debug || siteConfig.socks5_debug;
  if (debug) {
- console.log(formatLogMessage('proxy', `Auth set for ${parsed.username}@${parsed.host}:${parsed.port}`));
+ // Redact the username — same policy as getProxyInfo() and the
+ // socks-relay logs. debug output gets pasted into support tickets /
+ // screenshots / gists; '[redacted]' keeps the "yes, creds were
+ // attached" signal without disclosing what they were.
+ console.log(formatLogMessage('proxy', `Auth set for [redacted]@${parsed.host}:${parsed.port}`));
  }
 
  return true;
@@ -311,7 +347,6 @@ async function testProxy(siteConfig, timeoutMs = 5000) {
  return { reachable: false, latencyMs: 0, error: 'Invalid proxy URL' };
  }
 
- const net = require('net');
  const start = Date.now();
 
  return new Promise((resolve) => {
@@ -358,15 +393,11 @@ function getProxyInfo(siteConfig) {
  return `${parsed.protocol}://${auth}${parsed.host}:${parsed.port}`;
  }
 
- /**
- * Returns module version information
- */
- function getModuleInfo() {
- return { version: PROXY_MODULE_VERSION, name: 'Proxy Handler' };
- }
-
- // Re-export relay teardown so nwss.js cleanup paths can close listeners.
- const { closeAllRelays: closeAllSocksRelays } = require('./socks-relay');
+ // getModuleInfo() / PROXY_MODULE_VERSION / SUPPORTED_PROTOCOLS / and now
+ // getConfiguredProxy removed from exports -- zero external callers (mirrors
+ // the same trim done in lib/cloudflare.js). SUPPORTED_PROTOCOLS and
+ // getConfiguredProxy stay as module-local since parseProxyUrl /
+ // needsProxy / prepareSocksRelays / getProxyArgs use them.
 
  module.exports = {
  parseProxyUrl,
@@ -376,9 +407,5 @@ module.exports = {
  getProxyArgs,
  applyProxyAuth,
  testProxy,
- getProxyInfo,
- getModuleInfo,
- getConfiguredProxy,
- PROXY_MODULE_VERSION,
- SUPPORTED_PROTOCOLS
+ getProxyInfo
  };
package/lib/socks-relay.js CHANGED
@@ -22,9 +22,24 @@ const { SocksClient } = require('socks');
  const { formatLogMessage, messageColors } = require('./colorize');
  const SOCKS_RELAY_TAG = messageColors.processing('[socks-relay]');
 
- // upstreamKey -> { server, port, activeSockets:Set<net.Socket> }
+ // upstreamKey -> {
+ // server: net.Server, // listening on 127.0.0.1:port
+ // port: number, // OS-assigned local port
+ // activeSockets: Set<net.Socket>, // live client sockets (Chromium side)
+ // errors: number // cumulative upstream-connect failures
+ // }
  const _relays = new Map();
 
+ // upstreamKey -> Promise<port> currently initialising. Singleflight guard for
+ // ensureRelay so two concurrent callers for the same upstream share one
+ // in-flight init instead of both creating servers and racing to _relays.set,
+ // where the loser's server would be orphaned (listening forever, never
+ // closed by closeAllRelays). Not triggered by current usage (proxy.js's
+ // prepareSocksRelays uses a sequential await loop) but cheap defence
+ // against future callers that don't know to serialise. Mirrors the
+ // pendingDigLookups / pendingWhoisLookups pattern in lib/nettools.js.
+ const _pendingRelays = new Map();
+
  function upstreamKey(u) {
  return `${u.host}:${u.port}:${u.username || ''}`;
  }
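Note on the singleflight guard described above: a minimal, self-contained sketch of the pattern, assuming nothing about the package beyond what the comment states. All names here (`ensureOnce`, `pending`, `ready`, `slowInit`) are hypothetical and this is not the package's `ensureRelay`.

```javascript
// Hypothetical names throughout; this is not the package's ensureRelay.
const pending = new Map(); // key -> Promise for an init that is still running
const ready = new Map();   // key -> finished value

function ensureOnce(key, init) {
  if (ready.has(key)) return Promise.resolve(ready.get(key));
  // A second caller for the same key rides the in-flight promise.
  if (pending.has(key)) return pending.get(key);

  const p = (async () => {
    const value = await init(key);
    ready.set(key, value);
    return value;
  })().finally(() => {
    // Runs for success and failure alike, and always after pending.set below,
    // because promise callbacks are deferred to a microtask.
    pending.delete(key);
  });

  pending.set(key, p);
  return p;
}

// Two overlapping callers trigger exactly one init.
let inits = 0;
const slowInit = async (key) => { inits++; return `${key}-port`; };
const a = ensureOnce('upstream-1', slowInit);
const b = ensureOnce('upstream-1', slowInit);
console.log(a === b, inits); // true 1
```

Attaching the cleanup with `.finally()` on the returned promise, rather than a `try/finally` inside the async function, means the `delete` can never run before the `set`, even if the init body throws before its first `await`.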
@@ -39,7 +54,28 @@ function upstreamKey(u) {
  // notices (default ~2 hours on Linux).
  const HANDSHAKE_TIMEOUT_MS = 10000;
 
- function handleClient(client, upstream, forceDebug) {
+ // Cap pre-piping buffer growth. A real SOCKS5 greeting+request is well
+ // under 300 bytes; absorbing more before the watchdog fires lets a hostile
+ // or buggy local process drip-feed garbage to pin memory for up to 10s.
+ // 4096 is the next clean ceiling above any realistic handshake and matches
+ // typical TCP receive-buffer batches.
+ const MAX_HANDSHAKE_BYTES = 4096;
+
+ // Cap simultaneous local connections per relay. If Chromium opens more than
+ // this (prefetch-heavy site, fetch-retry loop), excess gets refused at the
+ // TCP-accept layer, which Chromium's HTTP retry logic handles cleanly. The
+ // alternative (no cap) is excess tunnels opening to the upstream and the
+ // provider silently dropping them past its concurrent-tunnel quota — looks
+ // to the scan like random missed requests.
+ const MAX_LOCAL_CONNECTIONS = 256;
+
+ // On closeAllRelays, give in-flight tunnels this long to drain their
+ // response data into Chromium before force-destroying. Without it, SIGINT
+ // mid-scan loses any upstream bytes that hadn't yet hit Puppeteer's
+ // response listener, leaving incomplete entries in results.json.
+ const DRAIN_TIMEOUT_MS = 2000;
+
+ function handleClient(client, upstream, forceDebug, relay) {
  let phase = 'greeting';
  let buf = Buffer.alloc(0);
  let upstreamSock = null;
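Note on `DRAIN_TIMEOUT_MS`: the `closeAllRelays` body is not part of the visible hunks, so the following is only a sketch of the drain-then-force idea the changelog describes (a close promise raced against an unref'd timer). `closeWithDrain` is a hypothetical helper; `server.close()`, `socket.destroy()` and `Promise.race()` are standard APIs.

```javascript
const DRAIN_TIMEOUT_MS = 2000; // same value as the constant in the diff

// Hypothetical helper: close one net.Server, giving open sockets a grace period.
// `activeSockets` is assumed to be an iterable of the server's live sockets.
function closeWithDrain(server, activeSockets, drainMs = DRAIN_TIMEOUT_MS) {
  // server.close() stops accepting new connections and calls back once
  // every existing connection has ended.
  const closed = new Promise((resolve) => server.close(() => resolve('drained')));

  let timer;
  const timedOut = new Promise((resolve) => {
    timer = setTimeout(() => resolve('forced'), drainMs);
    // unref so the timer alone cannot hold the event loop open.
    if (typeof timer.unref === 'function') timer.unref();
  });

  return Promise.race([closed, timedOut]).then((outcome) => {
    clearTimeout(timer);
    if (outcome === 'forced') {
      // Stragglers past the window are destroyed; the close callback then fires.
      for (const sock of activeSockets) sock.destroy();
    }
    return outcome;
  });
}
```

A caller would await `closeWithDrain(server, activeSockets)` for each relay. The result is `'drained'` when all tunnels finished inside the window and `'forced'` when the timer won the race.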
@@ -73,6 +109,21 @@ function handleClient(client, upstream, forceDebug) {
 
  const onData = async (chunk) => {
  buf = Buffer.concat([buf, chunk]);
+ // Reject oversize pre-piping buffers before the 10s watchdog. Sends
+ // a protocol-appropriate failure reply per phase so a misbehaving but
+ // RFC-aware client gets a clean signal rather than a raw connection
+ // drop. Skipped once piping starts (buf is nulled then anyway).
+ if (phase !== 'piping' && buf.length > MAX_HANDSHAKE_BYTES) {
+ if (forceDebug) {
+ console.log(formatLogMessage('proxy', `${SOCKS_RELAY_TAG} handshake oversize (${buf.length} bytes, phase=${phase}) — closing`));
+ }
+ if (phase === 'greeting') {
+ try { client.write(Buffer.from([0x05, 0xFF])); } catch (_) {} // no acceptable methods
+ } else if (phase === 'request') {
+ failReply(client, 0x01); // general SOCKS server failure
+ }
+ return cleanup();
+ }
  try {
  if (phase === 'greeting') {
  // [0x05, NMETHODS, METHODS...]
@@ -134,6 +185,17 @@ function handleClient(client, upstream, forceDebug) {
  phase = 'connecting';
  client.pause();
  client.off('data', onData);
+ // Disarm the handshake watchdog now (not later at 'piping'). The
+ // client has completed its SOCKS5 negotiation; the remaining wait
+ // is on SocksClient.createConnection (which has its own 20s
+ // timeout below). Without this, if the upstream connect takes
+ // longer than HANDSHAKE_TIMEOUT_MS (10s) but less than 20s, the
+ // watchdog fires cleanup mid-await — destroying the client and
+ // setting settled=true — then the upstream connect resolves into
+ // a fresh socket that cleanup() can no longer destroy (settled
+ // guard short-circuits), orphaning an open TCP connection until
+ // OS-level timeout or remote close.
+ if (handshakeTimer) { clearTimeout(handshakeTimer); handshakeTimer = null; }
  const early = buf.subarray(hdrLen); // any bytes after the request header
  buf = null;
 
@@ -152,6 +214,10 @@ function handleClient(client, upstream, forceDebug) {
  timeout: 20000,
  });
  } catch (e) {
+ // Bump the per-relay error counter (exposed via getRelayStats)
+ // so post-scan diagnostics can see "X of N upstream connects
+ // failed" without re-running with forceDebug.
+ relay.errors++;
  if (forceDebug) {
  console.log(formatLogMessage('proxy', `${SOCKS_RELAY_TAG} upstream connect failed (${host}:${port}): ${e.message}`));
  }
@@ -160,7 +226,28 @@ function handleClient(client, upstream, forceDebug) {
  }
 
  upstreamSock = info.socket;
+ // Safety net: if cleanup() ran while we were awaiting the upstream
+ // connect (some path other than the handshake watchdog — e.g. a
+ // 'close' event on the client during pause), settled is true and
+ // cleanup's settled guard would short-circuit a future call,
+ // orphaning this freshly-connected upstream socket. Destroy it
+ // here directly. With Fix #1a moving the watchdog clearTimeout to
+ // the 'connecting' transition this is currently unreachable, but
+ // cheap to keep as defense-in-depth against future code paths.
+ if (settled) {
+ try { upstreamSock.destroy(); } catch (_) {}
+ return;
+ }
  try { upstreamSock.setNoDelay(true); } catch (_) {}
+ // Catch silently-dead upstreams (NAT timeout, mobile-tower drop,
+ // proxy crash without FIN/RST) faster than the default ~2-hour
+ // Linux idle. setKeepAlive(true, 60000) sets TCP_KEEPIDLE only —
+ // the kernel still uses tcp_keepalive_intvl/tcp_keepalive_probes
+ // for the probe phase, so total detection is ~60s idle + N probes
+ // (default 9 × 75s on Linux) ≈ 12 minutes. Big improvement over
+ // 2h, not the 60s the bare argument suggests. Client-side keep-
+ // alive is omitted — same kernel, OS surfaces death immediately.
+ try { upstreamSock.setKeepAlive(true, 60000); } catch (_) {}
  upstreamSock.on('error', cleanup);
  upstreamSock.on('close', cleanup);
  client.on('error', cleanup);
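Note on the keep-alive arithmetic in the comment above: a small worked calculation, assuming the stock Linux values for `tcp_keepalive_intvl` (75 s), `tcp_keepalive_probes` (9) and `tcp_keepalive_time` (7200 s). The function name is hypothetical.

```javascript
// Linux kernel defaults (net.ipv4.tcp_keepalive_intvl / tcp_keepalive_probes).
const KEEPALIVE_INTVL_S = 75;
const KEEPALIVE_PROBES = 9;

// Worst-case seconds before a silently dead peer is detected:
// idle time before the first probe, plus every unanswered probe interval.
function keepAliveDetectionSeconds(idleSeconds) {
  return idleSeconds + KEEPALIVE_PROBES * KEEPALIVE_INTVL_S;
}

const withRelaySetting = keepAliveDetectionSeconds(60);    // setKeepAlive(true, 60000)
const withKernelDefault = keepAliveDetectionSeconds(7200); // default tcp_keepalive_time

console.log(withRelaySetting, (withRelaySetting / 60).toFixed(2));   // 735 12.25
console.log(withKernelDefault, (withKernelDefault / 60).toFixed(2)); // 7875 131.25
```

So the 60 s idle setting brings detection from a little over two hours down to roughly twelve minutes; getting closer to 60 s would require changing the kernel's probe interval and count as well, which `setKeepAlive` does not do.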
@@ -173,9 +260,8 @@ function handleClient(client, upstream, forceDebug) {
  upstreamSock.pipe(client);
  client.resume();
  phase = 'piping';
- // Negotiation complete disarm the handshake watchdog so a
- // long-running download isn't killed mid-transfer.
- if (handshakeTimer) { clearTimeout(handshakeTimer); handshakeTimer = null; }
+ // (handshakeTimer was already disarmed at the 'connecting'
+ // transition above; no second clearTimeout needed.)
  }
  } catch (e) {
  if (forceDebug) {
@@ -207,41 +293,86 @@ async function ensureRelay(upstream, forceDebug = false) {
  const existing = _relays.get(key);
  if (existing) return existing.port;
 
- const activeSockets = new Set();
- const server = net.createServer((clientSock) => {
- // Disable Nagle: page scanning is full of small-packet phases (per-origin
- // TLS handshakes, small XHR/API calls, the SOCKS handshake itself).
- // Nagle + delayed-ACK adds ~40ms stalls on those; relays should not.
- try { clientSock.setNoDelay(true); } catch (_) {}
- activeSockets.add(clientSock);
- clientSock.on('close', () => activeSockets.delete(clientSock));
- handleClient(clientSock, upstream, forceDebug);
- });
+ // Singleflight: if another caller is already initialising this upstream,
+ // ride its promise instead of starting a parallel init. Prevents the
+ // race where two concurrent callers both pass the _relays.get(key) check
+ // above, both create servers, and the second _relays.set(key, ...) below
+ // orphans the first server (listening forever, never closed).
+ if (_pendingRelays.has(key)) return _pendingRelays.get(key);
+
+ // Most authenticated SOCKS5 servers reject empty-password auth at the
+ // RFC 1929 handshake; without this warn, the misconfig surfaces only
+ // per-request inside forceDebug-gated logs (silent in production).
+ // Fire once per unique upstream (after the existing-relay short-circuit
+ // above) so repeated calls don't spam.
+ if (upstream.username && !upstream.password) {
+ console.warn(formatLogMessage('warn', `${SOCKS_RELAY_TAG} upstream ${upstream.host}:${upstream.port} has username but no password — RFC 1929 auth will likely fail`));
+ }
+
+ // .finally() (not try/finally inside the IIFE) so the cleanup is
+ // scheduled in a microtask, guaranteed to run AFTER the _pendingRelays.set
+ // below. If the cleanup were a try/finally inside an async IIFE and the
+ // body threw SYNCHRONOUSLY (before its first await), the finally would
+ // run SYNC before the implicit rejected promise was returned, _pendingRelays
+ // wouldn't be set yet, the delete would no-op, and then the outer .set
+ // would register a permanent rejected entry that future callers would
+ // await forever. The current body has no realistic sync-throw paths
+ // (net.createServer / Set / object literal don't throw), but defensive.
+ const initPromise = (async () => {
+ const activeSockets = new Set();
+ // Single mutable state object referenced by both the connection handler
+ // (writes .errors) and _relays / getRelayStats (read both). Server +
+ // port assigned after listen() completes; declared up-front so the
+ // closure below can close over `relayEntry` and pass it to handleClient.
+ const relayEntry = { server: null, port: null, activeSockets, errors: 0 };
+
+ const server = net.createServer((clientSock) => {
+ // Disable Nagle: page scanning is full of small-packet phases (per-origin
+ // TLS handshakes, small XHR/API calls, the SOCKS handshake itself).
+ // Nagle + delayed-ACK adds ~40ms stalls on those; relays should not.
+ try { clientSock.setNoDelay(true); } catch (_) {}
+ activeSockets.add(clientSock);
+ clientSock.on('close', () => activeSockets.delete(clientSock));
+ handleClient(clientSock, upstream, forceDebug, relayEntry);
+ });
 
- await new Promise((resolve, reject) => {
- const onErr = (e) => reject(e);
- server.once('error', onErr);
- server.listen(0, '127.0.0.1', () => {
- server.removeListener('error', onErr);
- // Keep a listener so a late server error doesn't crash the process.
- server.on('error', (e) => {
- if (forceDebug) console.log(formatLogMessage('proxy', `${SOCKS_RELAY_TAG} server error: ${e.message}`));
+ // Shed excess connections at the TCP-accept layer instead of letting them
+ // all proceed to open authenticated tunnels (which the upstream provider
+ // may silently drop past its quota).
+ server.maxConnections = MAX_LOCAL_CONNECTIONS;
+
+ await new Promise((resolve, reject) => {
+ const onErr = (e) => reject(e);
+ server.once('error', onErr);
+ server.listen(0, '127.0.0.1', () => {
+ server.removeListener('error', onErr);
+ // Keep a listener so a late server error doesn't crash the process.
+ server.on('error', (e) => {
+ if (forceDebug) console.log(formatLogMessage('proxy', `${SOCKS_RELAY_TAG} server error: ${e.message}`));
352
+ });
353
+ resolve();
229
354
  });
230
- resolve();
231
355
  });
356
+
357
+ relayEntry.server = server;
358
+ relayEntry.port = server.address().port;
359
+ _relays.set(key, relayEntry);
360
+ const port = relayEntry.port;
361
+ if (forceDebug) {
362
+ // auth status is kept as a presence flag only -- previously printed
363
+ // the raw username, which leaked into shared debug output (support
364
+ // tickets, screenshots, gists). Same redaction policy as the
365
+ // proxy.js getProxyInfo() change.
366
+ const authTag = upstream.username ? ' (auth: [redacted])' : ' (no auth)';
367
+ console.log(formatLogMessage('proxy', `${SOCKS_RELAY_TAG} 127.0.0.1:${port} -> ${upstream.host}:${upstream.port}${authTag}`));
368
+ }
369
+ return port;
370
+ })().finally(() => {
371
+ _pendingRelays.delete(key);
232
372
  });
233
373
 
234
- const port = server.address().port;
235
- _relays.set(key, { server, port, activeSockets });
236
- if (forceDebug) {
237
- // auth status is kept as a presence flag only -- previously printed
238
- // the raw username, which leaked into shared debug output (support
239
- // tickets, screenshots, gists). Same redaction policy as the
240
- // proxy.js getProxyInfo() change.
241
- const authTag = upstream.username ? ' (auth: [redacted])' : ' (no auth)';
242
- console.log(formatLogMessage('proxy', `${SOCKS_RELAY_TAG} 127.0.0.1:${port} -> ${upstream.host}:${upstream.port}${authTag}`));
243
- }
244
- return port;
374
+ _pendingRelays.set(key, initPromise);
375
+ return initPromise;
245
376
  }
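The singleflight behaviour described in the comments of `ensureRelay` can be illustrated in isolation. This is a minimal sketch under stated assumptions, not the package's code: `singleflight`, `results`, and `pending` are hypothetical names standing in for the relay lookup, `_relays`, and `_pendingRelays`, and `init` stands in for the listen step.

```javascript
// Hypothetical stand-ins for _relays and _pendingRelays.
const results = new Map(); // key -> settled value
const pending = new Map(); // key -> in-flight promise

function singleflight(key, init) {
  if (results.has(key)) return Promise.resolve(results.get(key));
  // A concurrent caller rides the first caller's promise instead of
  // starting a second initialisation for the same key.
  if (pending.has(key)) return pending.get(key);

  // Cleanup is attached with .finally() so it runs in a later microtask,
  // after pending.set() below, even if init() throws before its first await.
  const p = (async () => {
    const value = await init(key);
    results.set(key, value);
    return value;
  })().finally(() => {
    pending.delete(key);
  });

  pending.set(key, p);
  return p;
}
```

Two calls made in the same tick with the same key receive the same promise, so `init` runs once. A failed `init` rejects that shared promise and still removes the pending entry, so a later call can retry.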
246
377
 
247
378
  /**
@@ -255,26 +386,90 @@ function getRelayPort(upstream) {
255
386
  }
256
387
 
257
388
  /**
258
- * Tear down every relay: destroy in-flight sockets, close listeners.
259
- * Safe to call multiple times.
389
+ * Snapshot of active relays for diagnostics. Returns an array of
390
+ * { key, port, activeConnections, errors } — the upstream `key` has its
391
+ * trailing `:username` segment stripped using the same regex as
392
+ * closeAllRelays' display path (IPv6-safe). `errors` is the cumulative
393
+ * count of failed upstream-tunnel opens for the relay's lifetime.
394
+ * Useful for answering "is my proxy slow because the upstream is
395
+ * saturated, or because the scan is opening too many parallel tunnels?"
396
+ * without enabling forceDebug.
397
+ */
398
+ function getRelayStats() {
399
+ const stats = [];
400
+ for (const [key, r] of _relays) {
401
+ stats.push({
402
+ key: key.replace(/:[^:]*$/, ''),
403
+ port: r.port,
404
+ activeConnections: r.activeSockets.size,
405
+ errors: r.errors
406
+ });
407
+ }
408
+ return stats;
409
+ }
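The key redaction used by `getRelayStats` relies on a regex that removes only the last colon-delimited segment. A small sketch of what that regex does, assuming keys of the form `host:port:username` as the comments in this diff describe; `displayKey` is a hypothetical helper name and the sample hosts are made up.

```javascript
// Strip only the final ":segment" so colons inside an IPv6 host survive.
function displayKey(key) {
  return key.replace(/:[^:]*$/, '');
}

console.log(displayKey('proxy.example:1080:alice')); // proxy.example:1080
console.log(displayKey('2001:db8::1:1080:alice'));   // 2001:db8::1:1080
```

Splitting on `:` and taking the first two fields would truncate the IPv6 example to `2001:db8`. Note that the regex always removes one segment, so it is only safe if every key really carries a trailing username segment (possibly empty); a bare `host:port` key would lose its port.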
410
+
411
+ /**
412
+ * Tear down every relay. Stops accepting new connections, gives in-flight
413
+ * tunnels up to DRAIN_TIMEOUT_MS (2s) to flush remaining response bytes
414
+ * into Chromium / Puppeteer, then force-destroys any stragglers. Safe to
415
+ * call multiple times (subsequent calls iterate an empty _relays Map).
260
416
  */
261
417
  async function closeAllRelays(forceDebug = false) {
418
+ // Wait for any in-flight ensureRelay inits to finish before snapshotting
419
+ // _relays. Without this, a relay whose listen() completes AFTER our
420
+ // iteration starts would land in _relays unowned by closeAllRelays —
421
+ // leaked until next call or process exit. allSettled (not all) because
422
+ // a rejected init has already cleaned up its _pendingRelays entry via
423
+ // .finally; we just need to not throw here.
424
+ if (_pendingRelays.size > 0) {
425
+ await Promise.allSettled(Array.from(_pendingRelays.values()));
426
+ }
262
427
  for (const [key, r] of _relays) {
263
- for (const s of r.activeSockets) { try { s.destroy(); } catch (_) {} }
264
- await new Promise((res) => {
428
+ // upstreamKey embeds the username (host:port:username), so the raw
429
+ // key would leak it in debug output. Strip just the trailing
430
+ // `:username` segment for display; using a regex (not split-on-':')
431
+ // so IPv6 hosts with embedded colons (e.g. 2001:db8::1:1080:user)
432
+ // aren't mangled. The relay identity stays unambiguous from host+port.
433
+ const displayKey = key.replace(/:[^:]*$/, '');
434
+ const startedWith = r.activeSockets.size;
435
+
436
+ // server.close() stops accepting new connections and resolves only
437
+ // when all existing sockets have closed naturally. Race that against
438
+ // DRAIN_TIMEOUT_MS: if active tunnels finish flushing in time,
439
+ // Chromium / Puppeteer gets the response bytes it was waiting for;
440
+ // beyond that, force-destroy stragglers (the close callback then
441
+ // fires immediately). Trade-off chosen so SIGINT mid-scan doesn't
442
+ // amputate in-flight responses but a hung tunnel can't block exit.
443
+ const closePromise = new Promise((res) => {
265
444
  try { r.server.close(() => res()); } catch (_) { res(); }
266
445
  });
446
+
447
+ let timer;
448
+ const drained = await Promise.race([
449
+ closePromise.then(() => { if (timer) clearTimeout(timer); return true; }),
450
+ new Promise((res) => {
451
+ timer = setTimeout(() => res(false), DRAIN_TIMEOUT_MS);
452
+ // Don't hold the event loop open on a still-pending drain timer
453
+ // when the close-promise won the race.
454
+ if (typeof timer.unref === 'function') timer.unref();
455
+ })
456
+ ]);
457
+
458
+ let forcedCount = 0;
459
+ if (!drained) {
460
+ forcedCount = r.activeSockets.size;
461
+ for (const s of r.activeSockets) { try { s.destroy(); } catch (_) {} }
462
+ await closePromise; // resolves now that the last socket has closed
463
+ }
464
+
267
465
  if (forceDebug) {
268
- // upstreamKey embeds the username (host:port:username), so the raw
269
- // key would leak it in debug output. Strip just the trailing
270
- // `:username` segment for display; using a regex (not split-on-':')
271
- // so IPv6 hosts with embedded colons (e.g. 2001:db8::1:1080:user)
272
- // aren't mangled. The relay identity stays unambiguous from host+port.
273
- const displayKey = key.replace(/:[^:]*$/, '');
274
- console.log(formatLogMessage('proxy', `${SOCKS_RELAY_TAG} closed relay for ${displayKey}`));
466
+ const note = forcedCount > 0
467
+ ? ` (drain timeout force-closed ${forcedCount}/${startedWith} active socket(s))`
468
+ : (startedWith > 0 ? ` (drained ${startedWith} active socket(s))` : '');
469
+ console.log(formatLogMessage('proxy', `${SOCKS_RELAY_TAG} closed relay for ${displayKey}${note}`));
275
470
  }
276
471
  }
277
472
  _relays.clear();
278
473
  }
279
474
 
280
- module.exports = { ensureRelay, getRelayPort, closeAllRelays };
475
+ module.exports = { ensureRelay, getRelayPort, getRelayStats, closeAllRelays };
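The drain-then-destroy shutdown in `closeAllRelays` is a race between the server's close callback and a timer. The following is a minimal sketch of that pattern on a plain `net.Server`, not the package's implementation: `drainOrDestroy` and its parameters are hypothetical, and the caller is assumed to maintain the `sockets` set the way the relay maintains `activeSockets`.

```javascript
const net = require('node:net');

// Hypothetical helper: close `server`, waiting up to `timeoutMs` for the
// sockets in `sockets` to finish on their own before destroying them.
async function drainOrDestroy(server, sockets, timeoutMs) {
  // server.close() stops accepting; its callback fires once every
  // existing connection has ended.
  const closed = new Promise((resolve) => server.close(() => resolve()));

  let timer;
  const drained = await Promise.race([
    closed.then(() => { clearTimeout(timer); return true; }),
    new Promise((resolve) => {
      timer = setTimeout(() => resolve(false), timeoutMs);
      timer.unref(); // a pending drain timer should not keep the process alive
    }),
  ]);

  let forced = 0;
  if (!drained) {
    forced = sockets.size;
    for (const s of sockets) s.destroy();
    await closed; // fires once the destroyed sockets have closed
  }
  return { drained, forced };
}
```

The return value separates the two outcomes: `drained: true` means every connection ended within the window, while `forced` counts the sockets that had to be destroyed after the timeout.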
package/nwss.js CHANGED
@@ -109,6 +109,7 @@ const TIMEOUTS = Object.freeze({
109
109
  EMERGENCY_RESTART_DELAY: 2000, // Delay after emergency browser restart
110
110
  BROWSER_STABILIZE_DELAY: 1000, // Browser stabilization after restart
111
111
  CURL_HANDLER_DELAY: 3000, // Wait for async curl operations
112
+ NETTOOLS_DRAIN_TIMEOUT: 3000, // Hard cap for awaiting in-flight nettools (dig/whois) handlers before snapshot. Drains immediately if all complete; bounded so a hung dig can't block exit. Mirrors CURL_HANDLER_DELAY's role for curl/searchstring.
112
113
  PROTOCOL_TIMEOUT: 180000, // Chrome DevTools Protocol timeout
113
114
  REDIRECT_JS_TIMEOUT: 5000 // JavaScript redirect detection timeout
114
115
  });
@@ -777,7 +778,8 @@ Redirect Handling Options:
777
778
  isBrave: true/false Spoof Brave browser detection
778
779
  userAgent: "chrome"|"chrome_mac"|"chrome_linux"|"firefox"|"firefox_mac"|"firefox_linux"|"safari" Custom desktop User-Agent
779
780
  interact_intensity: "low"|"medium"|"high" Interaction simulation intensity (default: medium)
780
- delay: <milliseconds> Delay after load (default: 4000)
781
+ delay: <milliseconds> Delay after load (default: 6000, capped at 2000ms unless delay_uncapped: true)
782
+ delay_uncapped: true/false Honor 'delay' up to half the per-URL timeout instead of the 2s default cap. Use for sites with setTimeout-deferred lazy ad/tracker loaders that fire well past the standard post-networkidle window
781
783
  reload: <number> Reload page n times after load (default: 1)
782
784
  forcereload: true/false or ["domain1.com", "domain2.com"] Force cache-clearing reload for all URLs or specific domains
783
785
  clear_sitedata: true/false Clear all cookies, cache, storage before each load (default: false)
@@ -1864,7 +1866,13 @@ function setupFrameHandling(page, forceDebug) {
1864
1866
  '--disable-domain-reliability', // No reliability monitor disk writes
1865
1867
  // PERFORMANCE: Disable non-essential Chrome features in a single flag
1866
1868
  // IMPORTANT: Chrome only reads the LAST --disable-features flag, so combine all into one
1867
- `--disable-features=AudioServiceOutOfProcess,VizDisplayCompositor,TranslateUI,BlinkGenPropertyTrees,Translate,BackForwardCache,AcceptCHFrame,SafeBrowsing,HttpsFirstBalancedModeAutoEnable,site-per-process,PaintHolding${disable_ad_tagging ? ',AdTagging' : ''}`,
1869
+ // AccountConsistencyMirror + AccountConsistencyDice prevent the
1870
+ // Chrome sign-in subsystem from initialising at startup. Combined
1871
+ // with --disable-sync + --allow-browser-signin=false below, this
1872
+ // suppresses the "Something went wrong when opening your profile"
1873
+ // popup that fires in headful + --keep-open mode (temp userDataDir
1874
+ // has no real profile, so the sync init errors out and pops up).
1875
+ `--disable-features=AudioServiceOutOfProcess,VizDisplayCompositor,TranslateUI,BlinkGenPropertyTrees,Translate,BackForwardCache,AcceptCHFrame,SafeBrowsing,HttpsFirstBalancedModeAutoEnable,site-per-process,PaintHolding,AccountConsistencyMirror,AccountConsistencyDice${disable_ad_tagging ? ',AdTagging' : ''}`,
1868
1876
  '--disable-ipc-flooding-protection',
1869
1877
  '--aggressive-cache-discard',
1870
1878
  '--memory-pressure-off',
@@ -1874,7 +1882,16 @@ function setupFrameHandling(page, forceDebug) {
1874
1882
  '--no-sandbox',
1875
1883
  '--disable-setuid-sandbox',
1876
1884
  '--disable-dev-shm-usage',
1877
- ...(keepBrowserOpen ? [] : ['--disable-sync']),
1885
+ // --disable-sync is always-on (was previously dropped in --keep-open
1886
+ // mode, which let the sync subsystem init against our temp
1887
+ // userDataDir and pop the "Something went wrong when opening your
1888
+ // profile" dialog). Inspection during --keep-open doesn't need
1889
+ // sync; nothing in the scanner flow does.
1890
+ '--disable-sync',
1891
+ // Prevent the sign-in promo / account banner from appearing in
1892
+ // headful sessions. Same family of fixes as --disable-sync and the
1893
+ // AccountConsistency* features disabled above.
1894
+ '--allow-browser-signin=false',
1878
1895
  '--mute-audio',
1879
1896
  '--disable-translate',
1880
1897
  '--window-size=1920,1080',
@@ -2100,6 +2117,30 @@ function setupFrameHandling(page, forceDebug) {
2100
2117
  // Use Map to track domains and their resource types for --adblock-rules or --dry-run
2101
2118
  const matchedDomains = (adblockRulesMode || siteConfig.adblock_rules || dryRunMode) ? new Map() : new Set();
2102
2119
 
2120
+ // Per-URL tracking of in-flight async nettools (dig/whois) handlers so we
2121
+ // can drain them BEFORE snapshotting matchedDomains into the result. The
2122
+ // previous fire-and-forget setImmediate pattern dropped late-completing
2123
+ // matches (handler resolved after formatRules had already run). Each
2124
+ // setImmediate-scheduled handler now registers a promise via
2125
+ // trackNetToolsHandler; drainPendingNetTools() awaits all of them with a
2126
+ // hard cap (TIMEOUTS.NETTOOLS_DRAIN_TIMEOUT) so a hung dig can't block.
2127
+ const pendingNetTools = [];
2128
+ const trackNetToolsHandler = (handlerFn) => {
2129
+ pendingNetTools.push(new Promise((resolve) => {
2130
+ setImmediate(async () => {
2131
+ try { await handlerFn(); } catch (_) { /* handler logs its own errors */ }
2132
+ finally { resolve(); }
2133
+ });
2134
+ }));
2135
+ };
2136
+ const drainPendingNetTools = async () => {
2137
+ if (pendingNetTools.length === 0) return;
2138
+ await Promise.race([
2139
+ Promise.all(pendingNetTools),
2140
+ fastTimeout(TIMEOUTS.NETTOOLS_DRAIN_TIMEOUT)
2141
+ ]);
2142
+ };
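The track-then-drain pattern introduced here can be exercised outside the scanner. This sketch assumes nothing about `fastTimeout` or the real handlers: `createTracker`, `track`, `drain`, and `capMs` are hypothetical names, and a plain `setTimeout` plays the role of the drain cap.

```javascript
// Hypothetical per-scan tracker; `capMs` plays the role of a drain timeout.
function createTracker(capMs) {
  const pendingTasks = [];

  // Schedule `fn` off the current tick and remember a promise that always
  // resolves, so one failing handler cannot reject the drain.
  const track = (fn) => {
    pendingTasks.push(new Promise((resolve) => {
      setImmediate(async () => {
        try { await fn(); } catch (_) { /* handler reports its own errors */ }
        finally { resolve(); }
      });
    }));
  };

  // Resolve when every tracked handler has settled or the cap elapses,
  // whichever comes first.
  const drain = async () => {
    if (pendingTasks.length === 0) return;
    let timer;
    const cap = new Promise((resolve) => { timer = setTimeout(resolve, capMs); });
    await Promise.race([Promise.all(pendingTasks), cap]);
    clearTimeout(timer);
  };

  return { track, drain };
}
```

A caller would `track()` each asynchronous validation as it is scheduled and `await drain()` once before reading the shared result set. Handlers that finish after the cap still run to completion, but their late writes are not waited for.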
2143
+
2103
2144
  // Local domain dedup scoped to THIS processUrl call only
2104
2145
  // Prevents cross-config contamination from the global domain cache
2105
2146
  const localDetectedDomains = new Set();
@@ -3167,7 +3208,7 @@ function setupFrameHandling(page, forceDebug) {
3167
3208
  currentUrl, getRootDomain, siteConfig, dumpUrls, matchedUrlsLogFile, forceDebug, fs,
3168
3209
  ignoreDomains, matchesIgnoreDomain
3169
3210
  });
3170
- setImmediate(() => popupNetToolsHandler(checkedRootDomain, fullSubdomain));
3211
+ trackNetToolsHandler(() => popupNetToolsHandler(checkedRootDomain, fullSubdomain));
3171
3212
  } else {
3172
3213
  // No nettools required — regex match alone counts.
3173
3214
  addMatchedDomain(checkedRootDomain, resourceType, fullSubdomain);
@@ -3573,7 +3614,7 @@ function setupFrameHandling(page, forceDebug) {
3573
3614
 
3574
3615
  // Execute nettools check asynchronously
3575
3616
  const originalDomain = fullSubdomain;
3576
- setImmediate(() => netToolsHandler(reqDomain, originalDomain));
3617
+ trackNetToolsHandler(() => netToolsHandler(reqDomain, originalDomain));
3577
3618
  }
3578
3619
  if (forceDebug) {
3579
3620
  console.log(formatLogMessage('debug', `${reqUrl} has nettools validation required - skipping immediate add`));
@@ -3688,7 +3729,7 @@ function setupFrameHandling(page, forceDebug) {
3688
3729
 
3689
3730
  // Execute nettools check asynchronously
3690
3731
  const originalDomain = fullSubdomain; // Use full subdomain for nettools
3691
- setImmediate(() => netToolsHandler(reqDomain, originalDomain));
3732
+ trackNetToolsHandler(() => netToolsHandler(reqDomain, originalDomain));
3692
3733
 
3693
3734
  // Do NOT continue processing this request for immediate domain addition
3694
3735
  // The nettools handler is responsible for adding the domain if validation passes
@@ -4237,13 +4278,22 @@ function setupFrameHandling(page, forceDebug) {
4237
4278
  }
4238
4279
  }
4239
4280
 
4240
- const delayMs = DEFAULT_DELAY;
4281
+ const delayMs = siteConfig.delay || DEFAULT_DELAY;
4241
4282
 
4242
4283
  // Optimized delays for Puppeteer 23.x performance
4243
4284
  const isFastSite = timeout <= TIMEOUTS.FAST_SITE_THRESHOLD;
4244
4285
  const networkIdleTime = TIMEOUTS.NETWORK_IDLE; // Balanced: 2s for reliable network detection
4245
4286
  const networkIdleTimeout = Math.min(timeout / 2, TIMEOUTS.NETWORK_IDLE_MAX); // Balanced: 10s timeout
4246
- const actualDelay = Math.min(delayMs, TIMEOUTS.NETWORK_IDLE); // Balanced: 2s delay for stability
4287
+ // Post-networkidle delay cap. Default (2s) keeps fast sites fast. Opt
4288
+ // in with `delay_uncapped: true` to honor the configured `delay` up to
4289
+ // half the per-URL timeout — useful for sites with setTimeout-deferred
4290
+ // lazy ad/tracker loaders (weather.com, cbssports.com class) where
4291
+ // late requests fire well past the 2s window. See also the per-URL
4292
+ // drainPendingNetTools() which awaits in-flight dig/whois handlers
4293
+ // before the matchedDomains snapshot regardless of this flag.
4294
+ const actualDelay = siteConfig.delay_uncapped === true
4295
+ ? Math.min(delayMs, Math.floor(timeout / 2))
4296
+ : Math.min(delayMs, TIMEOUTS.NETWORK_IDLE);
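The effect of the two branches is easiest to see with numbers. A worked sketch, assuming `TIMEOUTS.NETWORK_IDLE` is 2000 ms (the comments call it the 2s cap); `effectiveDelay` and `IDLE_CAP_MS` are hypothetical names used only for illustration.

```javascript
// Assumed value: the diff's comments describe the default cap as 2s.
const IDLE_CAP_MS = 2000;

// Mirrors the ternary above: an uncapped delay is still bounded by half
// the per-URL timeout; a capped delay is bounded by the idle cap.
function effectiveDelay(delayMs, timeoutMs, uncapped) {
  return uncapped === true
    ? Math.min(delayMs, Math.floor(timeoutMs / 2))
    : Math.min(delayMs, IDLE_CAP_MS);
}

console.log(effectiveDelay(6000, 30000, false)); // 2000
console.log(effectiveDelay(6000, 30000, true));  // 6000
console.log(effectiveDelay(6000, 8000, true));   // 4000
```

With the flag off, any configured delay above 2000 ms has no extra effect. With it on, a short per-URL timeout can still shorten the delay, as in the last case.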
4247
4297
 
4248
4298
  // Build delay promise (networkIdle + delay + optional flowProxy delay)
4249
4299
  const delayPromise = (async () => {
@@ -4625,7 +4675,8 @@ function setupFrameHandling(page, forceDebug) {
4625
4675
  // Wait a moment for async nettools/searchstring operations to complete
4626
4676
  // Use fast timeout helper for Puppeteer 22.x compatibility
4627
4677
  await fastTimeout(TIMEOUTS.CURL_HANDLER_DELAY); // Wait for async operations
4628
-
4678
+ await drainPendingNetTools(); // Bounded wait for in-flight dig/whois (race fix)
4679
+
4629
4680
  return { url: currentUrl, rules: [], success: true, dryRun: true, matchCount: dryRunResult.matchCount };
4630
4681
  } else {
4631
4682
  // Format rules using the output module
@@ -4639,6 +4690,12 @@ function setupFrameHandling(page, forceDebug) {
4639
4690
  privoxyMode,
4640
4691
  piholeMode
4641
4692
  };
4693
+ // Drain pending dig/whois handlers BEFORE snapshotting matchedDomains.
4694
+ // Without this, late-completing async validations (request fired near
4695
+ // end of the delay window, dig still in flight) get orphaned — their
4696
+ // addMatchedDomain calls happen but the result has already been
4697
+ // returned. Bounded by TIMEOUTS.NETTOOLS_DRAIN_TIMEOUT.
4698
+ await drainPendingNetTools();
4642
4699
  const formattedRules = formatRules(matchedDomains, siteConfig, globalOptions);
4643
4700
 
4644
4701
  return {
@@ -4690,7 +4747,11 @@ function setupFrameHandling(page, forceDebug) {
4690
4747
  };
4691
4748
  }
4692
4749
 
4693
- // For other errors, preserve any matches we found before the error
4750
+ // For other errors, preserve any matches we found before the error.
4751
+ // Drain pending nettools first so dig/whois handlers scheduled DURING
4752
+ // the failed navigation get a chance to add to matchedDomains before
4753
+ // the partial-success snapshot — same race as the success path.
4754
+ await drainPendingNetTools();
4694
4755
  if (matchedDomains && (matchedDomains.size > 0 || (matchedDomains instanceof Map && matchedDomains.size > 0))) {
4695
4756
  const globalOptions = {
4696
4757
  localhostIP,
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@fanboynz/network-scanner",
3
- "version": "3.0.1",
3
+ "version": "3.0.2",
4
4
  "description": "A Puppeteer-based network scanner for analyzing web traffic, generating adblock filter rules, and identifying third-party requests. Features include fingerprint spoofing, Cloudflare bypass, content analysis with curl/grep, and multiple output formats.",
5
5
  "main": "nwss.js",
6
6
  "scripts": {