npm: @fanboynz/network-scanner 3.0.3 → 3.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (24)

package/CHANGELOG.md CHANGED

@@ -2,6 +2,59 @@
 All notable changes to the Network Scanner (nwss.js) project.
+## [3.1.1] - 2026-05-30
+### Changed
+- **Fingerprint identity pinned to Stable Chrome 148**, not whatever Chrome-for-Testing puppeteer bundles (currently 149, ahead of Stable). The spoof must blend with the real-world population; claiming an unreleased build is itself a tell. The Chrome major + build (`CHROME_BUILD`) + GREASE brand (`CHROME_GREASE_BRAND`) are now single constants — see `lib/fingerprint.md`.
+- **UA Client Hints made fully consistent and matched to real Chrome 148** (verified field-for-field against a live desktop): brand-list order + GREASE string (`Not/A)Brand`), and the full-version build (`148.0.7778.217`) sourced from one place so JS `getHighEntropyValues` and the HTTP `Sec-CH-UA-Full-Version*` headers can't drift. Added `wow64`, `model`, `formFactors`, `uaFullVersion`, and `Sec-CH-UA-WoW64`/`-Model`/`-Form-Factors` headers; Windows `platformVersion` → `19.0.0`.
+- **`navigator.deviceMemory` and `Sec-CH-Device-Memory` both pinned to `8`** (consistent JS↔HTTP), hiding the host's real RAM; `hardwareConcurrency` reports 4–8 (hides datacenter core count).
+- **Dependencies**: puppeteer / puppeteer-core 25.1.0, lru-cache 11.5.1.
+### Fixed
+- **Timezone is now spoofed via CDP `emulateTimezone`** instead of JS overrides, so `Date`, `Intl`, and `getTimezoneOffset` are all consistent and DST-correct. The old JS patching left the real `Date` in the host zone — an 8-hour `Date`-vs-`Intl` contradiction and a leaked host timezone.
+- **Closed several headless tells**: Battery now reports the plugged-in default (`charging:true, level:1`); `navigator.bluetooth`, `navigator.share`/`canShare` stubs added (present in real Chrome, absent in headless); `speechSynthesis.getVoices()` returns the claimed-OS voice set (`instanceof`-correct).
+- **proxy**: a string `proxy_bypass`/`socks5_bypass` (instead of an array) no longer throws `bypass.join is not a function` in the browser-launch path.
+- **socks-relay**: a client that disconnects during the upstream-connect await is now handled, so a tunnel isn't opened for a gone client and the watchdog clears immediately.
+- **smart-cache**: the memory-check and auto-save `setInterval`s are now `unref`'d, so an error path that skips `destroy()` can no longer hang the process.
+### Removed
+- Dead code: `browserhealth` `testNetworkCapability` + `purgeStaleTrackers` (zero callers), and a redundant 2-voice `speechSynthesis` block superseded by the full voice set.
+### Added
+- **`lib/fingerprint.md`** — fingerprint spoofing coverage tables (surfaces, mitigations, gating flags) and known limitations.
+## [3.1.0] - 2026-05-29
+### Added
+- **`realistic_click`** site flag — denser mouse approach, hold tremor, and mouseup drift for sites that score click realism.
+- **`interact_click_count`** site override for popunder-discovery click volume (default content-click count also raised 2 → 3).
+- **`clear_sitedata_full_on_reload`** site flag — full storage clear between reloads; quick mode now also clears localStorage/sessionStorage.
+- **regex-tool rewritten** as a real `filterRegex` builder/tester: literal↔standard↔JSON conversion, multi-pattern + `regex_and`, and testing against real request URLs (matching mirrors the scanner exactly).
+- **Fingerprint coverage**: per-domain-seeded Battery / `navigator.connection` values, `AudioBuffer` fingerprint defeat, `PerformanceNavigationTiming` jitter, `userActivation`; UA strings bumped to Chrome 148 / Firefox 151 / Safari 19.5.
+### Changed
+- **`userAgent` now defaults to `"chrome"`** when a site doesn't set one — previously sites without it leaked the bundled `HeadlessChrome` UA.
+- **`Sec-CH-UA` headers and the curl content-fetch UA derive from the single UA source**, so Client Hints can't drift from `navigator.userAgent`.
+- **VPN configs force scan concurrency to 1** — the shared system routing table isn't concurrency-safe.
+- **Interaction time ceiling scales with the work envelope** (click count / `realistic_click`) instead of a flat 15s.
+### Fixed
+- **Per-URL timeout scales** with site timeout/delay/reload (+8s recovery grace) instead of a flat 75s that discarded partial-match recovery on multi-URL scans.
+- **Interaction hard cap is now actually enforced** (was cooperative, overshooting to 20s+ under concurrency).
+- **WireGuard** inline temp-config leaked the private key on failed connect and broke retries; temp dir is now per-PID so concurrent processes can't wipe each other's config.
+- **nettools**: fixed a dig dedup race (concurrent same-domain double lookups); whois no longer discards valid records over non-fatal stderr.
+- **Orphan resource leaks** on `Promise.race` timeout (cdp.js, clear_sitedata.js, browserhealth.js) and several un-`unref`'d `setTimeout` handles.
+- **Config keys validated at startup** with boolean-like coercion, preventing silent misconfiguration.
+### Security
+- **OpenVPN** `pkill`/`ping`/`curl` calls moved from shell-interpolated `execSync` to `spawnSync` arg arrays (command-injection).
+- **WireGuard/OpenVPN interface & connection names validated** against a strict charset before use in paths/commands.
+### Performance
+- **adblock**: O(1) exact-domain lookup for `$third-party` / `$first-party` rules.
+- Parallelized site-data clearing and window-cleanup checks.
+- Removed dead code across cdp, domain-cache, searchstring, compress, adblock-rust, and nettools.
 ## [3.0.3] - 2026-05-26
 ### Improved
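
Note on the 3.1.1 `proxy_bypass`/`socks5_bypass` entry: the changelog only says that a string value no longer throws `bypass.join is not a function`; the fix itself is not part of the hunks shown here. A minimal sketch of one way to get that behaviour, assuming a comma-separated string form and a `;` join separator (both assumptions, as is the helper name `normalizeBypassList`):

```javascript
// Hypothetical helper, not the package's actual code: accept either an
// array or a comma-separated string and always hand back an array, so a
// later .join() cannot fail on a string.
function normalizeBypassList(bypass) {
  if (Array.isArray(bypass)) {
    return bypass.map(entry => String(entry).trim()).filter(Boolean);
  }
  if (typeof bypass === 'string') {
    return bypass.split(',').map(entry => entry.trim()).filter(Boolean);
  }
  return []; // undefined, null, or an unsupported type
}

// Both config shapes now produce the same joined value.
const fromString = normalizeBypassList('localhost, 127.0.0.1');
const fromArray = normalizeBypassList(['localhost', '127.0.0.1']);
console.log(fromString.join(';')); // localhost;127.0.0.1
console.log(fromArray.join(';'));  // localhost;127.0.0.1
```

Normalizing once at the config boundary keeps every downstream consumer on a single type, which is probably simpler than guarding each `.join()` call.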

package/lib/adblock-rust.js CHANGED

@@ -219,10 +219,20 @@ function parseAdblockRules(filePathOrArray, options = {}) {
       const buf = buffers[i];
       buffers[i] = null;
       const lines = buf.toString('utf-8').split('\n');
+      // Count actual rules for the startup banner. Skip:
+      //   - empty lines
+      //   - whitespace-only lines (trim then re-check length)
+      //   - '!'-prefixed comments (standard adblock)
+      //   - '['-prefixed filter list headers (e.g. '[Adblock Plus 2.0]')
+      // Previously only the empty-line and '!' checks ran, both on the raw
+      // line, so whitespace-only lines, indented comments, and '[' headers
+      // inflated the displayed count.
       for (let j = 0; j < lines.length; j++) {
         const line = lines[j];
         if (line.length === 0) continue;
-        if (line.charCodeAt(0) === 0x21) continue;
+        const trimmed = line.trim();
+        if (trimmed.length === 0) continue;
+        const c = trimmed.charCodeAt(0);
+        if (c === 0x21 || c === 0x5B) continue;  // '!' or '['
         ruleCount++;
       }
       filterSet.addFilters(lines);
@@ -238,7 +248,12 @@ function parseAdblockRules(filePathOrArray, options = {}) {
         // up by the TTL prune on a future run) but the final cachePath is
         // either complete or absent — never half-written.
         const tmpPath = cachePath + '.' + process.pid + '.tmp';
-        fs.writeFileSync(tmpPath, Buffer.from(serialized));
+        // Buffer.from(buffer) ALWAYS copies — wasteful when adblock-rs's
+        // serialize() already returns a Buffer (binding-version dependent).
+        // For a ~10MB compiled engine that's a pointless 5-10ms allocate+
+        // memcpy on the cold-cache-write path.
+        const out = Buffer.isBuffer(serialized) ? serialized : Buffer.from(serialized);
+        fs.writeFileSync(tmpPath, out);
         fs.renameSync(tmpPath, cachePath);
         // Best-effort prune of stale cache files. Done after our own write so
         // we never delete the entry we just created.
@@ -287,8 +302,6 @@ function parseAdblockRules(filePathOrArray, options = {}) {
   }
   return {
-    rules: { stats },
     shouldBlock(url, sourceUrl, resourceType) {
       // Avoid default-parameter syntax in the hot path — explicit null/undefined
       // checks are slightly cheaper for V8's argument adaptor.
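
Review note on the two adblock-rust.js hunks above: the counting loop and the `Buffer.isBuffer` guard can be exercised in isolation. The sketch below restates both as standalone functions; `countRules` and `toBuffer` are hypothetical names chosen for illustration and do not exist in the package.

```javascript
// Hypothetical standalone version of the rule-counting loop: skip blank
// lines, whitespace-only lines, '!' comments and '[' list headers.
function countRules(text) {
  const lines = text.split('\n');
  let ruleCount = 0;
  for (let j = 0; j < lines.length; j++) {
    const trimmed = lines[j].trim();
    if (trimmed.length === 0) continue;
    const c = trimmed.charCodeAt(0);
    if (c === 0x21 || c === 0x5B) continue; // '!' or '['
    ruleCount++;
  }
  return ruleCount;
}

// Hypothetical standalone version of the copy-avoidance guard: reuse the
// value when it is already a Buffer, copy only when it is not.
function toBuffer(serialized) {
  return Buffer.isBuffer(serialized) ? serialized : Buffer.from(serialized);
}

const sample = [
  '[Adblock Plus 2.0]',
  '! Title: example list',
  '',
  '   ',
  '||ads.example.com^',
  '  ||tracker.example.net^$third-party',
].join('\n');

console.log(countRules(sample)); // 2

const already = Buffer.from([1, 2, 3]);
console.log(toBuffer(already) === already); // true (no copy made)
```

With the pre-change checks (raw-line empty test plus raw-line `!` test only), the same sample would have been counted as 4: the header and the whitespace-only line both slipped through.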

package/lib/adblock.js CHANGED

@@ -85,22 +85,26 @@ function parseAdblockRules(filePath, options = {}) {
   const lines = fileContent.split('\n');
   const rules = {
-    domainMap: new Map(),          // ||domain.com^ - Exact domains for O(1) lookup
-    domainRules: [],               // ||*.domain.com^ - Wildcard domains (fallback)
-    thirdPartyRules: [],           // ||domain.com^$third-party
-    firstPartyRules: [],
-    pathRules: [],                 // /ads/*
-    scriptRules: [],               // .js$script
-    regexRules: [],                // /regex/
-    whitelist: [],                 // @@||domain.com^ - Wildcard whitelist
-    whitelistMap: new Map(),       // Exact whitelist domains for O(1) lookup
-    elementHiding: [],             // ##.ad-class (not used for network blocking)
+    domainMap: new Map(),                // ||domain.com^ - Exact domains for O(1) lookup
+    domainRules: [],                     // ||*.domain.com^ - Wildcard domains (fallback)
+    thirdPartyDomainMap: new Map(),      // ||domain.com^$third-party (exact)  — O(1)
+    thirdPartyRules: [],                 // wildcard / non-domain $third-party (fallback)
+    firstPartyDomainMap: new Map(),      // ||domain.com^$first-party (exact)  — O(1)
+    firstPartyRules: [],                 // wildcard / non-domain $first-party (fallback)
+    pathRules: [],                       // /ads/*
+    scriptRules: [],                     // .js$script
+    regexRules: [],                      // /regex/
+    whitelist: [],                       // @@||domain.com^ - Wildcard whitelist
+    whitelistMap: new Map(),             // Exact whitelist domains for O(1) lookup
+    elementHiding: [],                   // ##.ad-class (not used for network blocking)
     stats: {
       total: 0,
       domain: 0,
-      domainMapEntries: 0,         // Exact domain matches in Map
+      domainMapEntries: 0,               // Exact domain matches in Map
       thirdParty: 0,
+      thirdPartyMapEntries: 0,           // Exact-domain $third-party rules in Map
       firstParty: 0,
+      firstPartyMapEntries: 0,           // Exact-domain $first-party rules in Map
       path: 0,
       script: 0,
       regex: 0,
@@ -161,12 +165,28 @@ function parseAdblockRules(filePath, options = {}) {
       // Regular blocking rules
       const parsedRule = parseRule(line, false, enableLogging);
-      // Categorize based on rule type
+      // Categorize based on rule type. For $third-party and $first-party
+      // rules we additionally split out the exact-domain variants into a
+      // hash map keyed by hostname, mirroring the domainMap pattern. This
+      // turns the common `||example.com^$third-party` lookup from O(N) over
+      // thousands of array entries into O(1) by hostname (+ small parent
+      // walk). Wildcard / non-domain party rules still fall back to the
+      // linear array.
       if (parsedRule.isThirdParty) {
-        rules.thirdPartyRules.push(parsedRule);
+        if (parsedRule.isDomain && parsedRule.domain && !parsedRule.domain.includes('*')) {
+          rules.thirdPartyDomainMap.set(parsedRule.domain.toLowerCase(), parsedRule);
+          rules.stats.thirdPartyMapEntries++;
+        } else {
+          rules.thirdPartyRules.push(parsedRule);
+        }
         rules.stats.thirdParty++;
       } else if (parsedRule.isFirstParty) {
-        rules.firstPartyRules.push(parsedRule);
+        if (parsedRule.isDomain && parsedRule.domain && !parsedRule.domain.includes('*')) {
+          rules.firstPartyDomainMap.set(parsedRule.domain.toLowerCase(), parsedRule);
+          rules.stats.firstPartyMapEntries++;
+        } else {
+          rules.firstPartyRules.push(parsedRule);
+        }
         rules.stats.firstParty++;
       } else if (parsedRule.isDomain) {
         // Store exact domains in Map for O(1) lookup, wildcards in array
@@ -201,7 +221,11 @@ function parseAdblockRules(filePath, options = {}) {
     console.log(formatLogMessage('debug', `    • Exact matches (Map): ${rules.stats.domainMapEntries}`));
     console.log(formatLogMessage('debug', `    • Wildcard patterns (Array): ${rules.domainRules.length}`));
     console.log(formatLogMessage('debug', `  - Third-party rules: ${rules.stats.thirdParty}`));
+    console.log(formatLogMessage('debug', `    • Exact matches (Map): ${rules.stats.thirdPartyMapEntries}`));
+    console.log(formatLogMessage('debug', `    • Wildcard/path (Array): ${rules.thirdPartyRules.length}`));
     console.log(formatLogMessage('debug', `  - First-party rules: ${rules.stats.firstParty}`));
+    console.log(formatLogMessage('debug', `    • Exact matches (Map): ${rules.stats.firstPartyMapEntries}`));
+    console.log(formatLogMessage('debug', `    • Wildcard/path (Array): ${rules.firstPartyRules.length}`));
     console.log(formatLogMessage('debug', `  - Path rules: ${rules.stats.path}`));
     console.log(formatLogMessage('debug', `  - Script rules: ${rules.stats.script}`));
     console.log(formatLogMessage('debug', `  - Regex rules: ${rules.stats.regex}`));
@@ -445,7 +469,14 @@ function createMatcher(rules, options = {}) {
   let resultCacheHits = 0, resultCacheMisses = 0;
   let urlCacheHits = 0, urlCacheMisses = 0;
   let sourceCacheHits = 0, sourceCacheMisses = 0;
-  const hasPartyRules = rules.thirdPartyRules.length > 0 || rules.firstPartyRules.length > 0;
+  // Include the new domain-maps in the party-rules presence check — without
+  // this, a filter list whose $third-party rules ALL went into the Map (empty
+  // array) would never trigger third-party detection, silently disabling the
+  // entire third-party path.
+  const hasPartyRules = rules.thirdPartyRules.length > 0 ||
+                        rules.firstPartyRules.length > 0 ||
+                        rules.thirdPartyDomainMap.size > 0 ||
+                        rules.firstPartyDomainMap.size > 0;
   // Result cache uses FIFO eviction (see FIFOCache class comment) —
   // evicts oldest entries one at a time instead of clearing everything.
   const resultCache = new FIFOCache(32000);
@@ -634,6 +665,29 @@ function createMatcher(rules, options = {}) {
         // Check third-party rules
         if (isThirdParty) {
+          // Fast path: exact-domain $third-party rules (O(1) by hostname)
+          let rule = rules.thirdPartyDomainMap.get(lowerHostname);
+          if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
+            if (enableLogging) {
+              console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked third-party: ${url} (${rule.raw || rule.pattern})`));
+            }
+            const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'third_party_rule' };
+            resultCacheSet(url, sourceUrl, resourceType, r);
+            return r;
+          }
+          // Parent-domain $third-party rules — same walk as domainMap
+          for (let i = 0; i < parents.length; i++) {
+            rule = rules.thirdPartyDomainMap.get(parents[i]);
+            if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
+              if (enableLogging) {
+                console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked third-party: ${url} (${rule.raw || rule.pattern})`));
+              }
+              const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'third_party_rule' };
+              resultCacheSet(url, sourceUrl, resourceType, r);
+              return r;
+            }
+          }
+          // Slow path: wildcard / non-domain $third-party rules
           const thirdPartyLen = rules.thirdPartyRules.length;  // V8: Cache length
           for (let i = 0; i < thirdPartyLen; i++) {
             const rule = rules.thirdPartyRules[i];
@@ -650,6 +704,29 @@ function createMatcher(rules, options = {}) {
         // Check first-party rules
         if (!isThirdParty) {
+          // Fast path: exact-domain $first-party rules (O(1) by hostname)
+          let rule = rules.firstPartyDomainMap.get(lowerHostname);
+          if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
+            if (enableLogging) {
+              console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked first-party: ${url} (${rule.raw || rule.pattern})`));
+            }
+            const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'first_party_rule' };
+            resultCacheSet(url, sourceUrl, resourceType, r);
+            return r;
+          }
+          // Parent-domain $first-party rules
+          for (let i = 0; i < parents.length; i++) {
+            rule = rules.firstPartyDomainMap.get(parents[i]);
+            if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
+              if (enableLogging) {
+                console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked first-party: ${url} (${rule.raw || rule.pattern})`));
+              }
+              const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'first_party_rule' };
+              resultCacheSet(url, sourceUrl, resourceType, r);
+              return r;
+            }
+          }
+          // Slow path: wildcard / non-domain $first-party rules
           const firstPartyLen = rules.firstPartyRules.length;
           for (let i = 0; i < firstPartyLen; i++) {
             const rule = rules.firstPartyRules[i];
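
Review note on the adblock.js hunks above: the fast path is a Map lookup on the lower-cased hostname followed by a walk over its parent domains. The sketch below shows that lookup shape on its own. It is illustrative only: `parentDomains` and `findExactDomainRule` are hypothetical names, the way `parents` is actually computed is not visible in this diff, and the real code additionally runs `matchesRule(...)` on each hit before blocking.

```javascript
// Hypothetical helper: list the parent domains of a hostname, nearest
// first, stopping before the bare TLD.
// 'a.b.example.com' -> ['b.example.com', 'example.com']
function parentDomains(hostname) {
  const parts = hostname.split('.');
  const parents = [];
  for (let i = 1; i < parts.length - 1; i++) {
    parents.push(parts.slice(i).join('.'));
  }
  return parents;
}

// Hypothetical lookup: exact hostname first, then each parent domain.
// Cost is one Map.get per label instead of a scan over every rule.
function findExactDomainRule(domainMap, hostname) {
  const lower = hostname.toLowerCase();
  const direct = domainMap.get(lower);
  if (direct) return direct;
  for (const parent of parentDomains(lower)) {
    const rule = domainMap.get(parent);
    if (rule) return rule;
  }
  return null;
}

const thirdPartyDomainMap = new Map([
  ['example.com', { raw: '||example.com^$third-party' }],
]);

const hit = findExactDomainRule(thirdPartyDomainMap, 'CDN.Ads.Example.com');
console.log(hit && hit.raw); // ||example.com^$third-party
console.log(findExactDomainRule(thirdPartyDomainMap, 'other.org')); // null
```

The number of Map probes grows with the number of labels in the hostname rather than with the size of the filter list, which is what the hunk's "O(1) by hostname (+ small parent walk)" comment refers to.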

package/lib/browserhealth.js CHANGED

@@ -107,10 +107,13 @@ async function performGroupWindowCleanup(browserInstance, groupDescription, forc
     // Identify the main Puppeteer window (should be about:blank or the initial page)
     let mainPuppeteerPage = null;
     let pagesToClose = [];
-    // Find the main page - typically the first page that's about:blank or has been there longest
+    // First pass: synchronous categorization. Separate blank pages from
+    // content pages so the conservative-mode isPageFromPreviousScan() checks
+    // can run in parallel via Promise.all below, instead of N sequential
+    // awaits (each potentially a CDP roundtrip for page.title()).
+    const contentPages = [];
     for (const page of allPages) {
-      // Cache page.url() call to avoid repeated DOM/browser communication
       const pageUrl = page.url();
       if (pageUrl === 'about:blank' || pageUrl === '' || pageUrl.startsWith('chrome://')) {
         if (!mainPuppeteerPage) {
@@ -119,18 +122,21 @@ async function performGroupWindowCleanup(browserInstance, groupDescription, forc
           pagesToClose.push(page); // Additional blank pages can be closed
         }
       } else {
-        // Any page with actual content should be evaluated for closure
-        if (cleanupMode === "all") {
-          // Aggressive mode: close all content pages
-          pagesToClose.push(page);
-        } else {
-          // Conservative mode: only close pages that look like leftovers from previous scans
-          // Keep pages that might still be actively used
-          const isOldPage = await isPageFromPreviousScan(page, forceDebug);
-          if (isOldPage) {
-            pagesToClose.push(page);
-          }
-        }
+        contentPages.push(page);
+      }
+    }
+    if (cleanupMode === "all") {
+      // Aggressive mode: close all content pages — no per-page async check
+      for (const page of contentPages) pagesToClose.push(page);
+    } else {
+      // Conservative mode: run the isPageFromPreviousScan checks in parallel
+      // and collect the leftovers in original order.
+      const checks = await Promise.all(
+        contentPages.map(page => isPageFromPreviousScan(page, forceDebug))
+      );
+      for (let i = 0; i < contentPages.length; i++) {
+        if (checks[i]) pagesToClose.push(contentPages[i]);
       }
     }
@@ -391,12 +397,13 @@ async function performRealtimeWindowCleanup(browserInstance, threshold = REALTIM
           if (forceDebug) {
             console.log(formatLogMessage('debug', `${REALTIME_CLEANUP_TAG} Found ${contextPages.length} pages in popup context`));
           }
-          // Close popup context pages
-          for (const page of contextPages) {
-            if (!page.isClosed()) {
-              await page.close();
-            }
-          }
+          // Close popup context pages in parallel — each close is
+          // independent and the sequential await was both slow AND would
+          // abort the whole loop on the first close failure, leaking the
+          // remaining pages. .catch() per page ensures we attempt all.
+          await Promise.all(contextPages.map(page =>
+            page.isClosed() ? undefined : page.close().catch(() => {})
+          ));
         }
       }
     } catch (contextErr) {
@@ -600,16 +607,6 @@ function untrackPage(page) {
   pageUsageTracker.delete(page);
 }
-/**
- * No-op since the trackers were migrated to WeakMap — GC reclaims dead-page
- * entries automatically when Puppeteer drops its internal references. Kept
- * exported so the ~7 callers in nwss.js continue to compile; safe to delete
- * entirely once those callsites are scrubbed.
- */
-function purgeStaleTrackers() {
-  // intentionally empty
-}
 /**
  * Quick browser responsiveness test for use during page setup
  * Designed to catch browser degradation between operations
@@ -630,71 +627,6 @@ async function isQuicklyResponsive(browserInstance, timeout = 3000) {
   }
 }
-/**
- * Tests if browser can handle network operations (like Network.enable)
- * Creates a test page and attempts basic network setup
- * @param {import('puppeteer').Browser} browserInstance - Puppeteer browser instance
- * @param {number} timeout - Timeout in milliseconds (default: 10000)
- * @returns {Promise<object>} Network capability test result
- */
-async function testNetworkCapability(browserInstance, timeout = 10000) {
-  const result = {
-    capable: false,
-    error: null,
-    responseTime: 0
-  };
-  const startTime = Date.now();
-  let testPage = null;
-  try {
-    // Create test page
-    testPage = await raceWithTimeout(
-      browserInstance.newPage(),
-      timeout,
-      'Test page creation timeout'
-    );
-    // Test network operations (the critical operation that's failing)
-    await raceWithTimeout(
-      testPage.setRequestInterception(true),
-      timeout,
-      'Network.enable test timeout'
-    );
-    // Turn off interception. Symmetric to the enable above — Network.disable
-    // can hang for the same CDP reasons, so it needs the same watchdog.
-    await raceWithTimeout(
-      testPage.setRequestInterception(false),
-      timeout,
-      'Network.disable test timeout'
-    );
-    result.capable = true;
-    result.responseTime = Date.now() - startTime;
-  } catch (error) {
-    result.error = error.message;
-    result.responseTime = Date.now() - startTime;
-    // Classify the error type
-    if (error.message.includes('Network.enable') ||
-        error.message.includes('timed out') ||
-        error.message.includes('Protocol error')) {
-      result.error = `Network capability test failed: ${error.message}`;
-    }
-  } finally {
-    if (testPage && !testPage.isClosed()) {
-      try {
-        await testPage.close();
-      } catch (closeErr) {
-        /* ignore cleanup errors */
-      }
-    }
-  }
-  return result;
-}
 /**
  * Checks if browser instance is still responsive
  * @param {import('puppeteer').Browser} browserInstance - Puppeteer browser instance
@@ -740,9 +672,15 @@ async function checkBrowserHealth(browserInstance, timeout = 8000) {
     // Test 4: Create a single test page to verify both browser functionality AND network capability
     let testPage = null;
+    // Same orphan-cleanup pattern as cdp.js + clear_sitedata.js.
+    // Promise.race can't cancel newPage() — if the race
+    // times out the underlying call may still produce a Page tab nothing
+    // references → leaked tab.
+    let testPagePromise = null;
     try {
+      testPagePromise = browserInstance.newPage();
       testPage = await raceWithTimeout(
-        browserInstance.newPage(),
+        testPagePromise,
         timeout,
         'Page creation timeout'
       );
@@ -780,6 +718,11 @@ async function checkBrowserHealth(browserInstance, timeout = 8000) {
       await testPage.close();
     } catch (pageTestError) {
+      // Orphan cleanup: if testPage is null but newPage was started, the
+      // race timed out before assignment. Close the orphan when it arrives.
+      if (!testPage && testPagePromise) {
+        testPagePromise.then(p => p.close().catch(() => {})).catch(() => {});
+      }
       if (testPage && !testPage.isClosed()) {
         try { await testPage.close(); } catch (e) { /* ignore */ }
       }
@@ -1253,7 +1196,6 @@ module.exports = {
   performGroupWindowCleanup,
   performRealtimeWindowCleanup,
   trackPageForRealtime,
-  testNetworkCapability,
   isQuicklyResponsive,
   performHealthAssessment,
   monitorBrowserHealth,
@@ -1261,6 +1203,5 @@ module.exports = {
   isCriticalProtocolError,
   updatePageUsage,
   untrackPage,
-  cleanupPageBeforeReload,
-  purgeStaleTrackers
+  cleanupPageBeforeReload
 };
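
Review note on the two cleanup hunks in browserhealth.js above: both replace sequential awaits with `Promise.all`, one to run per-page checks concurrently while keeping the original order, the other to close pages concurrently with a per-page `.catch()`. The sketch below reproduces both patterns against stand-in page objects; `fakePage`, `selectInOrder` and `closeAll` are hypothetical names used only for this illustration, not Puppeteer or package APIs.

```javascript
// Hypothetical stand-in for a Puppeteer page, with an optional failing close().
function fakePage(name, { failOnClose = false } = {}) {
  let closed = false;
  return {
    name,
    isClosed: () => closed,
    close: async () => {
      if (failOnClose) throw new Error(`close failed: ${name}`);
      closed = true;
    },
  };
}

// Run an async predicate on every item concurrently, then keep the items
// whose check passed. Promise.all preserves input order in its results.
async function selectInOrder(items, predicate) {
  const checks = await Promise.all(items.map(item => predicate(item)));
  return items.filter((_, i) => checks[i]);
}

// Close every open page concurrently. The per-page .catch() means one
// failed close cannot reject the whole batch and strand the other pages.
async function closeAll(pages) {
  await Promise.all(pages.map(page =>
    page.isClosed() ? undefined : page.close().catch(() => {})
  ));
}

async function demo() {
  const pages = [fakePage('a'), fakePage('b', { failOnClose: true }), fakePage('c')];
  await closeAll(pages);
  return pages.map(p => `${p.name}:${p.isClosed()}`);
}

demo().then(result => console.log(result.join(' '))); // a:true b:false c:true
```

In the old sequential loop, the failure on `b` would have thrown before `c` was ever attempted.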

package/lib/cdp.js CHANGED

@@ -48,7 +48,8 @@ function raceWithTimeout(promise, ms, message) {
 }
 // Shared no-op cleanup used by every no-CDP / CDP-failed return path. Hoisted
-// so createSessionResult() doesn't allocate a fresh `async () => {}` per call.
+// so those paths don't allocate a fresh `async () => {}` per call, and so
+// NOOP_SESSION_RESULT can reuse it.
 const NOOP_CLEANUP = async () => {};
 /**
@@ -74,27 +75,39 @@ function isCriticalCDPError(message) {
          message.includes('Browser has been closed');
 }
-/**
- * Creates a standardized session result object for consistent V8 optimization
- * @param {object|null} session - CDP session or null
- * @param {Function} cleanup - Cleanup function
- * @param {boolean} isEnhanced - Whether enhanced features are active
- * @returns {object} Standardized session object
- */
-const createSessionResult = (session = null, cleanup = NOOP_CLEANUP, isEnhanced = false) => ({
-  session,
-  cleanup,
-  isEnhanced
+// Pre-allocated singleton for both the early-exit case (CDP not enabled OR
+// not in debug mode) AND the non-critical-error path. Frozen so callers can't
+// mutate the shared instance. Result shape is {session, cleanup}; previously
+// also carried an `isEnhanced: false` field that had zero consumers anywhere.
+const NOOP_SESSION_RESULT = Object.freeze({
+  session: null,
+  cleanup: NOOP_CLEANUP
 });
 /**
- * Creates a new page with timeout protection to prevent CDP hangs
+ * Creates a new page with timeout protection to prevent CDP hangs.
+ *
+ * Orphan-page handling: Promise.race cannot cancel browser.newPage(). If the
+ * timer wins, the underlying call keeps running and eventually resolves to a
+ * real Page tab nothing references → leaked tab in the browser. We capture
+ * the original promise and attach a close-on-resolve cleanup so the orphan
+ * is reaped if it arrives after the race lost.
+ *
  * @param {import('puppeteer').Browser} browser - Browser instance
  * @param {number} timeout - Timeout in milliseconds (default: 30000)
  * @returns {Promise<import('puppeteer').Page>} Page instance
  */
 async function createPageWithTimeout(browser, timeout = 30000) {
-  return raceWithTimeout(browser.newPage(), timeout, 'Page creation timeout - browser may be unresponsive');
+  const pagePromise = browser.newPage();
+  try {
+    return await raceWithTimeout(pagePromise, timeout, 'Page creation timeout - browser may be unresponsive');
+  } catch (err) {
+    // If pagePromise eventually resolves after the race gave up, close the
+    // orphan tab. .catch(() => {}) handles the case where pagePromise also
+    // rejected (no resource to clean up).
+    pagePromise.then(p => p.close().catch(() => {})).catch(() => {});
+    throw err;
+  }
 }
 /**
@@ -171,7 +184,7 @@ async function createCDPSession(page, currentUrl, options = {}) {
   const cdpLoggingNeeded = (enableCDP || siteSpecificCDP === true) && forceDebug;
   if (!cdpLoggingNeeded) {
-    return createSessionResult();
+    return NOOP_SESSION_RESULT;
   }
   // Parse the current URL hostname once and reuse it for the mode-log line,
@@ -187,11 +200,16 @@ async function createCDPSession(page, currentUrl, options = {}) {
   }
   let cdpSession = null;
+  let cdpSessionPromise = null;
   try {
-    // Create CDP session using modern Puppeteer 20+ API
-    // Add timeout protection for CDP session creation
-    cdpSession = await raceWithTimeout(page.createCDPSession(), 20000, 'CDP session creation timeout');
+    // Create CDP session using modern Puppeteer 20+ API.
+    // Capture the promise BEFORE racing so the catch block can attach an
+    // orphan-cleanup chain — if our race times out but the underlying
+    // createCDPSession() later resolves, we'd otherwise leak a CDP session
+    // on the browser side that nothing references.
+    cdpSessionPromise = page.createCDPSession();
+    cdpSession = await raceWithTimeout(cdpSessionPromise, 20000, 'CDP session creation timeout');
     // Enable network domain — required for network event monitoring. This is
     // the operation the rest of the codebase has learned can hang under
@@ -221,10 +239,13 @@ async function createCDPSession(page, currentUrl, options = {}) {
     console.log(formatLogMessage('debug', `${CDP_TAG} CDP session created successfully for ${currentUrl}`));
-    return createSessionResult(
-      cdpSession,
-      async () => {
-        // Safe cleanup that never throws errors
+    return {
+      session: cdpSession,
+      cleanup: async () => {
+        // Safe cleanup that never throws errors. Idempotent: the captured
+        // reference is nulled in `finally` after the first detach attempt
+        // (successful or not), so a double-cleanup is a true no-op instead of
+        // generating a misleading "Failed to detach: Session closed" debug
+        // log on the second call.
         if (cdpSession) {
           try {
             await cdpSession.detach();
@@ -232,28 +253,41 @@ async function createCDPSession(page, currentUrl, options = {}) {
           } catch (cdpCleanupErr) {
             // Log cleanup errors but don't throw - cleanup should never fail the calling code
             console.log(formatLogMessage('debug', `${CDP_TAG} Failed to detach CDP session for ${currentUrl}: ${cdpCleanupErr.message}`));
+          } finally {
+            cdpSession = null;
           }
         }
-      },
-      false
-    );
+      }
+    };
   } catch (cdpErr) {
-    // If the session was created but a subsequent send/wire-up failed, detach
-    // it so we don't leak a half-attached session. Previously the code just
-    // nulled the local and orphaned the session. We're already past the
-    // cdpLoggingNeeded gate here so forceDebug is true — log a failed detach
-    // instead of swallowing it, so partial-cleanup failures aren't invisible.
+    // Two distinct cleanup paths depending on where the failure was:
+    //
+    //   a) cdpSession IS set → failure was AFTER createCDPSession() resolved
+    //      (e.g. Network.enable timed out). We have a real handle — detach
+    //      directly. Previously the code just nulled the local and orphaned
+    //      the session; now we detach and log any failure.
+    //
+    //   b) cdpSession is null but cdpSessionPromise was started → the race
+    //      timed out before assignment. The underlying createCDPSession()
+    //      may still resolve later, producing an orphan session on the
+    //      browser side. Attach a detach-on-resolve chain; .catch(()=>{})
+    //      swallows the case where the underlying promise also rejected.
     if (cdpSession) {
       try { await cdpSession.detach(); }
       catch (partialDetachErr) {
         console.log(formatLogMessage('debug', `${CDP_TAG} Partial-session detach failed for ${currentUrl}: ${partialDetachErr.message}`));
       }
-      cdpSession = null;
+    } else if (cdpSessionPromise) {
+      cdpSessionPromise.then(s => s.detach().catch(() => {})).catch(() => {});
     }
-    // Enhanced error context for CDP domain-specific debugging
-    const urlContext = safeHostname(currentUrl, `${currentUrl.substring(0, 50)}...`);
+    // Enhanced error context for CDP domain-specific debugging. Reuse the
+    // currentHostname computed at function entry (one URL parse vs two);
+    // only fall back to the truncated raw URL when that parse failed too.
+    const urlContext = currentHostname !== 'unknown'
+      ? currentHostname
+      : `${currentUrl.substring(0, 50)}...`;
     // Critical errors: browser is broken, propagate so the caller can restart.
     if (isCriticalCDPError(cdpErr.message)) {
@@ -265,7 +299,7 @@ async function createCDPSession(page, currentUrl, options = {}) {
     console.warn(formatLogMessage('warn', `${CDP_TAG} Failed to attach CDP session for ${urlContext}: ${cdpErr.message}`));
     // Return null session with no-op cleanup for consistent API
-    return createSessionResult();
+    return NOOP_SESSION_RESULT;
   }
 }
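
Review note on the orphan-cleanup pattern used in cdp.js above (and referenced from browserhealth.js): `Promise.race` only picks a winner, it does not cancel the loser. The sketch below shows the pattern generically. The body of `raceWithTimeout` here is a stand-in written for this note, since the package's own implementation is not shown in the diff, and `acquireWithTimeout` plus its `acquire`/`release` parameters are hypothetical names.

```javascript
// Stand-in timeout race (assumed behaviour: reject with `message` after `ms`).
function raceWithTimeout(promise, ms, message) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical generic form of createPageWithTimeout: keep a handle on the
// underlying promise so a late-arriving resource can still be released.
async function acquireWithTimeout(acquire, release, ms) {
  const pending = acquire();
  try {
    return await raceWithTimeout(pending, ms, 'acquire timeout');
  } catch (err) {
    // If `pending` resolves after the race was lost, release the orphan.
    // The trailing .catch() covers both a rejected `pending` (nothing to
    // release) and a failing release().
    pending.then(resource => release(resource)).catch(() => {});
    throw err;
  }
}

async function demo() {
  const released = [];
  const slowAcquire = () => new Promise(resolve => setTimeout(() => resolve('tab-1'), 50));
  try {
    await acquireWithTimeout(slowAcquire, r => released.push(r), 10);
  } catch (err) {
    console.log(err.message); // acquire timeout
  }
  await new Promise(resolve => setTimeout(resolve, 100)); // let the orphan arrive
  return released;
}

demo().then(released => console.log(released)); // [ 'tab-1' ]
```

Without the `pending.then(...)` chain, `'tab-1'` would resolve into nothing: in the real code that corresponds to a browser tab or CDP session that no caller holds a reference to.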