@fanboynz/network-scanner 3.0.2 → 3.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21)

package/CHANGELOG.md CHANGED

@@ -2,6 +2,40 @@
 All notable changes to the Network Scanner (nwss.js) project.
+## [3.0.3] - 2026-05-26
+### Improved
+- **3 DataDome-targeted gaps closed in `lib/fingerprint.js`** (inside `applyFingerprintProtection`, so gated on `siteConfig.fingerprint_protection` like every other spoof in that function):
+  - **`Notification.permission` static property** now returns `'default'` (real Chrome's no-granted-permission state). Previously only `Notification.requestPermission()` (the method) was patched; the static property still returned the headless default `'denied'` — a live tell for DataDome and similar detectors that read it directly.
+  - **`screen.orientation` interface** is now provided as a stable `{type: 'landscape-primary', angle: 0, addEventListener, lock, unlock, ...}` object when missing. Modern browsers always expose ScreenOrientation; absence is a "real browser?" check signal.
+  - **`<html>` `webdriver` DOM attribute** stripped if present. Defensive — modern Puppeteer with `ignoreDefaultArgs: ['--enable-automation']` doesn't emit this, but older driver setups do, and detectors check both `navigator.webdriver` AND `documentElement.getAttribute('webdriver')`. Appended to the existing `'webdriver removal'` safeExecute block so all webdriver cleanup lives together.
+  Targeted at sites running DataDome's `ct.captcha-delivery.com/i.js` (and similar fingerprint suites: PerimeterX, Akamai Bot Manager). Most other surfaces these detectors probe were already covered (chrome.app/csi/loadTimes, userAgentData, maxTouchPoints, permissions.query, WebGL UNMASKED_VENDOR/RENDERER, etc.). `scripts/test-stealth.js sannysoft` regression smoke holds at 29 passed / 1 warn / 0 failed (the warn is `CHR_DEBUG_TOOLS`, a CDP-attached signal that's fundamental to Puppeteer and unrelated to these additions). JS-only spoofing can't address TLS fingerprint, HTTP/2 fingerprint, IP reputation, or behavioural analysis — those still depend on proxy choice and `interact` / `ghost-cursor` config.
+### Added
+- **`scripts/test-stealth.js` now reports warn-row labels** for sannysoft, not just failure-row labels. Previously a cell moving from `passed` → `warn` between runs was invisible (only the count changed), making soft-regression debugging require `--headful`. Now the warn-row table contents print inline so you can see e.g. `warn rows: CHR_DEBUG_TOOLS` directly. Schema additive: result object gains a `warnings: string[]` array alongside the existing `failures: string[]`.
+- **`scripts/test-stealth.js` extracts CreepJS's actual current metrics** instead of a stale `Trust Score` regex that returned `n/a` for every field. New extracted fields: `fpId` (CreepJS's stable fingerprint hash, lets you A/B before/after a spoof change), `isChromium` (engine identification), `headlessPct` (HARD headless detection score, lower = better), `likeHeadlessPct` (SOFT headless signals), `stealthPct` (spoof-detection probes score, HIGHER = better since it means our spoofs LOOK convincing). Formatter prints all five with directionality hints inline. Excerpt is now 40 lines / 2KB (was 15 lines / 400 bytes) so future UI rotations are debuggable from the output without `--headful`.
+- **Additional headless-mode spoofs in `lib/fingerprint.js`** (all inside `applyFingerprintProtection`, gated on `siteConfig.fingerprint_protection`):
+  - **`matchMedia` hover/pointer queries**: `(any-hover: hover)`, `(any-hover: none)`, `(any-pointer: fine)`, `(any-pointer: none)`, `(any-pointer: coarse)` plus the legacy non-`any-` aliases. Headless Chrome reports no hover device and no fine pointer (no mouse hardware); detectors probe these as a binary 'real desktop hardware?' signal. Pass-through for all other queries (responsive, color-scheme, reduced-motion, etc.).
+  - **`screenLeft` / `screenTop` mirror `screenX` / `screenY`**. Real Chrome exposes these as identical-value legacy aliases; spoofers often leave them undefined or 0, which is inconsistent with the non-zero `screenX/Y` our existing patch produces.
+  - **Modern Chrome API stubs**: `document.hasStorageAccess()` → `Promise<true>`, `navigator.userActivation` → `{hasBeenActive: true, isActive: true}`, `navigator.getInstalledRelatedApps()` → `Promise<[]>`. Each gated on absence check so real-Chrome paths skip the override.
+  Honest measurement: CreepJS's specific `headless score` did NOT move after these additions (stayed at 67%). My prior estimate of '~-10 to -15 percentage points' was over-optimistic — CreepJS apparently doesn't weight matchMedia hover/pointer heavily in its headless calculation. The additions are still correct spoofs that close real fingerprint gaps and likely help against DataDome / PerimeterX which use different scoring; they're net-positive but score-neutral against CreepJS specifically. The remaining ~67% headless detection is architectural (CDP attachment, software-rasterizer GPU, no real mouse cursor) and can't be lowered without `--headful`.
+### Security
+- **WebRTC public-IP leak closed** in `lib/fingerprint.js` (`applyFingerprintProtection`). The previous local-IP filter only stripped RFC1918 private ranges (`10.x / 172.16-31.x / 192.168.x`), missing `srflx` (STUN-discovered PUBLIC IP), `prflx`, `relay`, and host candidates with non-RFC1918 addresses (CGNAT 100.64.0.0/10, link-local IPv6, real public IPs on bare-metal hosts). STUN traffic is UDP and **bypasses the SOCKS5 proxy entirely**, so the leaked IP was the real host IP regardless of proxy config — visible to any page that listened on `icecandidate` events. Caught by `test-stealth.js creepjs` which surfaced the candidate string `122.252.155.250 typ srflx` and the corresponding `ip:` field in its WebRTC panel. Fix: strip EVERY ICE candidate; deliver only the null-candidate sentinel (end-of-gathering signal). Side note: the property-based `pc.onicecandidate = fn` setter was also broken (stored handler but never wired it up); now mirrors the same filter as the addEventListener path. Side effect: any site that REQUIRES functional WebRTC peer connections sees ICE gathering produce zero candidates. For nwss.js's scanning use case this is correct.
+### Stealth hardening (toString masking)
+- **Added 8 session-introduced spoofs to `Function.prototype.toString` bulk masking** (`matchMedia`, `hasStorageAccess`, `getInstalledRelatedApps`, `userActivation` getter, `Notification.permission` getter, `screen.orientation` getter, `screenLeft`/`screenTop` getters). Without this, each new spoof was detectable via `.toString()` returning the override source instead of `[native code]`.
+- **Masked per-instance WebRTC `onicecandidate` getter/setter + `addEventListener` wrap.** The bulk-mask block only runs once at injection; per-RTCPeerConnection closures created inside the factory weren't covered. A site doing `Object.getOwnPropertyDescriptor(pc, 'onicecandidate').get.toString()` could see the spoof.
+- **Spoofed `navigator.productSub` + `vendorSub`** (UA-aware: `'20030107'` for Chrome/Safari/etc., `'20100101'` for Firefox; `vendorSub` always `''`). Companion legacy properties to the already-spoofed `vendor`/`product`. Common bot-detection signal since anti-detection libraries often spoof UA but forget these. `vendor`/`product` getters also added to the maskAsNative list (pre-existing oversight folded in).
+### Fixed
+- **`validatePageForInjection`'s 1.5s race timer is now `unref`'d.** Last remaining Node-side `setTimeout` that wasn't unref'd; could hold the event loop alive for up to 1.5s past scan completion. All Node-side timers in `lib/fingerprint.js`, `lib/nettools.js`, and `lib/socks-relay.js` are now unref'd.
+### Performance
+- **Canvas noise application now cached per `HTMLCanvasElement`** via WeakMap. `toDataURL` and `toBlob` previously did a `getImageData` + `putImageData` round-trip on every call (~500k iterations for size-capped canvases) to bake noise into the export. Now the round-trip runs once per canvas; subsequent exports skip it (the canvas backing store still has the noised pixels from the first call). Trade-off: animated canvases that redraw between exports won't have new content re-noised — acceptable for the common fingerprinter pattern (single probe → single toDataURL).
 ## [3.0.2] - 2026-05-25
 ### Security
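
The WebRTC entry in the 3.0.3 Security notes contrasts two filters. A minimal sketch of the difference follows. It assumes candidate lines in the usual `candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type>` layout. `oldFilterKeeps` and `newFilterKeeps` are hypothetical names, not functions from `lib/fingerprint.js`, and `203.0.113.7` is a documentation-range address standing in for a public IP.

```javascript
// Hypothetical helpers for illustration; not taken from lib/fingerprint.js.

// RFC1918 private ranges: 10/8, 172.16/12, 192.168/16.
const RFC1918 = /^(10\.|192\.168\.|172\.(1[6-9]|2\d|3[01])\.)/;

// Old behaviour as the changelog describes it: drop a candidate only when its
// address is in a private range. Everything else reaches the page.
function oldFilterKeeps(candidateLine) {
  const address = candidateLine.split(' ')[4]; // 5th field = connection address
  return !RFC1918.test(address);
}

// New behaviour: deliver only the null end-of-gathering sentinel.
function newFilterKeeps(candidate) {
  return candidate === null;
}

const srflx = 'candidate:1 1 udp 1677729535 203.0.113.7 46154 typ srflx';
const host = 'candidate:2 1 udp 2122260223 192.168.1.5 50000 typ host';

console.log(oldFilterKeeps(host));  // false
console.log(oldFilterKeeps(srflx)); // true  (the leak: a public address passes)
console.log(newFilterKeeps(srflx)); // false
console.log(newFilterKeeps(null));  // true
```

An address-range filter has to enumerate every address class that counts as sensitive, whereas a deny-all filter cannot miss one. That is the trade the changelog makes, at the cost of functional peer connections.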

package/lib/adblock-rust.js CHANGED

@@ -219,10 +219,20 @@ function parseAdblockRules(filePathOrArray, options = {}) {
       const buf = buffers[i];
       buffers[i] = null;
       const lines = buf.toString('utf-8').split('\n');
+      // Count actual rules for the startup banner. Skip:
+      //   - empty lines
+      //   - whitespace-only lines (trim then re-check length)
+      //   - '!'-prefixed comments (standard adblock)
+      //   - '['-prefixed filter list headers (e.g. '[Adblock Plus 2.0]')
+      // Previously only the empty-line and '!'-comment checks ran, both on the
+      // raw line, so whitespace-only lines + headers inflated the displayed count.
       for (let j = 0; j < lines.length; j++) {
         const line = lines[j];
         if (line.length === 0) continue;
-        if (line.charCodeAt(0) === 0x21) continue;
+        const trimmed = line.trim();
+        if (trimmed.length === 0) continue;
+        const c = trimmed.charCodeAt(0);
+        if (c === 0x21 || c === 0x5B) continue;  // '!' or '['
         ruleCount++;
       }
       filterSet.addFilters(lines);
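
The counting logic in this hunk can be exercised on its own. The sketch below lifts the same skip conditions into a standalone function. `countRules` and the sample list are hypothetical and exist only for illustration.

```javascript
// Hypothetical standalone version of the banner count shown in the hunk above.
function countRules(text) {
  let count = 0;
  for (const line of text.split('\n')) {
    const trimmed = line.trim();           // also strips a trailing '\r' from CRLF files
    if (trimmed.length === 0) continue;    // empty or whitespace-only
    const c = trimmed.charCodeAt(0);
    if (c === 0x21 || c === 0x5B) continue; // '!' comment or '[' list header
    count++;
  }
  return count;
}

const sample = [
  '[Adblock Plus 2.0]',
  '! Title: demo list',
  '',
  '   ',
  '||ads.example.com^',
  '  ||tracker.example.net^$third-party',
].join('\n');

console.log(countRules(sample)); // 2
```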
@@ -238,7 +248,12 @@ function parseAdblockRules(filePathOrArray, options = {}) {
         // up by the TTL prune on a future run) but the final cachePath is
         // either complete or absent — never half-written.
         const tmpPath = cachePath + '.' + process.pid + '.tmp';
-        fs.writeFileSync(tmpPath, Buffer.from(serialized));
+        // Buffer.from(buffer) ALWAYS copies — wasteful when adblock-rs's
+        // serialize() already returns a Buffer (binding-version dependent).
+        // For a ~10MB compiled engine that's a pointless 5-10ms allocate+
+        // memcpy on the cold-cache-write path.
+        const out = Buffer.isBuffer(serialized) ? serialized : Buffer.from(serialized);
+        fs.writeFileSync(tmpPath, out);
         fs.renameSync(tmpPath, cachePath);
         // Best-effort prune of stale cache files. Done after our own write so
         // we never delete the entry we just created.
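
The copy-versus-reuse distinction behind the `Buffer.isBuffer` guard can be checked directly in Node.js. A short sketch, in which the variable names are illustrative only:

```javascript
// Buffer.from(buffer) allocates a new Buffer and copies the bytes.
const serialized = Buffer.from([1, 2, 3]);
const copied = Buffer.from(serialized);
copied[0] = 9;
console.log(serialized[0]); // 1  (the original is untouched, so a copy was made)

// The guard from the hunk reuses an existing Buffer as-is...
const out = Buffer.isBuffer(serialized) ? serialized : Buffer.from(serialized);
console.log(out === serialized); // true

// ...and still converts other inputs, such as a plain Uint8Array.
const typed = new Uint8Array([4, 5, 6]);
const wrapped = Buffer.isBuffer(typed) ? typed : Buffer.from(typed);
console.log(Buffer.isBuffer(wrapped)); // true
```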
@@ -287,8 +302,6 @@ function parseAdblockRules(filePathOrArray, options = {}) {
   }
   return {
-    rules: { stats },
     shouldBlock(url, sourceUrl, resourceType) {
       // Avoid default-parameter syntax in the hot path — explicit null/undefined
       // checks are slightly cheaper for V8's argument adaptor.

package/lib/adblock.js CHANGED

@@ -85,22 +85,26 @@ function parseAdblockRules(filePath, options = {}) {
   const lines = fileContent.split('\n');
   const rules = {
-    domainMap: new Map(),          // ||domain.com^ - Exact domains for O(1) lookup
-    domainRules: [],               // ||*.domain.com^ - Wildcard domains (fallback)
-    thirdPartyRules: [],           // ||domain.com^$third-party
-    firstPartyRules: [],
-    pathRules: [],                 // /ads/*
-    scriptRules: [],               // .js$script
-    regexRules: [],                // /regex/
-    whitelist: [],                 // @@||domain.com^ - Wildcard whitelist
-    whitelistMap: new Map(),       // Exact whitelist domains for O(1) lookup
-    elementHiding: [],             // ##.ad-class (not used for network blocking)
+    domainMap: new Map(),                // ||domain.com^ - Exact domains for O(1) lookup
+    domainRules: [],                     // ||*.domain.com^ - Wildcard domains (fallback)
+    thirdPartyDomainMap: new Map(),      // ||domain.com^$third-party (exact)  — O(1)
+    thirdPartyRules: [],                 // wildcard / non-domain $third-party (fallback)
+    firstPartyDomainMap: new Map(),      // ||domain.com^$first-party (exact)  — O(1)
+    firstPartyRules: [],                 // wildcard / non-domain $first-party (fallback)
+    pathRules: [],                       // /ads/*
+    scriptRules: [],                     // .js$script
+    regexRules: [],                      // /regex/
+    whitelist: [],                       // @@||domain.com^ - Wildcard whitelist
+    whitelistMap: new Map(),             // Exact whitelist domains for O(1) lookup
+    elementHiding: [],                   // ##.ad-class (not used for network blocking)
     stats: {
       total: 0,
       domain: 0,
-      domainMapEntries: 0,         // Exact domain matches in Map
+      domainMapEntries: 0,               // Exact domain matches in Map
       thirdParty: 0,
+      thirdPartyMapEntries: 0,           // Exact-domain $third-party rules in Map
       firstParty: 0,
+      firstPartyMapEntries: 0,           // Exact-domain $first-party rules in Map
       path: 0,
       script: 0,
       regex: 0,
@@ -161,12 +165,28 @@ function parseAdblockRules(filePath, options = {}) {
       // Regular blocking rules
       const parsedRule = parseRule(line, false, enableLogging);
-      // Categorize based on rule type
+      // Categorize based on rule type. For $third-party and $first-party
+      // rules we additionally split out the exact-domain variants into a
+      // hash map keyed by hostname, mirroring the domainMap pattern. This
+      // turns the common `||example.com^$third-party` lookup from O(N) over
+      // thousands of array entries into O(1) by hostname (+ small parent
+      // walk). Wildcard / non-domain party rules still fall back to the
+      // linear array.
       if (parsedRule.isThirdParty) {
-        rules.thirdPartyRules.push(parsedRule);
+        if (parsedRule.isDomain && parsedRule.domain && !parsedRule.domain.includes('*')) {
+          rules.thirdPartyDomainMap.set(parsedRule.domain.toLowerCase(), parsedRule);
+          rules.stats.thirdPartyMapEntries++;
+        } else {
+          rules.thirdPartyRules.push(parsedRule);
+        }
         rules.stats.thirdParty++;
       } else if (parsedRule.isFirstParty) {
-        rules.firstPartyRules.push(parsedRule);
+        if (parsedRule.isDomain && parsedRule.domain && !parsedRule.domain.includes('*')) {
+          rules.firstPartyDomainMap.set(parsedRule.domain.toLowerCase(), parsedRule);
+          rules.stats.firstPartyMapEntries++;
+        } else {
+          rules.firstPartyRules.push(parsedRule);
+        }
         rules.stats.firstParty++;
       } else if (parsedRule.isDomain) {
         // Store exact domains in Map for O(1) lookup, wildcards in array
@@ -201,7 +221,11 @@ function parseAdblockRules(filePath, options = {}) {
     console.log(formatLogMessage('debug', `    • Exact matches (Map): ${rules.stats.domainMapEntries}`));
     console.log(formatLogMessage('debug', `    • Wildcard patterns (Array): ${rules.domainRules.length}`));
     console.log(formatLogMessage('debug', `  - Third-party rules: ${rules.stats.thirdParty}`));
+    console.log(formatLogMessage('debug', `    • Exact matches (Map): ${rules.stats.thirdPartyMapEntries}`));
+    console.log(formatLogMessage('debug', `    • Wildcard/path (Array): ${rules.thirdPartyRules.length}`));
     console.log(formatLogMessage('debug', `  - First-party rules: ${rules.stats.firstParty}`));
+    console.log(formatLogMessage('debug', `    • Exact matches (Map): ${rules.stats.firstPartyMapEntries}`));
+    console.log(formatLogMessage('debug', `    • Wildcard/path (Array): ${rules.firstPartyRules.length}`));
     console.log(formatLogMessage('debug', `  - Path rules: ${rules.stats.path}`));
     console.log(formatLogMessage('debug', `  - Script rules: ${rules.stats.script}`));
     console.log(formatLogMessage('debug', `  - Regex rules: ${rules.stats.regex}`));
@@ -445,7 +469,14 @@ function createMatcher(rules, options = {}) {
   let resultCacheHits = 0, resultCacheMisses = 0;
   let urlCacheHits = 0, urlCacheMisses = 0;
   let sourceCacheHits = 0, sourceCacheMisses = 0;
-  const hasPartyRules = rules.thirdPartyRules.length > 0 || rules.firstPartyRules.length > 0;
+  // Include the new domain-maps in the party-rules presence check — without
+  // this, a filter list whose $third-party rules ALL went into the Map (empty
+  // array) would never trigger third-party detection, silently disabling the
+  // entire third-party path.
+  const hasPartyRules = rules.thirdPartyRules.length > 0 ||
+                        rules.firstPartyRules.length > 0 ||
+                        rules.thirdPartyDomainMap.size > 0 ||
+                        rules.firstPartyDomainMap.size > 0;
   // Result cache uses FIFO eviction (see FIFOCache class comment) —
   // evicts oldest entries one at a time instead of clearing everything.
   const resultCache = new FIFOCache(32000);
@@ -634,6 +665,29 @@ function createMatcher(rules, options = {}) {
         // Check third-party rules
         if (isThirdParty) {
+          // Fast path: exact-domain $third-party rules (O(1) by hostname)
+          let rule = rules.thirdPartyDomainMap.get(lowerHostname);
+          if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
+            if (enableLogging) {
+              console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked third-party: ${url} (${rule.raw || rule.pattern})`));
+            }
+            const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'third_party_rule' };
+            resultCacheSet(url, sourceUrl, resourceType, r);
+            return r;
+          }
+          // Parent-domain $third-party rules — same walk as domainMap
+          for (let i = 0; i < parents.length; i++) {
+            rule = rules.thirdPartyDomainMap.get(parents[i]);
+            if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
+              if (enableLogging) {
+                console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked third-party: ${url} (${rule.raw || rule.pattern})`));
+              }
+              const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'third_party_rule' };
+              resultCacheSet(url, sourceUrl, resourceType, r);
+              return r;
+            }
+          }
+          // Slow path: wildcard / non-domain $third-party rules
           const thirdPartyLen = rules.thirdPartyRules.length;  // V8: Cache length
           for (let i = 0; i < thirdPartyLen; i++) {
             const rule = rules.thirdPartyRules[i];
@@ -650,6 +704,29 @@ function createMatcher(rules, options = {}) {
         // Check first-party rules
         if (!isThirdParty) {
+          // Fast path: exact-domain $first-party rules (O(1) by hostname)
+          let rule = rules.firstPartyDomainMap.get(lowerHostname);
+          if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
+            if (enableLogging) {
+              console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked first-party: ${url} (${rule.raw || rule.pattern})`));
+            }
+            const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'first_party_rule' };
+            resultCacheSet(url, sourceUrl, resourceType, r);
+            return r;
+          }
+          // Parent-domain $first-party rules
+          for (let i = 0; i < parents.length; i++) {
+            rule = rules.firstPartyDomainMap.get(parents[i]);
+            if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
+              if (enableLogging) {
+                console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked first-party: ${url} (${rule.raw || rule.pattern})`));
+              }
+              const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'first_party_rule' };
+              resultCacheSet(url, sourceUrl, resourceType, r);
+              return r;
+            }
+          }
+          // Slow path: wildcard / non-domain $first-party rules
           const firstPartyLen = rules.firstPartyRules.length;
           for (let i = 0; i < firstPartyLen; i++) {
             const rule = rules.firstPartyRules[i];
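
The fast path added in these hunks is an exact-hostname lookup followed by a walk over parent domains. A minimal sketch of that lookup, under stated assumptions: `parentDomains` and `lookupExactRule` are hypothetical helpers, and how the real matcher builds its `parents` array is not visible in this diff.

```javascript
// Hypothetical helper: parent domains of a hostname, most specific first,
// stopping before the bare top-level label.
function parentDomains(hostname) {
  const labels = hostname.split('.');
  const parents = [];
  for (let i = 1; i < labels.length - 1; i++) {
    parents.push(labels.slice(i).join('.'));
  }
  return parents;
}

// Exact match first, then each parent; every step is one Map.get().
function lookupExactRule(domainMap, hostname) {
  const lower = hostname.toLowerCase();
  const direct = domainMap.get(lower);
  if (direct) return direct;
  for (const parent of parentDomains(lower)) {
    const rule = domainMap.get(parent);
    if (rule) return rule;
  }
  return null;
}

const thirdPartyDomainMap = new Map([
  ['example.com', { raw: '||example.com^$third-party' }],
]);

console.log(parentDomains('cdn.ads.example.com')); // [ 'ads.example.com', 'example.com' ]
console.log(lookupExactRule(thirdPartyDomainMap, 'CDN.ads.Example.com').raw); // ||example.com^$third-party
console.log(lookupExactRule(thirdPartyDomainMap, 'example.org')); // null
```

The cost is bounded by the number of labels in the hostname rather than by the number of rules. Note that `Map.prototype.set` replaces any existing entry for the same key, so a plain Map like the one in this sketch holds one rule per hostname.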

package/lib/browserhealth.js CHANGED

@@ -107,10 +107,13 @@ async function performGroupWindowCleanup(browserInstance, groupDescription, forc
     // Identify the main Puppeteer window (should be about:blank or the initial page)
     let mainPuppeteerPage = null;
     let pagesToClose = [];
-    // Find the main page - typically the first page that's about:blank or has been there longest
+    // First pass: synchronous categorization. Separate blank pages from
+    // content pages so the conservative-mode isPageFromPreviousScan() checks
+    // can run in parallel via Promise.all below, instead of N sequential
+    // awaits (each potentially a CDP roundtrip for page.title()).
+    const contentPages = [];
     for (const page of allPages) {
-      // Cache page.url() call to avoid repeated DOM/browser communication
       const pageUrl = page.url();
       if (pageUrl === 'about:blank' || pageUrl === '' || pageUrl.startsWith('chrome://')) {
         if (!mainPuppeteerPage) {
@@ -119,18 +122,21 @@ async function performGroupWindowCleanup(browserInstance, groupDescription, forc
           pagesToClose.push(page); // Additional blank pages can be closed
         }
       } else {
-        // Any page with actual content should be evaluated for closure
-        if (cleanupMode === "all") {
-          // Aggressive mode: close all content pages
-          pagesToClose.push(page);
-        } else {
-          // Conservative mode: only close pages that look like leftovers from previous scans
-          // Keep pages that might still be actively used
-          const isOldPage = await isPageFromPreviousScan(page, forceDebug);
-          if (isOldPage) {
-            pagesToClose.push(page);
-          }
-        }
+        contentPages.push(page);
+      }
+    }
+    if (cleanupMode === "all") {
+      // Aggressive mode: close all content pages — no per-page async check
+      for (const page of contentPages) pagesToClose.push(page);
+    } else {
+      // Conservative mode: run the isPageFromPreviousScan checks in parallel
+      // and collect the leftovers in original order.
+      const checks = await Promise.all(
+        contentPages.map(page => isPageFromPreviousScan(page, forceDebug))
+      );
+      for (let i = 0; i < contentPages.length; i++) {
+        if (checks[i]) pagesToClose.push(contentPages[i]);
       }
     }
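
The conservative-mode change relies on `Promise.all` returning results in input order regardless of which check finishes first. A small sketch of that property, where `filterInOrder` and the fake page objects are hypothetical stand-ins for the real pages and `isPageFromPreviousScan`:

```javascript
// Run an async predicate over all items concurrently, then keep the items
// whose predicate resolved truthy, preserving the original order.
async function filterInOrder(items, asyncPredicate) {
  const flags = await Promise.all(items.map(item => asyncPredicate(item)));
  return items.filter((_, i) => flags[i]);
}

// Fake check that resolves after a per-item delay, so completion order
// differs from input order.
const fakeCheck = page =>
  new Promise(resolve => setTimeout(resolve, page.delayMs, page.stale));

const pages = [
  { name: 'a', stale: true, delayMs: 30 },
  { name: 'b', stale: false, delayMs: 10 },
  { name: 'c', stale: true, delayMs: 1 },
];

filterInOrder(pages, fakeCheck).then(kept => {
  console.log(kept.map(p => p.name)); // [ 'a', 'c' ]
});
```

One caveat of the general technique: if any predicate rejects, `Promise.all` rejects as a whole, so this shape suits checks that handle their own errors.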
@@ -391,12 +397,13 @@ async function performRealtimeWindowCleanup(browserInstance, threshold = REALTIM
           if (forceDebug) {
             console.log(formatLogMessage('debug', `${REALTIME_CLEANUP_TAG} Found ${contextPages.length} pages in popup context`));
           }
-          // Close popup context pages
-          for (const page of contextPages) {
-            if (!page.isClosed()) {
-              await page.close();
-            }
-          }
+          // Close popup context pages in parallel — each close is
+          // independent and the sequential await was both slow AND would
+          // abort the whole loop on the first close failure, leaking the
+          // remaining pages. .catch() per page ensures we attempt all.
+          await Promise.all(contextPages.map(page =>
+            page.isClosed() ? undefined : page.close().catch(() => {})
+          ));
         }
       }
     } catch (contextErr) {
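
The per-page `.catch()` in the hunk above is what lets every close be attempted. A sketch with fake page objects follows. `closeAll` and `makePage` are hypothetical, and the objects only mimic the `isClosed()` and `close()` calls the hunk uses.

```javascript
// Close every open page concurrently; a failing close does not stop the rest.
async function closeAll(pages) {
  await Promise.all(pages.map(page =>
    page.isClosed() ? undefined : page.close().catch(() => {})
  ));
}

// Fake page: close() rejects when `fails` is true.
function makePage(fails, alreadyClosed = false) {
  return {
    closed: alreadyClosed,
    closeCalls: 0,
    isClosed() { return this.closed; },
    async close() {
      this.closeCalls++;
      if (fails) throw new Error('Target closed');
      this.closed = true;
    },
  };
}

const demoPages = [makePage(false), makePage(true), makePage(false), makePage(false, true)];

closeAll(demoPages).then(() => {
  console.log(demoPages.map(p => p.closed));     // [ true, false, true, true ]
  console.log(demoPages.map(p => p.closeCalls)); // [ 1, 1, 1, 0 ]
});
```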
@@ -646,11 +653,17 @@ async function testNetworkCapability(browserInstance, timeout = 10000) {
   const startTime = Date.now();
   let testPage = null;
+  // Hoisted so the catch can attach an orphan-close chain. Promise.race
+  // cannot cancel browser.newPage() — if the race times out, the underlying
+  // call may still resolve to a real Page tab nothing references. Same
+  // pattern as cdp.js (commit 0772ccd) and clear_sitedata.js (commit 780b443).
+  let testPagePromise = null;
   try {
     // Create test page
+    testPagePromise = browserInstance.newPage();
     testPage = await raceWithTimeout(
-      browserInstance.newPage(),
+      testPagePromise,
       timeout,
       'Test page creation timeout'
     );
@@ -673,21 +686,26 @@ async function testNetworkCapability(browserInstance, timeout = 10000) {
     result.responseTime = Date.now() - startTime;
   } catch (error) {
+    // Orphan cleanup: if testPage is null but newPage() was started, the
+    // race timed out before assignment. Close the orphan when it arrives.
+    if (!testPage && testPagePromise) {
+      testPagePromise.then(p => p.close().catch(() => {})).catch(() => {});
+    }
     result.error = error.message;
     result.responseTime = Date.now() - startTime;
     // Classify the error type
-    if (error.message.includes('Network.enable') ||
+    if (error.message.includes('Network.enable') ||
         error.message.includes('timed out') ||
         error.message.includes('Protocol error')) {
       result.error = `Network capability test failed: ${error.message}`;
     }
   } finally {
     if (testPage && !testPage.isClosed()) {
-      try {
-        await testPage.close();
-      } catch (closeErr) {
-        /* ignore cleanup errors */
+      try {
+        await testPage.close();
+      } catch (closeErr) {
+        /* ignore cleanup errors */
       }
     }
   }
@@ -740,9 +758,15 @@ async function checkBrowserHealth(browserInstance, timeout = 8000) {
     // Test 4: Create a single test page to verify both browser functionality AND network capability
     let testPage = null;
+    // Same orphan-cleanup pattern as testNetworkCapability above + cdp.js +
+    // clear_sitedata.js. Promise.race can't cancel newPage() — if the race
+    // times out the underlying call may still produce a Page tab nothing
+    // references → leaked tab.
+    let testPagePromise = null;
     try {
+      testPagePromise = browserInstance.newPage();
       testPage = await raceWithTimeout(
-        browserInstance.newPage(),
+        testPagePromise,
         timeout,
         'Page creation timeout'
       );
@@ -780,6 +804,11 @@ async function checkBrowserHealth(browserInstance, timeout = 8000) {
       await testPage.close();
     } catch (pageTestError) {
+      // Orphan cleanup: if testPage is null but newPage was started, the
+      // race timed out before assignment. Close the orphan when it arrives.
+      if (!testPage && testPagePromise) {
+        testPagePromise.then(p => p.close().catch(() => {})).catch(() => {});
+      }
       if (testPage && !testPage.isClosed()) {
         try { await testPage.close(); } catch (e) { /* ignore */ }
       }

package/lib/cdp.js CHANGED

@@ -48,7 +48,8 @@ function raceWithTimeout(promise, ms, message) {
 }
 // Shared no-op cleanup used by every no-CDP / CDP-failed return path. Hoisted
-// so createSessionResult() doesn't allocate a fresh `async () => {}` per call.
+// so the success path doesn't allocate a fresh `async () => {}` per call
+// when cleanup logic isn't needed, and so NOOP_SESSION_RESULT can reuse it.
 const NOOP_CLEANUP = async () => {};
 /**
@@ -74,27 +75,39 @@ function isCriticalCDPError(message) {
          message.includes('Browser has been closed');
 }
-/**
- * Creates a standardized session result object for consistent V8 optimization
- * @param {object|null} session - CDP session or null
- * @param {Function} cleanup - Cleanup function
- * @param {boolean} isEnhanced - Whether enhanced features are active
- * @returns {object} Standardized session object
- */
-const createSessionResult = (session = null, cleanup = NOOP_CLEANUP, isEnhanced = false) => ({
-  session,
-  cleanup,
-  isEnhanced
+// Pre-allocated singleton for both the early-exit case (CDP not enabled OR
+// not in debug mode) AND the non-critical-error path. Frozen so callers can't
+// mutate the shared instance. Result shape is {session, cleanup}; previously
+// also carried an `isEnhanced: false` field that had zero consumers anywhere.
+const NOOP_SESSION_RESULT = Object.freeze({
+  session: null,
+  cleanup: NOOP_CLEANUP
 });
 /**
- * Creates a new page with timeout protection to prevent CDP hangs
+ * Creates a new page with timeout protection to prevent CDP hangs.
+ *
+ * Orphan-page handling: Promise.race cannot cancel browser.newPage(). If the
+ * timer wins, the underlying call keeps running and eventually resolves to a
+ * real Page tab nothing references → leaked tab in the browser. We capture
+ * the original promise and attach a close-on-resolve cleanup so the orphan
+ * is reaped if it arrives after the race lost.
+ *
  * @param {import('puppeteer').Browser} browser - Browser instance
  * @param {number} timeout - Timeout in milliseconds (default: 30000)
  * @returns {Promise<import('puppeteer').Page>} Page instance
  */
 async function createPageWithTimeout(browser, timeout = 30000) {
-  return raceWithTimeout(browser.newPage(), timeout, 'Page creation timeout - browser may be unresponsive');
+  const pagePromise = browser.newPage();
+  try {
+    return await raceWithTimeout(pagePromise, timeout, 'Page creation timeout - browser may be unresponsive');
+  } catch (err) {
+    // If pagePromise eventually resolves after the race gave up, close the
+    // orphan tab. .catch(() => {}) handles the case where pagePromise also
+    // rejected (no resource to clean up).
+    pagePromise.then(p => p.close().catch(() => {})).catch(() => {});
+    throw err;
+  }
 }
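
The orphan-handling pattern described in the doc comment above can be reproduced without a browser. The sketch below is written under assumptions: the body of `raceWithTimeout` is not part of this diff, so the version here is a stand-in with the same signature, and `createWithOrphanCleanup` and `fakeNewPage` are hypothetical names.

```javascript
// Assumed stand-in for the module's raceWithTimeout(promise, ms, message).
function raceWithTimeout(promise, ms, message) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Generic form of the pattern: `create` returns a promise for a resource
// that exposes an async close().
async function createWithOrphanCleanup(create, ms) {
  const pending = create();
  try {
    return await raceWithTimeout(pending, ms, 'creation timeout');
  } catch (err) {
    // Timer won: close the resource if it ever arrives. If `pending` itself
    // rejected there is nothing to close and the trailing catch swallows it.
    pending.then(r => r.close().catch(() => {})).catch(() => {});
    throw err;
  }
}

// Fake resource factory that resolves after `delayMs`.
function fakeNewPage(delayMs) {
  const page = { closed: false, async close() { this.closed = true; } };
  const promise = new Promise(resolve => setTimeout(resolve, delayMs, page));
  return { page, promise };
}
```

Calling `createWithOrphanCleanup(() => fakeNewPage(50).promise, 10)` rejects with the timeout error, and the late-arriving fake page is closed once its promise resolves. Losing the race does not cancel the underlying work, so the cleanup has to be chained onto the original promise.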
 /**
@@ -171,7 +184,7 @@ async function createCDPSession(page, currentUrl, options = {}) {
   const cdpLoggingNeeded = (enableCDP || siteSpecificCDP === true) && forceDebug;
   if (!cdpLoggingNeeded) {
-    return createSessionResult();
+    return NOOP_SESSION_RESULT;
   }
   // Parse the current URL hostname once and reuse it for the mode-log line,
@@ -187,11 +200,16 @@ async function createCDPSession(page, currentUrl, options = {}) {
   }
   let cdpSession = null;
+  let cdpSessionPromise = null;
   try {
-    // Create CDP session using modern Puppeteer 20+ API
-    // Add timeout protection for CDP session creation
-    cdpSession = await raceWithTimeout(page.createCDPSession(), 20000, 'CDP session creation timeout');
+    // Create CDP session using modern Puppeteer 20+ API.
+    // Capture the promise BEFORE racing so the catch block can attach an
+    // orphan-cleanup chain — if our race times out but the underlying
+    // createCDPSession() later resolves, we'd otherwise leak a CDP session
+    // on the browser side that nothing references.
+    cdpSessionPromise = page.createCDPSession();
+    cdpSession = await raceWithTimeout(cdpSessionPromise, 20000, 'CDP session creation timeout');
     // Enable network domain — required for network event monitoring. This is
     // the operation the rest of the codebase has learned can hang under
@@ -221,10 +239,13 @@ async function createCDPSession(page, currentUrl, options = {}) {
     console.log(formatLogMessage('debug', `${CDP_TAG} CDP session created successfully for ${currentUrl}`));
-    return createSessionResult(
-      cdpSession,
-      async () => {
-        // Safe cleanup that never throws errors
+    return {
+      session: cdpSession,
+      cleanup: async () => {
+        // Safe cleanup that never throws errors. Idempotent — null out the
+        // captured reference after the first successful detach so a
+        // double-cleanup is a true no-op instead of generating a misleading
+        // "Failed to detach: Session closed" debug log on the second call.
         if (cdpSession) {
           try {
             await cdpSession.detach();
@@ -232,28 +253,41 @@ async function createCDPSession(page, currentUrl, options = {}) {
           } catch (cdpCleanupErr) {
             // Log cleanup errors but don't throw - cleanup should never fail the calling code
             console.log(formatLogMessage('debug', `${CDP_TAG} Failed to detach CDP session for ${currentUrl}: ${cdpCleanupErr.message}`));
+          } finally {
+            cdpSession = null;
           }
         }
-      },
-      false
-    );
+      }
+    };
   } catch (cdpErr) {
-    // If the session was created but a subsequent send/wire-up failed, detach
-    // it so we don't leak a half-attached session. Previously the code just
-    // nulled the local and orphaned the session. We're already past the
-    // cdpLoggingNeeded gate here so forceDebug is true — log a failed detach
-    // instead of swallowing it, so partial-cleanup failures aren't invisible.
+    // Two distinct cleanup paths depending on where the failure was:
+    //
+    //   a) cdpSession IS set → failure was AFTER createCDPSession() resolved
+    //      (e.g. Network.enable timed out). We have a real handle — detach
+    //      directly. Previously the code just nulled the local and orphaned
+    //      the session; now we detach and log any failure.
+    //
+    //   b) cdpSession is null but cdpSessionPromise was started → the race
+    //      timed out before assignment. The underlying createCDPSession()
+    //      may still resolve later, producing an orphan session on the
+    //      browser side. Attach a detach-on-resolve chain; .catch(()=>{})
+    //      swallows the case where the underlying promise also rejected.
     if (cdpSession) {
       try { await cdpSession.detach(); }
       catch (partialDetachErr) {
         console.log(formatLogMessage('debug', `${CDP_TAG} Partial-session detach failed for ${currentUrl}: ${partialDetachErr.message}`));
       }
-      cdpSession = null;
+    } else if (cdpSessionPromise) {
+      cdpSessionPromise.then(s => s.detach().catch(() => {})).catch(() => {});
     }
-    // Enhanced error context for CDP domain-specific debugging
-    const urlContext = safeHostname(currentUrl, `${currentUrl.substring(0, 50)}...`);
+    // Enhanced error context for CDP domain-specific debugging. Reuse the
+    // currentHostname computed at function entry (one URL parse vs two);
+    // only fall back to the truncated raw URL when that parse failed too.
+    const urlContext = currentHostname !== 'unknown'
+      ? currentHostname
+      : `${currentUrl.substring(0, 50)}...`;
     // Critical errors: browser is broken, propagate so the caller can restart.
     if (isCriticalCDPError(cdpErr.message)) {
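
The idempotent `cleanup` closure from this hunk can be sketched in isolation. `makeCleanup` and the fake session are hypothetical; the fake rejects a second `detach()` to mimic a session that is already closed.

```javascript
// Build a cleanup function that detaches at most once. The captured
// reference is cleared in `finally`, so later calls return immediately.
function makeCleanup(session, log) {
  let current = session;
  return async function cleanup() {
    if (!current) return;
    try {
      await current.detach();
    } catch (err) {
      log(`Failed to detach: ${err.message}`); // cleanup never throws
    } finally {
      current = null;
    }
  };
}

// Fake session: the first detach succeeds, any later one rejects.
function makeFakeSession() {
  return {
    detachCalls: 0,
    async detach() {
      this.detachCalls++;
      if (this.detachCalls > 1) throw new Error('Session closed');
    },
  };
}

const session = makeFakeSession();
const logs = [];
const cleanup = makeCleanup(session, msg => logs.push(msg));

cleanup()
  .then(() => cleanup())
  .then(() => {
    console.log(session.detachCalls); // 1
    console.log(logs.length);         // 0
  });
```

In this sketch the guard covers sequential calls. Two calls started before the first `detach()` settles would both pass the `if (!current)` check.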
@@ -265,7 +299,7 @@ async function createCDPSession(page, currentUrl, options = {}) {
     console.warn(formatLogMessage('warn', `${CDP_TAG} Failed to attach CDP session for ${urlContext}: ${cdpErr.message}`));
     // Return null session with no-op cleanup for consistent API
-    return createSessionResult();
+    return NOOP_SESSION_RESULT;
   }
 }
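
The `NOOP_SESSION_RESULT` singleton returned by the early-exit and non-critical-error paths depends on `Object.freeze` to keep the shared instance safe. A short sketch of what that does and does not guarantee, where `getResult` is a hypothetical caller:

```javascript
const NOOP_CLEANUP = async () => {};

// One shared, frozen instance instead of a fresh object per call.
const NOOP_SESSION_RESULT = Object.freeze({
  session: null,
  cleanup: NOOP_CLEANUP,
});

// Hypothetical caller: returns the singleton whenever no session is needed.
function getResult(needed) {
  return needed
    ? { session: { id: 'demo' }, cleanup: async () => {} }
    : NOOP_SESSION_RESULT;
}

console.log(getResult(false) === getResult(false)); // true
console.log(Object.isFrozen(NOOP_SESSION_RESULT));  // true

// Writes to a frozen object throw a TypeError in strict mode and are
// silently ignored otherwise; either way the value does not change.
try {
  NOOP_SESSION_RESULT.session = { id: 'oops' };
} catch (err) {
  // strict-mode TypeError
}
console.log(NOOP_SESSION_RESULT.session); // null
```

`Object.freeze` is shallow, which is sufficient here because the two properties are `null` and a function that callers only invoke.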