@fanboynz/network-scanner 3.0.3 → 3.1.2

package/CHANGELOG.md CHANGED
@@ -2,6 +2,59 @@
2
2
 
3
3
  All notable changes to the Network Scanner (nwss.js) project.
4
4
 
5
+ ## [3.1.1] - 2026-05-30
6
+
7
+ ### Changed
8
+ - **Fingerprint identity pinned to Stable Chrome 148**, not whatever Chrome-for-Testing puppeteer bundles (currently 149, ahead of Stable). The spoof must blend with the real-world population; claiming an unreleased build is itself a tell. The Chrome major + build (`CHROME_BUILD`) + GREASE brand (`CHROME_GREASE_BRAND`) are now single constants — see `lib/fingerprint.md`.
9
+ - **UA Client Hints made fully consistent and matched to real Chrome 148** (verified field-for-field against a live desktop): brand-list order + GREASE string (`Not/A)Brand`), and the full-version build (`148.0.7778.217`) sourced from one place so JS `getHighEntropyValues` and the HTTP `Sec-CH-UA-Full-Version*` headers can't drift. Added `wow64`, `model`, `formFactors`, `uaFullVersion`, and `Sec-CH-UA-WoW64`/`-Model`/`-Form-Factors` headers; Windows `platformVersion` → `19.0.0`.
10
+ - **`navigator.deviceMemory` and `Sec-CH-Device-Memory` both pinned to `8`** (consistent JS↔HTTP), hiding the host's real RAM; `hardwareConcurrency` reports 4–8 (hides datacenter core count).
11
+ - **Dependencies**: puppeteer / puppeteer-core 25.1.0, lru-cache 11.5.1.
12
+
13
+ ### Fixed
14
+ - **Timezone is now spoofed via CDP `emulateTimezone`** instead of JS overrides, so `Date`, `Intl`, and `getTimezoneOffset` are all consistent and DST-correct. The old JS patching left the real `Date` in the host zone — an 8-hour `Date`-vs-`Intl` contradiction and a leaked host timezone.
15
+ - **Closed several headless tells**: Battery now reports the plugged-in default (`charging:true, level:1`); `navigator.bluetooth`, `navigator.share`/`canShare` stubs added (present in real Chrome, absent in headless); `speechSynthesis.getVoices()` returns the claimed-OS voice set (`instanceof`-correct).
16
+ - **proxy**: a string `proxy_bypass`/`socks5_bypass` (instead of an array) no longer throws `bypass.join is not a function` in the browser-launch path.
17
+ - **socks-relay**: a client that disconnects during the upstream-connect await is now handled, so a tunnel isn't opened for a gone client and the watchdog clears immediately.
18
+ - **smart-cache**: the memory-check and auto-save `setInterval`s are now `unref`'d, so an error path that skips `destroy()` can no longer hang the process.
19
+
20
+ ### Removed
21
+ - Dead code: `browserhealth` `testNetworkCapability` + `purgeStaleTrackers` (zero callers), and a redundant 2-voice `speechSynthesis` block superseded by the full voice set.
22
+
23
+ ### Added
24
+ - **`lib/fingerprint.md`** — fingerprint spoofing coverage tables (surfaces, mitigations, gating flags) and known limitations.
25
+
26
+ ## [3.1.0] - 2026-05-29
27
+
28
+ ### Added
29
+ - **`realistic_click`** site flag — denser mouse approach, hold tremor, and mouseup drift for sites that score click realism.
30
+ - **`interact_click_count`** site override for popunder-discovery click volume (default content-click count also raised 2 → 3).
31
+ - **`clear_sitedata_full_on_reload`** site flag — full storage clear between reloads; quick mode now also clears localStorage/sessionStorage.
32
+ - **regex-tool rewritten** as a real `filterRegex` builder/tester: literal↔standard↔JSON conversion, multi-pattern + `regex_and`, and testing against real request URLs (matching mirrors the scanner exactly).
33
+ - **Fingerprint coverage**: per-domain-seeded Battery / `navigator.connection` values, `AudioBuffer` fingerprint defeat, `PerformanceNavigationTiming` jitter, `userActivation`; UA strings bumped to Chrome 148 / Firefox 151 / Safari 19.5.
34
+
35
+ ### Changed
36
+ - **`userAgent` now defaults to `"chrome"`** when a site doesn't set one — previously sites without it leaked the bundled `HeadlessChrome` UA.
37
+ - **`Sec-CH-UA` headers and the curl content-fetch UA derive from the single UA source**, so Client Hints can't drift from `navigator.userAgent`.
38
+ - **VPN configs force scan concurrency to 1** — the shared system routing table isn't concurrency-safe.
39
+ - **Interaction time ceiling scales with the work envelope** (click count / `realistic_click`) instead of a flat 15s.
40
+
41
+ ### Fixed
42
+ - **Per-URL timeout scales** with site timeout/delay/reload (+8s recovery grace) instead of a flat 75s that discarded partial-match recovery on multi-URL scans.
43
+ - **Interaction hard cap is now actually enforced** (was cooperative, overshooting to 20s+ under concurrency).
44
+ - **WireGuard** inline temp-config leaked the private key on failed connect and broke retries; temp dir is now per-PID so concurrent processes can't wipe each other's config.
45
+ - **nettools**: fixed a dig dedup race (concurrent same-domain double lookups); whois no longer discards valid records over non-fatal stderr.
46
+ - **Orphan resource leaks** on `Promise.race` timeout (cdp.js, clear_sitedata.js, browserhealth.js) and several un-`unref`'d `setTimeout` handles.
47
+ - **Config keys validated at startup** with boolean-like coercion, preventing silent misconfiguration.
48
+
49
+ ### Security
50
+ - **OpenVPN** `pkill`/`ping`/`curl` calls moved from shell-interpolated `execSync` to `spawnSync` arg arrays (command-injection).
51
+ - **WireGuard/OpenVPN interface & connection names validated** against a strict charset before use in paths/commands.
52
+
53
+ ### Performance
54
+ - **adblock**: O(1) exact-domain lookup for `$third-party` / `$first-party` rules.
55
+ - Parallelized site-data clearing and window-cleanup checks.
56
+ - Removed dead code across cdp, domain-cache, searchstring, compress, adblock-rust, and nettools.
57
+
5
58
  ## [3.0.3] - 2026-05-26
6
59
 
7
60
  ### Improved
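The smart-cache entry under 3.1.1 above relies on a documented Node.js behavior: a timer that has been `unref()`'d does not keep the event loop alive on its own. The following is a minimal sketch of that idea, not the package's code; `startMaintenance`, `checkMemory`, and `autoSave` are hypothetical names, and the intervals are arbitrary.

```javascript
// Hypothetical helper: startMaintenance, checkMemory and autoSave are
// illustrative names and do not come from the package.
function startMaintenance(checkMemory, autoSave) {
  // unref() tells Node.js not to keep the process alive for these timers alone.
  const memoryTimer = setInterval(checkMemory, 30_000).unref();
  const saveTimer = setInterval(autoSave, 60_000).unref();
  return {
    memoryTimer,
    saveTimer,
    // destroy() remains the tidy shutdown path; unref() only covers the
    // case where an error path skips it.
    destroy() {
      clearInterval(memoryTimer);
      clearInterval(saveTimer);
    },
  };
}

const maintenance = startMaintenance(() => {}, () => {});
console.log(maintenance.memoryTimer.hasRef()); // false
```

Because neither timer holds a reference, this script exits on its own even though `destroy()` is never called.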
@@ -219,10 +219,20 @@ function parseAdblockRules(filePathOrArray, options = {}) {
  const buf = buffers[i];
  buffers[i] = null;
  const lines = buf.toString('utf-8').split('\n');
+ // Count actual rules for the startup banner. Skip:
+ // - empty lines
+ // - whitespace-only lines (trim then re-check length)
+ // - '!'-prefixed comments (standard adblock)
+ // - '['-prefixed filter list headers (e.g. '[Adblock Plus 2.0]')
+ // Previously only the first two skip conditions ran on the raw line,
+ // so whitespace lines + headers inflated the displayed count.
  for (let j = 0; j < lines.length; j++) {
  const line = lines[j];
  if (line.length === 0) continue;
- if (line.charCodeAt(0) === 0x21) continue;
+ const trimmed = line.trim();
+ if (trimmed.length === 0) continue;
+ const c = trimmed.charCodeAt(0);
+ if (c === 0x21 || c === 0x5B) continue; // '!' or '['
  ruleCount++;
  }
  filterSet.addFilters(lines);
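To make the effect of the new skip conditions concrete, the counting loop from the hunk above can be pulled into a standalone function. This is a sketch only: `countRules` and the sample list are hypothetical and are not part of the package.

```javascript
// Sketch of the counting logic as a standalone function.
// countRules is a hypothetical name, not part of the package.
function countRules(text) {
  let ruleCount = 0;
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    if (trimmed.length === 0) continue;     // empty or whitespace-only
    const c = trimmed.charCodeAt(0);
    if (c === 0x21 || c === 0x5b) continue; // '!' comment or '[' header
    ruleCount++;
  }
  return ruleCount;
}

const sample = [
  '[Adblock Plus 2.0]',
  '! Title: example list',
  '',
  '   ',
  '||ads.example.com^',
  '  ! indented comment',
  '/banner/*',
].join('\n');

console.log(countRules(sample)); // 2
```

With the old raw-line checks, the header, the whitespace-only line, and the indented comment in this sample would each have been counted as a rule.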
@@ -238,7 +248,12 @@ function parseAdblockRules(filePathOrArray, options = {}) {
238
248
  // up by the TTL prune on a future run) but the final cachePath is
239
249
  // either complete or absent — never half-written.
240
250
  const tmpPath = cachePath + '.' + process.pid + '.tmp';
241
- fs.writeFileSync(tmpPath, Buffer.from(serialized));
251
+ // Buffer.from(buffer) ALWAYS copies — wasteful when adblock-rs's
252
+ // serialize() already returns a Buffer (binding-version dependent).
253
+ // For a ~10MB compiled engine that's a pointless 5-10ms allocate+
254
+ // memcpy on the cold-cache-write path.
255
+ const out = Buffer.isBuffer(serialized) ? serialized : Buffer.from(serialized);
256
+ fs.writeFileSync(tmpPath, out);
242
257
  fs.renameSync(tmpPath, cachePath);
243
258
  // Best-effort prune of stale cache files. Done after our own write so
244
259
  // we never delete the entry we just created.
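The copy-avoidance in the hunk above rests on two documented `Buffer` behaviors: `Buffer.from(buffer)` always copies, and `Buffer.isBuffer()` lets the caller skip that copy. A small sketch, assuming a hypothetical `toWritableBuffer` helper (the name does not appear in the package):

```javascript
// toWritableBuffer is a hypothetical helper name used only for this sketch.
function toWritableBuffer(serialized) {
  // Reuse an existing Buffer as-is; wrap anything else (e.g. a Uint8Array).
  return Buffer.isBuffer(serialized) ? serialized : Buffer.from(serialized);
}

const alreadyBuffer = Buffer.from([1, 2, 3]);
const typedArray = new Uint8Array([1, 2, 3]);

console.log(toWritableBuffer(alreadyBuffer) === alreadyBuffer); // true: no copy made
console.log(Buffer.isBuffer(toWritableBuffer(typedArray)));     // true: converted

// Buffer.from(buffer) copies, so writes to the copy do not touch the source.
const copy = Buffer.from(alreadyBuffer);
copy[0] = 99;
console.log(alreadyBuffer[0]); // 1
```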
@@ -287,8 +302,6 @@ function parseAdblockRules(filePathOrArray, options = {}) {
287
302
  }
288
303
 
289
304
  return {
290
- rules: { stats },
291
-
292
305
  shouldBlock(url, sourceUrl, resourceType) {
293
306
  // Avoid default-parameter syntax in the hot path — explicit null/undefined
294
307
  // checks are slightly cheaper for V8's argument adaptor.
package/lib/adblock.js CHANGED
@@ -85,22 +85,26 @@ function parseAdblockRules(filePath, options = {}) {
85
85
  const lines = fileContent.split('\n');
86
86
 
87
87
  const rules = {
88
- domainMap: new Map(), // ||domain.com^ - Exact domains for O(1) lookup
89
- domainRules: [], // ||*.domain.com^ - Wildcard domains (fallback)
90
- thirdPartyRules: [], // ||domain.com^$third-party
91
- firstPartyRules: [],
92
- pathRules: [], // /ads/*
93
- scriptRules: [], // .js$script
94
- regexRules: [], // /regex/
95
- whitelist: [], // @@||domain.com^ - Wildcard whitelist
96
- whitelistMap: new Map(), // Exact whitelist domains for O(1) lookup
97
- elementHiding: [], // ##.ad-class (not used for network blocking)
88
+ domainMap: new Map(), // ||domain.com^ - Exact domains for O(1) lookup
89
+ domainRules: [], // ||*.domain.com^ - Wildcard domains (fallback)
90
+ thirdPartyDomainMap: new Map(), // ||domain.com^$third-party (exact) — O(1)
91
+ thirdPartyRules: [], // wildcard / non-domain $third-party (fallback)
92
+ firstPartyDomainMap: new Map(), // ||domain.com^$first-party (exact) — O(1)
93
+ firstPartyRules: [], // wildcard / non-domain $first-party (fallback)
94
+ pathRules: [], // /ads/*
95
+ scriptRules: [], // .js$script
96
+ regexRules: [], // /regex/
97
+ whitelist: [], // @@||domain.com^ - Wildcard whitelist
98
+ whitelistMap: new Map(), // Exact whitelist domains for O(1) lookup
99
+ elementHiding: [], // ##.ad-class (not used for network blocking)
98
100
  stats: {
99
101
  total: 0,
100
102
  domain: 0,
101
- domainMapEntries: 0, // Exact domain matches in Map
103
+ domainMapEntries: 0, // Exact domain matches in Map
102
104
  thirdParty: 0,
105
+ thirdPartyMapEntries: 0, // Exact-domain $third-party rules in Map
103
106
  firstParty: 0,
107
+ firstPartyMapEntries: 0, // Exact-domain $first-party rules in Map
104
108
  path: 0,
105
109
  script: 0,
106
110
  regex: 0,
@@ -161,12 +165,28 @@ function parseAdblockRules(filePath, options = {}) {
161
165
  // Regular blocking rules
162
166
  const parsedRule = parseRule(line, false, enableLogging);
163
167
 
164
- // Categorize based on rule type
168
+ // Categorize based on rule type. For $third-party and $first-party
169
+ // rules we additionally split out the exact-domain variants into a
170
+ // hash map keyed by hostname, mirroring the domainMap pattern. This
171
+ // turns the common `||example.com^$third-party` lookup from O(N) over
172
+ // thousands of array entries into O(1) by hostname (+ small parent
173
+ // walk). Wildcard / non-domain party rules still fall back to the
174
+ // linear array.
165
175
  if (parsedRule.isThirdParty) {
166
- rules.thirdPartyRules.push(parsedRule);
176
+ if (parsedRule.isDomain && parsedRule.domain && !parsedRule.domain.includes('*')) {
177
+ rules.thirdPartyDomainMap.set(parsedRule.domain.toLowerCase(), parsedRule);
178
+ rules.stats.thirdPartyMapEntries++;
179
+ } else {
180
+ rules.thirdPartyRules.push(parsedRule);
181
+ }
167
182
  rules.stats.thirdParty++;
168
183
  } else if (parsedRule.isFirstParty) {
169
- rules.firstPartyRules.push(parsedRule);
184
+ if (parsedRule.isDomain && parsedRule.domain && !parsedRule.domain.includes('*')) {
185
+ rules.firstPartyDomainMap.set(parsedRule.domain.toLowerCase(), parsedRule);
186
+ rules.stats.firstPartyMapEntries++;
187
+ } else {
188
+ rules.firstPartyRules.push(parsedRule);
189
+ }
170
190
  rules.stats.firstParty++;
171
191
  } else if (parsedRule.isDomain) {
172
192
  // Store exact domains in Map for O(1) lookup, wildcards in array
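The split described in the comment above (exact hostnames into a `Map`, everything else into a fallback array) can be sketched in isolation. The rule objects below are simplified, hypothetical stand-ins; the real `parseRule()` output carries more fields, and `partitionPartyRules` is not a function in the package.

```javascript
// Hypothetical, simplified rule objects; partitionPartyRules is an
// illustrative name, not part of the package.
function partitionPartyRules(parsedRules) {
  const exact = new Map(); // hostname -> rule, O(1) lookup
  const fallback = [];     // wildcard / non-domain rules, scanned linearly
  for (const rule of parsedRules) {
    if (rule.isDomain && rule.domain && !rule.domain.includes('*')) {
      exact.set(rule.domain.toLowerCase(), rule);
    } else {
      fallback.push(rule);
    }
  }
  return { exact, fallback };
}

const { exact, fallback } = partitionPartyRules([
  { isDomain: true, domain: 'Ads.Example.com', raw: '||Ads.Example.com^$third-party' },
  { isDomain: true, domain: '*.tracker.test', raw: '||*.tracker.test^$third-party' },
  { isDomain: false, domain: null, raw: '/pixel.gif$third-party' },
]);

console.log(exact.has('ads.example.com')); // true
console.log(fallback.length);              // 2
```

One property of this sketch worth noting: `Map.set` keeps a single rule per hostname, so a later rule for the same hostname replaces an earlier one.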
@@ -201,7 +221,11 @@ function parseAdblockRules(filePath, options = {}) {
201
221
  console.log(formatLogMessage('debug', ` • Exact matches (Map): ${rules.stats.domainMapEntries}`));
202
222
  console.log(formatLogMessage('debug', ` • Wildcard patterns (Array): ${rules.domainRules.length}`));
203
223
  console.log(formatLogMessage('debug', ` - Third-party rules: ${rules.stats.thirdParty}`));
224
+ console.log(formatLogMessage('debug', ` • Exact matches (Map): ${rules.stats.thirdPartyMapEntries}`));
225
+ console.log(formatLogMessage('debug', ` • Wildcard/path (Array): ${rules.thirdPartyRules.length}`));
204
226
  console.log(formatLogMessage('debug', ` - First-party rules: ${rules.stats.firstParty}`));
227
+ console.log(formatLogMessage('debug', ` • Exact matches (Map): ${rules.stats.firstPartyMapEntries}`));
228
+ console.log(formatLogMessage('debug', ` • Wildcard/path (Array): ${rules.firstPartyRules.length}`));
205
229
  console.log(formatLogMessage('debug', ` - Path rules: ${rules.stats.path}`));
206
230
  console.log(formatLogMessage('debug', ` - Script rules: ${rules.stats.script}`));
207
231
  console.log(formatLogMessage('debug', ` - Regex rules: ${rules.stats.regex}`));
@@ -445,7 +469,14 @@ function createMatcher(rules, options = {}) {
445
469
  let resultCacheHits = 0, resultCacheMisses = 0;
446
470
  let urlCacheHits = 0, urlCacheMisses = 0;
447
471
  let sourceCacheHits = 0, sourceCacheMisses = 0;
448
- const hasPartyRules = rules.thirdPartyRules.length > 0 || rules.firstPartyRules.length > 0;
472
+ // Include the new domain-maps in the party-rules presence check — without
473
+ // this, a filter list whose $third-party rules ALL went into the Map (empty
474
+ // array) would never trigger third-party detection, silently disabling the
475
+ // entire third-party path.
476
+ const hasPartyRules = rules.thirdPartyRules.length > 0 ||
477
+ rules.firstPartyRules.length > 0 ||
478
+ rules.thirdPartyDomainMap.size > 0 ||
479
+ rules.firstPartyDomainMap.size > 0;
449
480
  // Result cache uses FIFO eviction (see FIFOCache class comment) —
450
481
  // evicts oldest entries one at a time instead of clearing everything.
451
482
  const resultCache = new FIFOCache(32000);
@@ -634,6 +665,29 @@ function createMatcher(rules, options = {}) {
634
665
 
635
666
  // Check third-party rules
636
667
  if (isThirdParty) {
668
+ // Fast path: exact-domain $third-party rules (O(1) by hostname)
669
+ let rule = rules.thirdPartyDomainMap.get(lowerHostname);
670
+ if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
671
+ if (enableLogging) {
672
+ console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked third-party: ${url} (${rule.raw || rule.pattern})`));
673
+ }
674
+ const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'third_party_rule' };
675
+ resultCacheSet(url, sourceUrl, resourceType, r);
676
+ return r;
677
+ }
678
+ // Parent-domain $third-party rules — same walk as domainMap
679
+ for (let i = 0; i < parents.length; i++) {
680
+ rule = rules.thirdPartyDomainMap.get(parents[i]);
681
+ if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
682
+ if (enableLogging) {
683
+ console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked third-party: ${url} (${rule.raw || rule.pattern})`));
684
+ }
685
+ const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'third_party_rule' };
686
+ resultCacheSet(url, sourceUrl, resourceType, r);
687
+ return r;
688
+ }
689
+ }
690
+ // Slow path: wildcard / non-domain $third-party rules
637
691
  const thirdPartyLen = rules.thirdPartyRules.length; // V8: Cache length
638
692
  for (let i = 0; i < thirdPartyLen; i++) {
639
693
  const rule = rules.thirdPartyRules[i];
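The fast path above (exact hostname first, then each parent domain, then the linear array) can be illustrated without the matcher. This is a sketch under assumptions: `parentDomains` and `findExactRule` are hypothetical, the diff's own `parents` array is built elsewhere and may differ, and the `matchesRule()` check is omitted.

```javascript
// parentDomains is a hypothetical helper; the diff's `parents` array is built
// elsewhere and its exact contents are not shown in this hunk.
function parentDomains(hostname) {
  const labels = hostname.split('.');
  const parents = [];
  // Stop before the bare TLD: a.b.example.com -> b.example.com, example.com
  for (let i = 1; i < labels.length - 1; i++) {
    parents.push(labels.slice(i).join('.'));
  }
  return parents;
}

function findExactRule(domainMap, hostname) {
  const lower = hostname.toLowerCase();
  const direct = domainMap.get(lower);
  if (direct) return direct;
  for (const parent of parentDomains(lower)) {
    const rule = domainMap.get(parent);
    if (rule) return rule;
  }
  return null; // caller falls through to the linear wildcard array
}

const partyMap = new Map([['example.com', { raw: '||example.com^$third-party' }]]);
console.log(findExactRule(partyMap, 'cdn.ads.example.com').raw); // ||example.com^$third-party
console.log(findExactRule(partyMap, 'example.org'));             // null
```

The lookup cost depends on the number of labels in the hostname rather than on the number of rules in the list.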
@@ -650,6 +704,29 @@ function createMatcher(rules, options = {}) {
650
704
 
651
705
  // Check first-party rules
652
706
  if (!isThirdParty) {
707
+ // Fast path: exact-domain $first-party rules (O(1) by hostname)
708
+ let rule = rules.firstPartyDomainMap.get(lowerHostname);
709
+ if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
710
+ if (enableLogging) {
711
+ console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked first-party: ${url} (${rule.raw || rule.pattern})`));
712
+ }
713
+ const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'first_party_rule' };
714
+ resultCacheSet(url, sourceUrl, resourceType, r);
715
+ return r;
716
+ }
717
+ // Parent-domain $first-party rules
718
+ for (let i = 0; i < parents.length; i++) {
719
+ rule = rules.firstPartyDomainMap.get(parents[i]);
720
+ if (rule && matchesRule(rule, url, hostname, isThirdParty, resourceType, sourceDomain)) {
721
+ if (enableLogging) {
722
+ console.log(formatLogMessage('debug', `${ADBLOCK_TAG} Blocked first-party: ${url} (${rule.raw || rule.pattern})`));
723
+ }
724
+ const r = { blocked: true, rule: rule.raw || rule.pattern, reason: 'first_party_rule' };
725
+ resultCacheSet(url, sourceUrl, resourceType, r);
726
+ return r;
727
+ }
728
+ }
729
+ // Slow path: wildcard / non-domain $first-party rules
653
730
  const firstPartyLen = rules.firstPartyRules.length;
654
731
  for (let i = 0; i < firstPartyLen; i++) {
655
732
  const rule = rules.firstPartyRules[i];
@@ -107,10 +107,13 @@ async function performGroupWindowCleanup(browserInstance, groupDescription, forc
107
107
  // Identify the main Puppeteer window (should be about:blank or the initial page)
108
108
  let mainPuppeteerPage = null;
109
109
  let pagesToClose = [];
110
-
111
- // Find the main page - typically the first page that's about:blank or has been there longest
110
+
111
+ // First pass: synchronous categorization. Separate blank pages from
112
+ // content pages so the conservative-mode isPageFromPreviousScan() checks
113
+ // can run in parallel via Promise.all below, instead of N sequential
114
+ // awaits (each potentially a CDP roundtrip for page.title()).
115
+ const contentPages = [];
112
116
  for (const page of allPages) {
113
- // Cache page.url() call to avoid repeated DOM/browser communication
114
117
  const pageUrl = page.url();
115
118
  if (pageUrl === 'about:blank' || pageUrl === '' || pageUrl.startsWith('chrome://')) {
116
119
  if (!mainPuppeteerPage) {
@@ -119,18 +122,21 @@ async function performGroupWindowCleanup(browserInstance, groupDescription, forc
119
122
  pagesToClose.push(page); // Additional blank pages can be closed
120
123
  }
121
124
  } else {
122
- // Any page with actual content should be evaluated for closure
123
- if (cleanupMode === "all") {
124
- // Aggressive mode: close all content pages
125
- pagesToClose.push(page);
126
- } else {
127
- // Conservative mode: only close pages that look like leftovers from previous scans
128
- // Keep pages that might still be actively used
129
- const isOldPage = await isPageFromPreviousScan(page, forceDebug);
130
- if (isOldPage) {
131
- pagesToClose.push(page);
132
- }
133
- }
125
+ contentPages.push(page);
126
+ }
127
+ }
128
+
129
+ if (cleanupMode === "all") {
130
+ // Aggressive mode: close all content pages; no per-page async check
131
+ for (const page of contentPages) pagesToClose.push(page);
132
+ } else {
133
+ // Conservative mode: run the isPageFromPreviousScan checks in parallel
134
+ // and collect the leftovers in original order.
135
+ const checks = await Promise.all(
136
+ contentPages.map(page => isPageFromPreviousScan(page, forceDebug))
137
+ );
138
+ for (let i = 0; i < contentPages.length; i++) {
139
+ if (checks[i]) pagesToClose.push(contentPages[i]);
134
140
  }
135
141
  }
136
142
 
@@ -391,12 +397,13 @@ async function performRealtimeWindowCleanup(browserInstance, threshold = REALTIM
391
397
  if (forceDebug) {
392
398
  console.log(formatLogMessage('debug', `${REALTIME_CLEANUP_TAG} Found ${contextPages.length} pages in popup context`));
393
399
  }
394
- // Close popup context pages
395
- for (const page of contextPages) {
396
- if (!page.isClosed()) {
397
- await page.close();
398
- }
399
- }
400
+ // Close popup context pages in parallel — each close is
401
+ // independent and the sequential await was both slow AND would
402
+ // abort the whole loop on the first close failure, leaking the
403
+ // remaining pages. .catch() per page ensures we attempt all.
404
+ await Promise.all(contextPages.map(page =>
405
+ page.isClosed() ? undefined : page.close().catch(() => {})
406
+ ));
400
407
  }
401
408
  }
402
409
  } catch (contextErr) {
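Both cleanup hunks above lean on two properties of `Promise.all`: results come back in input order, and a single rejection rejects the whole batch unless each promise handles its own error. The sketch below shows both with fake page objects; `makePage`, `isStale`, and `closeStalePages` are illustrative stubs, not Puppeteer or package APIs.

```javascript
// Fake page objects stand in for Puppeteer pages; isStale and close are
// illustrative stubs, not Puppeteer or package APIs.
function makePage(name, { stale = false, closeFails = false } = {}) {
  return {
    name,
    closed: false,
    isClosed() { return this.closed; },
    async isStale() { return stale; },
    async close() {
      if (closeFails) throw new Error(`close failed: ${name}`);
      this.closed = true;
    },
  };
}

async function closeStalePages(pages) {
  // Promise.all keeps results in input order, so checks[i] belongs to pages[i].
  const checks = await Promise.all(pages.map(page => page.isStale()));
  const toClose = pages.filter((_, i) => checks[i]);
  // A per-page catch keeps one failed close from rejecting the whole batch.
  await Promise.all(toClose.map(page =>
    page.isClosed() ? undefined : page.close().catch(() => {})
  ));
  return toClose.map(page => page.name);
}

const fakePages = [
  makePage('a', { stale: true }),
  makePage('b'),
  makePage('c', { stale: true, closeFails: true }),
  makePage('d', { stale: true }),
];

const closeDemo = closeStalePages(fakePages).then(names => {
  console.log(names);                        // [ 'a', 'c', 'd' ]
  console.log(fakePages.map(p => p.closed)); // [ true, false, false, true ]
  return names;
});
```

Page `c` fails to close, yet `d` is still closed, which is the behavior the sequential loop could not guarantee.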
@@ -600,16 +607,6 @@ function untrackPage(page) {
600
607
  pageUsageTracker.delete(page);
601
608
  }
602
609
 
603
- /**
604
- * No-op since the trackers were migrated to WeakMap — GC reclaims dead-page
605
- * entries automatically when Puppeteer drops its internal references. Kept
606
- * exported so the ~7 callers in nwss.js continue to compile; safe to delete
607
- * entirely once those callsites are scrubbed.
608
- */
609
- function purgeStaleTrackers() {
610
- // intentionally empty
611
- }
612
-
613
610
  /**
614
611
  * Quick browser responsiveness test for use during page setup
615
612
  * Designed to catch browser degradation between operations
@@ -630,71 +627,6 @@ async function isQuicklyResponsive(browserInstance, timeout = 3000) {
630
627
  }
631
628
  }
632
629
 
633
- /**
634
- * Tests if browser can handle network operations (like Network.enable)
635
- * Creates a test page and attempts basic network setup
636
- * @param {import('puppeteer').Browser} browserInstance - Puppeteer browser instance
637
- * @param {number} timeout - Timeout in milliseconds (default: 10000)
638
- * @returns {Promise<object>} Network capability test result
639
- */
640
- async function testNetworkCapability(browserInstance, timeout = 10000) {
641
- const result = {
642
- capable: false,
643
- error: null,
644
- responseTime: 0
645
- };
646
-
647
- const startTime = Date.now();
648
- let testPage = null;
649
-
650
- try {
651
- // Create test page
652
- testPage = await raceWithTimeout(
653
- browserInstance.newPage(),
654
- timeout,
655
- 'Test page creation timeout'
656
- );
657
-
658
- // Test network operations (the critical operation that's failing)
659
- await raceWithTimeout(
660
- testPage.setRequestInterception(true),
661
- timeout,
662
- 'Network.enable test timeout'
663
- );
664
-
665
- // Turn off interception. Symmetric to the enable above — Network.disable
666
- // can hang for the same CDP reasons, so it needs the same watchdog.
667
- await raceWithTimeout(
668
- testPage.setRequestInterception(false),
669
- timeout,
670
- 'Network.disable test timeout'
671
- );
672
- result.capable = true;
673
- result.responseTime = Date.now() - startTime;
674
-
675
- } catch (error) {
676
- result.error = error.message;
677
- result.responseTime = Date.now() - startTime;
678
-
679
- // Classify the error type
680
- if (error.message.includes('Network.enable') ||
681
- error.message.includes('timed out') ||
682
- error.message.includes('Protocol error')) {
683
- result.error = `Network capability test failed: ${error.message}`;
684
- }
685
- } finally {
686
- if (testPage && !testPage.isClosed()) {
687
- try {
688
- await testPage.close();
689
- } catch (closeErr) {
690
- /* ignore cleanup errors */
691
- }
692
- }
693
- }
694
-
695
- return result;
696
- }
697
-
698
630
  /**
699
631
  * Checks if browser instance is still responsive
700
632
  * @param {import('puppeteer').Browser} browserInstance - Puppeteer browser instance
@@ -740,9 +672,15 @@ async function checkBrowserHealth(browserInstance, timeout = 8000) {
740
672
 
741
673
  // Test 4: Create a single test page to verify both browser functionality AND network capability
742
674
  let testPage = null;
675
+ // Same orphan-cleanup pattern as cdp.js + clear_sitedata.js.
676
+ // Promise.race can't cancel newPage() — if the race
677
+ // times out the underlying call may still produce a Page tab nothing
678
+ // references → leaked tab.
679
+ let testPagePromise = null;
743
680
  try {
681
+ testPagePromise = browserInstance.newPage();
744
682
  testPage = await raceWithTimeout(
745
- browserInstance.newPage(),
683
+ testPagePromise,
746
684
  timeout,
747
685
  'Page creation timeout'
748
686
  );
@@ -780,6 +718,11 @@ async function checkBrowserHealth(browserInstance, timeout = 8000) {
780
718
  await testPage.close();
781
719
 
782
720
  } catch (pageTestError) {
721
+ // Orphan cleanup: if testPage is null but newPage was started, the
722
+ // race timed out before assignment. Close the orphan when it arrives.
723
+ if (!testPage && testPagePromise) {
724
+ testPagePromise.then(p => p.close().catch(() => {})).catch(() => {});
725
+ }
783
726
  if (testPage && !testPage.isClosed()) {
784
727
  try { await testPage.close(); } catch (e) { /* ignore */ }
785
728
  }
@@ -1253,7 +1196,6 @@ module.exports = {
  performGroupWindowCleanup,
  performRealtimeWindowCleanup,
  trackPageForRealtime,
- testNetworkCapability,
  isQuicklyResponsive,
  performHealthAssessment,
  monitorBrowserHealth,
@@ -1261,6 +1203,5 @@ module.exports = {
  isCriticalProtocolError,
  updatePageUsage,
  untrackPage,
- cleanupPageBeforeReload,
- purgeStaleTrackers
+ cleanupPageBeforeReload
  };
package/lib/cdp.js CHANGED
@@ -48,7 +48,8 @@ function raceWithTimeout(promise, ms, message) {
48
48
  }
49
49
 
50
50
  // Shared no-op cleanup used by every no-CDP / CDP-failed return path. Hoisted
51
- // so createSessionResult() doesn't allocate a fresh `async () => {}` per call.
51
+ // so the success path doesn't allocate a fresh `async () => {}` per call
52
+ // when cleanup logic isn't needed, and so NOOP_SESSION_RESULT can reuse it.
52
53
  const NOOP_CLEANUP = async () => {};
53
54
 
54
55
  /**
@@ -74,27 +75,39 @@ function isCriticalCDPError(message) {
74
75
  message.includes('Browser has been closed');
75
76
  }
76
77
 
77
- /**
78
- * Creates a standardized session result object for consistent V8 optimization
79
- * @param {object|null} session - CDP session or null
80
- * @param {Function} cleanup - Cleanup function
81
- * @param {boolean} isEnhanced - Whether enhanced features are active
82
- * @returns {object} Standardized session object
83
- */
84
- const createSessionResult = (session = null, cleanup = NOOP_CLEANUP, isEnhanced = false) => ({
85
- session,
86
- cleanup,
87
- isEnhanced
78
+ // Pre-allocated singleton for both the early-exit case (CDP not enabled OR
79
+ // not in debug mode) AND the non-critical-error path. Frozen so callers can't
80
+ // mutate the shared instance. Result shape is {session, cleanup}; previously
81
+ // also carried an `isEnhanced: false` field that had zero consumers anywhere.
82
+ const NOOP_SESSION_RESULT = Object.freeze({
83
+ session: null,
84
+ cleanup: NOOP_CLEANUP
88
85
  });
89
86
 
90
87
  /**
91
- * Creates a new page with timeout protection to prevent CDP hangs
88
+ * Creates a new page with timeout protection to prevent CDP hangs.
89
+ *
90
+ * Orphan-page handling: Promise.race cannot cancel browser.newPage(). If the
91
+ * timer wins, the underlying call keeps running and eventually resolves to a
92
+ * real Page tab nothing references → leaked tab in the browser. We capture
93
+ * the original promise and attach a close-on-resolve cleanup so the orphan
94
+ * is reaped if it arrives after the race lost.
95
+ *
92
96
  * @param {import('puppeteer').Browser} browser - Browser instance
93
97
  * @param {number} timeout - Timeout in milliseconds (default: 30000)
94
98
  * @returns {Promise<import('puppeteer').Page>} Page instance
95
99
  */
96
100
  async function createPageWithTimeout(browser, timeout = 30000) {
97
- return raceWithTimeout(browser.newPage(), timeout, 'Page creation timeout - browser may be unresponsive');
101
+ const pagePromise = browser.newPage();
102
+ try {
103
+ return await raceWithTimeout(pagePromise, timeout, 'Page creation timeout - browser may be unresponsive');
104
+ } catch (err) {
105
+ // If pagePromise eventually resolves after the race gave up, close the
106
+ // orphan tab. .catch(() => {}) handles the case where pagePromise also
107
+ // rejected (no resource to clean up).
108
+ pagePromise.then(p => p.close().catch(() => {})).catch(() => {});
109
+ throw err;
110
+ }
98
111
  }
99
112
 
100
113
  /**
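The orphan-cleanup pattern in `createPageWithTimeout` above generalizes to any resource acquired behind a `Promise.race` timeout. The sketch below is written under assumptions: this `raceWithTimeout` is a minimal stand-in rather than the package's implementation, and `acquireWithTimeout`, `slowAcquire`, and the `release` callback are hypothetical names.

```javascript
// Minimal stand-in for the package's raceWithTimeout(); this version is an
// assumption written for the sketch, not the package's implementation.
function raceWithTimeout(promise, ms, message) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// acquireWithTimeout and the release callback are hypothetical names.
async function acquireWithTimeout(acquire, release, ms) {
  const pending = acquire(); // capture the promise before racing
  try {
    return await raceWithTimeout(pending, ms, 'acquire timeout');
  } catch (err) {
    // If the race lost but `pending` later resolves, release the orphan.
    // The trailing catch covers the case where `pending` itself rejected.
    pending.then(resource => release(resource)).catch(() => {});
    throw err;
  }
}

const released = [];
const slowAcquire = () => new Promise(resolve => setTimeout(() => resolve('tab-1'), 50));

const orphanDemo = acquireWithTimeout(slowAcquire, r => released.push(r), 10)
  .catch(err => err.message)
  .then(async message => {
    await new Promise(resolve => setTimeout(resolve, 80)); // let the slow acquire finish
    console.log(message, released); // acquire timeout [ 'tab-1' ]
    return { message, released };
  });
```

The caller still sees the timeout error immediately; the late-arriving resource is released in the background.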
@@ -171,7 +184,7 @@ async function createCDPSession(page, currentUrl, options = {}) {
  const cdpLoggingNeeded = (enableCDP || siteSpecificCDP === true) && forceDebug;
  
  if (!cdpLoggingNeeded) {
- return createSessionResult();
+ return NOOP_SESSION_RESULT;
  }
  
  // Parse the current URL hostname once and reuse it for the mode-log line,
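Returning one frozen object from every "no session" path, as the hunk above now does, avoids a per-call allocation and prevents callers from altering the shared instance. A minimal sketch of that idea follows; `openSession` is a hypothetical stand-in for the early-exit branch of `createCDPSession()`, and the session object it returns is invented for the example.

```javascript
const NOOP_CLEANUP = async () => {};

// Shared result for every "no session" path; frozen so no caller can alter it.
const NOOP_SESSION_RESULT = Object.freeze({ session: null, cleanup: NOOP_CLEANUP });

// openSession is a hypothetical stand-in for the early-exit branch of createCDPSession().
function openSession(enabled) {
  if (!enabled) return NOOP_SESSION_RESULT; // no allocation on the common path
  return { session: { id: 1 }, cleanup: async () => {} };
}

const first = openSession(false);
const second = openSession(false);
console.log(first === second);                     // true: same object each time
console.log(Object.isFrozen(NOOP_SESSION_RESULT)); // true

// Writes to a frozen object are ignored in sloppy mode and throw in strict mode.
try { first.session = 'mutated'; } catch (e) { /* strict mode: TypeError */ }
console.log(second.session); // null
```

`Object.freeze` is shallow, which is sufficient here because the object holds only `null` and a function.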
@@ -187,11 +200,16 @@ async function createCDPSession(page, currentUrl, options = {}) {
187
200
  }
188
201
 
189
202
  let cdpSession = null;
203
+ let cdpSessionPromise = null;
190
204
 
191
205
  try {
192
- // Create CDP session using modern Puppeteer 20+ API
193
- // Add timeout protection for CDP session creation
194
- cdpSession = await raceWithTimeout(page.createCDPSession(), 20000, 'CDP session creation timeout');
206
+ // Create CDP session using modern Puppeteer 20+ API.
207
+ // Capture the promise BEFORE racing so the catch block can attach an
208
+ // orphan-cleanup chain. If our race times out but the underlying
209
+ // createCDPSession() later resolves, we'd otherwise leak a CDP session
210
+ // on the browser side that nothing references.
211
+ cdpSessionPromise = page.createCDPSession();
212
+ cdpSession = await raceWithTimeout(cdpSessionPromise, 20000, 'CDP session creation timeout');
195
213
 
196
214
  // Enable network domain — required for network event monitoring. This is
197
215
  // the operation the rest of the codebase has learned can hang under
@@ -221,10 +239,13 @@ async function createCDPSession(page, currentUrl, options = {}) {
221
239
 
222
240
  console.log(formatLogMessage('debug', `${CDP_TAG} CDP session created successfully for ${currentUrl}`));
223
241
 
224
- return createSessionResult(
225
- cdpSession,
226
- async () => {
227
- // Safe cleanup that never throws errors
242
+ return {
243
+ session: cdpSession,
244
+ cleanup: async () => {
245
+ // Safe cleanup that never throws errors. Idempotent — null out the
246
+ // captured reference after the first successful detach so a
247
+ // double-cleanup is a true no-op instead of generating a misleading
248
+ // "Failed to detach: Session closed" debug log on the second call.
228
249
  if (cdpSession) {
229
250
  try {
230
251
  await cdpSession.detach();
@@ -232,28 +253,41 @@ async function createCDPSession(page, currentUrl, options = {}) {
232
253
  } catch (cdpCleanupErr) {
233
254
  // Log cleanup errors but don't throw - cleanup should never fail the calling code
234
255
  console.log(formatLogMessage('debug', `${CDP_TAG} Failed to detach CDP session for ${currentUrl}: ${cdpCleanupErr.message}`));
256
+ } finally {
257
+ cdpSession = null;
235
258
  }
236
259
  }
237
- },
238
- false
239
- );
260
+ }
261
+ };
240
262
 
241
263
  } catch (cdpErr) {
242
- // If the session was created but a subsequent send/wire-up failed, detach
243
- // it so we don't leak a half-attached session. Previously the code just
244
- // nulled the local and orphaned the session. We're already past the
245
- // cdpLoggingNeeded gate here so forceDebug is true; log a failed detach
246
- // instead of swallowing it, so partial-cleanup failures aren't invisible.
264
+ // Two distinct cleanup paths depending on where the failure was:
265
+ //
266
+ // a) cdpSession IS set → failure was AFTER createCDPSession() resolved
267
+ // (e.g. Network.enable timed out). We have a real handle — detach
268
+ // directly. Previously the code just nulled the local and orphaned
269
+ // the session; now we detach and log any failure.
270
+ //
271
+ // b) cdpSession is null but cdpSessionPromise was started → the race
272
+ // timed out before assignment. The underlying createCDPSession()
273
+ // may still resolve later, producing an orphan session on the
274
+ // browser side. Attach a detach-on-resolve chain; .catch(()=>{})
275
+ // swallows the case where the underlying promise also rejected.
247
276
  if (cdpSession) {
248
277
  try { await cdpSession.detach(); }
249
278
  catch (partialDetachErr) {
250
279
  console.log(formatLogMessage('debug', `${CDP_TAG} Partial-session detach failed for ${currentUrl}: ${partialDetachErr.message}`));
251
280
  }
252
- cdpSession = null;
281
+ } else if (cdpSessionPromise) {
282
+ cdpSessionPromise.then(s => s.detach().catch(() => {})).catch(() => {});
253
283
  }
254
284
 
255
- // Enhanced error context for CDP domain-specific debugging
256
- const urlContext = safeHostname(currentUrl, `${currentUrl.substring(0, 50)}...`);
285
+ // Enhanced error context for CDP domain-specific debugging. Reuse the
286
+ // currentHostname computed at function entry (one URL parse vs two);
287
+ // only fall back to the truncated raw URL when that parse failed too.
288
+ const urlContext = currentHostname !== 'unknown'
289
+ ? currentHostname
290
+ : `${currentUrl.substring(0, 50)}...`;
257
291
 
258
292
  // Critical errors: browser is broken, propagate so the caller can restart.
259
293
  if (isCriticalCDPError(cdpErr.message)) {
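The idempotent cleanup introduced above (clear the captured reference in `finally` so a second call does nothing) can be shown with a fake session. `makeCleanup` and `fakeSession` are hypothetical; the only assumption about a session is that it exposes an async `detach()`.

```javascript
// makeCleanup is a hypothetical wrapper; `session` only needs an async detach().
function makeCleanup(session, log = () => {}) {
  let current = session;
  return async function cleanup() {
    if (!current) return; // second and later calls are no-ops
    try {
      await current.detach();
    } catch (err) {
      log(`Failed to detach: ${err.message}`); // cleanup never throws
    } finally {
      current = null; // drop the reference whether detach succeeded or not
    }
  };
}

let detachCalls = 0;
const fakeSession = { async detach() { detachCalls++; } };
const cleanupOnce = makeCleanup(fakeSession);

const cleanupDemo = cleanupOnce()
  .then(() => cleanupOnce())
  .then(() => {
    console.log(detachCalls); // 1
    return detachCalls;
  });
```

In this sketch the guarantee holds for sequential calls. Two overlapping calls could both reach `detach()`, because the reference is cleared only after the `await` completes.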
@@ -265,7 +299,7 @@ async function createCDPSession(page, currentUrl, options = {}) {
  console.warn(formatLogMessage('warn', `${CDP_TAG} Failed to attach CDP session for ${urlContext}: ${cdpErr.message}`));
  
  // Return null session with no-op cleanup for consistent API
- return createSessionResult();
+ return NOOP_SESSION_RESULT;
  }
  }