@fanboynz/network-scanner 1.0.35

package/nwss.1 ADDED
@@ -0,0 +1,824 @@
1
+ .TH NWSS-SCRIPT 1 "2025" "scanner-script v1.0.32" "User Commands"
2
+ .SH NAME
3
+ NWSS scanner-script \- Network scanner for malware detection and domain analysis with advanced similarity filtering
4
+
5
+ .SH SYNOPSIS
6
+ .B node nwss.js
7
+ [\fIOPTIONS\fR]
8
+
9
+ .SH DESCRIPTION
10
+ .B nwss.js
11
+ is a comprehensive network scanner that uses Puppeteer to analyze web pages for malicious content, tracking scripts, and suspicious domains. It can detect threats through URL pattern matching, content analysis, DNS/WHOIS lookups, and behavioral analysis.
12
+
13
+ The scanner supports multiple detection methods including regex filtering, content searching with curl/grep, network tools integration, and advanced browser-based analysis with frame monitoring and fingerprint spoofing. It includes intelligent domain similarity filtering to reduce noise and improve detection accuracy.
14
+
15
+ .SH OPTIONS
16
+
17
+ .SS Output Options
18
+ .TP
19
+ .BR \-o ", " \--output " \fIFILE\fR"
20
+ Write rules to \fIFILE\fR instead of standard output.
21
+
22
+ .TP
23
+ .BR \--compare " \fIFILE\fR"
24
+ Remove rules that already exist in \fIFILE\fR before output (requires \fB\-o\fR).
25
+
26
+ .TP
27
+ .B \--append
28
+ Append new rules to output file instead of overwriting (requires \fB\-o\fR).
29
+
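The effect of \--compare can be pictured as a set difference between freshly generated rules and the lines of an existing file. The sketch below is illustrative only: `filterNewRules` is a hypothetical helper, not a function from nwss.js, and it assumes rules are compared as exact, trimmed lines.

```javascript
// Hypothetical helper (not part of nwss.js): drop every candidate rule that
// already appears as a line in the comparison file's text.
function filterNewRules(candidateRules, existingFileText) {
  // Assumption: each non-empty, trimmed line of the file is one existing rule.
  const existing = new Set(
    existingFileText
      .split(/\r?\n/)
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
  );
  return candidateRules.filter((rule) => !existing.has(rule.trim()));
}

const existingText = "||ads.example^\n||tracker.example^\n";
const fresh = filterNewRules(["||ads.example^", "||new.example^"], existingText);
console.log(fresh); // [ '||new.example^' ]
```

With \--append, the surviving rules would then be added to the end of the output file rather than replacing its contents.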
30
+ .SS Output Format Options
31
+ .TP
32
+ .B \--localhost
33
+ Output rules in \fB127.0.0.1 domain.com\fR format for use in a hosts file.
34
+
35
+ .TP
36
+ .B \--localhost-0.0.0.0
37
+ Output rules in \fB0.0.0.0 domain.com\fR format for use in a hosts file.
38
+
39
+ .TP
40
+ .B \--plain
41
+ Output just domain names without any formatting.
42
+
43
+ .TP
44
+ .B \--dnsmasq
45
+ Output as \fBlocal=/domain.com/\fR format for dnsmasq.
46
+
47
+ .TP
48
+ .B \--dnsmasq-old
49
+ Output as \fBserver=/domain.com/\fR format for older dnsmasq versions.
50
+
51
+ .TP
52
+ .B \--unbound
53
+ Output as \fBlocal-zone: "domain.com." always_null\fR format for Unbound DNS.
54
+
55
+ .TP
56
+ .B \--privoxy
57
+ Output as \fB{ +block } .domain.com\fR format for Privoxy action files.
58
+
59
+ .TP
60
+ .B \--pihole
61
+ Output as \fB(^|\\.)domain\\.com$\fR format for Pi-hole regex filters.
62
+
63
+ .TP
64
+ .B \--adblock-rules
65
+ Generate adblock filter rules with resource type modifiers (requires \fB\-o\fR).
66
+
67
+ .SS General Options
68
+ .TP
69
+ .B \--verbose
70
+ Enable verbose output globally for all sites.
71
+
72
+ .TP
73
+ .B \--debug
74
+ Enable debug mode with detailed logging of all network requests.
75
+
76
+ .TP
77
+ .B \--silent
78
+ Suppress normal console output (errors and warnings still shown).
79
+
80
+ .TP
81
+ .B \--titles
82
+ Add comment lines with site URLs before each rule group.
83
+
84
+ .TP
85
+ .B \--dumpurls
86
+ Log all matched URLs to timestamped log files in the \fBlogs/\fR directory.
87
+
88
+ .TP
89
+ .B \--compress-logs
90
+ Compress log files with gzip after completion (requires \fB\--dumpurls\fR).
91
+
92
+ .TP
93
+ .B \--sub-domains
94
+ Output full subdomains instead of collapsing to root domains.
95
+
96
+ .TP
97
+ .B \--no-interact
98
+ Disable mouse simulation and page interaction globally.
99
+
100
+ .TP
101
+ .BR \--custom-json " \fIFILE\fR"
102
+ Use \fIFILE\fR instead of \fBconfig.json\fR for configuration.
103
+
104
+ .TP
105
+ .B \--headful
106
+ Launch browser with GUI instead of headless mode.
107
+
108
+ .TP
109
+ .B \--cdp
110
+ Enable Chrome DevTools Protocol logging for network analysis.
111
+
112
+ .TP
113
+ .B \--remove-dupes
114
+ Remove duplicate domains from output (only with \fB\-o\fR).
115
+
116
+ .TP
117
+ .B \--eval-on-doc
118
+ Globally enable JavaScript injection for Fetch/XHR interception.
119
+
120
+ .TP
121
+ .B \--dry-run
122
+ Console output only: show matching regex, titles, whois/dig/searchstring results, and adblock rules without writing files.
123
+
124
+ .TP
125
+ .B \--remove-tempfiles
126
+ Remove Chrome/Puppeteer temporary files before exit.
127
+
128
+ .TP
129
+ .BR \-h ", " \--help
130
+ Show help message and exit.
131
+
132
+ .TP
133
+ .B \--version
134
+ Show version information and exit.
135
+
136
+ .SS Validation Options
137
+ .TP
138
+ .B \--validate-config
139
+ Validate config.json file and exit.
140
+
141
+ .TP
142
+ .B \--validate-rules [\fIFILE\fR]
143
+ Validate rule file format (uses \fB\--output\fR/\fB\--compare\fR files if no file specified).
144
+
145
+ .TP
146
+ .B \--clean-rules [\fIFILE\fR]
147
+ Clean rule files by removing invalid lines and optionally duplicates (uses \fB\--output\fR/\fB\--compare\fR files if no file specified).
148
+
149
+ .TP
150
+ .B \--test-validation
151
+ Run domain validation tests and exit.
152
+
153
+ .SH CONFIGURATION
154
+
155
+ Configuration is provided via JSON files. The default configuration file is \fBconfig.json\fR.
156
+
157
+ .SS Global Configuration Options
158
+
159
+ .TP
160
+ .B ignoreDomains
161
+ Array of domains to completely ignore. Supports wildcards (e.g., \fB"*.ads.com"\fR).
162
+
163
+ .TP
164
+ .B blocked
165
+ Array of global regex patterns to block requests.
166
+
167
+ .TP
168
+ .B whois_delay
169
+ Default delay between whois requests in milliseconds (default: 3000).
170
+
171
+ .TP
172
+ .B whois_server_mode
173
+ Default server selection mode for all sites: \fB"random"\fR or \fB"cycle"\fR (default: "random").
174
+
175
+ .TP
176
+ .B ignore_similar
177
+ Boolean. Ignore domains similar to already found domains (default: true).
178
+
179
+ .TP
180
+ .B ignore_similar_threshold
181
+ Number. Similarity threshold percentage for ignore_similar (default: 80).
182
+
183
+ .TP
184
+ .B ignore_similar_ignored_domains
185
+ Boolean. Ignore domains similar to ignoreDomains list (default: true).
186
+
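One way to read the wildcard support in ignoreDomains is as a simple glob over hostnames. The following is a minimal sketch under that assumption; `ignorePatternToRegex` and `isIgnored` are hypothetical names, and the man page does not say whether `*.ads.com` also covers the bare `ads.com` (in this sketch it does not) or whether plain entries also cover subdomains (here they match exactly).

```javascript
// Hypothetical helper: turn an ignoreDomains entry into an anchored RegExp.
function ignorePatternToRegex(pattern) {
  // Escape regex metacharacters except "*", then expand "*" to "any characters".
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`, "i");
}

// Hypothetical helper: true when the hostname matches any entry in the list.
function isIgnored(hostname, ignoreDomains) {
  return ignoreDomains.some((entry) => ignorePatternToRegex(entry).test(hostname));
}

console.log(isIgnored("cdn.ads.com", ["*.ads.com"])); // true
console.log(isIgnored("ads.com", ["*.ads.com"]));     // false
```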
187
+ .SS Per-Site Configuration Options
188
+
189
+ .TP
190
+ .B url
191
+ Single URL string or array of URLs to scan.
192
+
193
+ .TP
194
+ .B filterRegex
195
+ Regex pattern(s) to match suspicious requests.
196
+
197
+ .TP
198
+ .B comments
199
+ Documentation strings or notes - completely ignored by the scanner. Can be a single string or array of strings. Used for adding context, URLs, timestamps, or any documentation notes to configuration files.
200
+
201
+ .TP
202
+ .B searchstring
203
+ Text string(s) to search for in response content (OR logic).
204
+
205
+ .TP
206
+ .B searchstring_and
207
+ Text string(s) that must ALL be present in content (AND logic).
208
+
209
+ .TP
210
+ .B curl
211
+ Boolean. Use curl to download and analyze content.
212
+
213
+ .TP
214
+ .B grep
215
+ Boolean. Use system grep for faster pattern matching (requires \fBcurl=true\fR).
216
+
217
+ .TP
218
+ .B resourceTypes
219
+ Array of resource types to process (e.g., \fB["script", "xhr", "fetch"]\fR).
220
+
221
+ .TP
222
+ .B blocked
223
+ Array of regex patterns to block requests for this site.
224
+
225
+ .TP
226
+ .B css_blocked
227
+ Array of CSS selectors to hide elements on the page.
228
+
229
+ .TP
230
+ .B userAgent
231
+ Spoof User-Agent: \fB"chrome"\fR, \fB"firefox"\fR, or \fB"safari"\fR.
232
+
233
+ .TP
234
+ .B interact
235
+ Boolean. Simulate mouse movements and clicks.
236
+
237
+ .TP
238
+ .B delay
239
+ Milliseconds to wait after page load (default: 4000).
240
+
241
+ .TP
242
+ .B reload
243
+ Number of times to reload the page (default: 1).
244
+
245
+ .TP
246
+ .B timeout
247
+ Request timeout in milliseconds (default: 30000).
248
+
249
+ .TP
250
+ .B firstParty
251
+ Boolean. Allow first-party request matching (default: false).
252
+
253
+ .TP
254
+ .B thirdParty
255
+ Boolean. Allow third-party request matching (default: true).
256
+
257
+ .TP
258
+ .B fingerprint_protection
259
+ Boolean or \fB"random"\fR. Enable browser fingerprint spoofing.
260
+
261
+ .TP
262
+ .B ignore_similar
263
+ Boolean. Override global ignore_similar setting for this site.
264
+
265
+ .TP
266
+ .B ignore_similar_threshold
267
+ Number. Override global similarity threshold for this site.
268
+
269
+ .TP
270
+ .B ignore_similar_ignored_domains
271
+ Boolean. Override global ignore_similar_ignored_domains for this site.
272
+
273
+ .TP
274
+ .B even_blocked
275
+ Boolean. Add matching rules even if requests are blocked (default: false).
276
+
277
+ .TP
278
+ .B whois
279
+ Array of terms that must ALL be found in WHOIS data (AND logic).
280
+
281
+ .TP
282
+ .B whois-or
283
+ Array of terms where ANY must be found in WHOIS data (OR logic).
284
+
285
+ .TP
286
+ .B whois_server
287
+ Custom WHOIS server(s) to use for lookups.
288
+
289
+ .TP
290
+ .B whois_server_mode
291
+ Server selection mode: \fB"random"\fR (default) or \fB"cycle"\fR through list.
292
+
293
+ .TP
294
+ .B whois_max_retries
295
+ Number. Maximum retry attempts per domain for WHOIS queries (default: 2).
296
+
297
+ .TP
298
+ .B whois_timeout_multiplier
299
+ Number. Timeout increase multiplier per retry (default: 1.5).
300
+
301
+ .TP
302
+ .B whois_use_fallback
303
+ Boolean. Add TLD-specific fallback servers for WHOIS (default: true).
304
+
305
+ .TP
306
+ .B whois_retry_on_timeout
307
+ Boolean. Retry on timeout errors (default: true).
308
+
309
+ .TP
310
+ .B whois_retry_on_error
311
+ Boolean. Retry on connection/other errors (default: false).
312
+
313
+ .TP
314
+ .B whois_delay
315
+ Milliseconds. Delay between whois requests for this site (default: global whois_delay).
316
+
317
+ .TP
318
+ .B dig
319
+ Array of terms that must ALL be found in DNS records (AND logic).
320
+
321
+ .TP
322
+ .B dig-or
323
+ Array of terms where ANY must be found in DNS records (OR logic).
324
+
325
+ .TP
326
+ .B digRecordType
327
+ DNS record type for dig queries (default: "A").
328
+
329
+ .TP
330
+ .B dig_subdomain
331
+ Boolean. Use subdomain for dig lookup instead of root domain (default: false).
332
+
333
+ .TP
334
+ .B goto_options
335
+ Object. Custom page.goto() options for Puppeteer navigation. Available options:
336
+ .RS
337
+ .IP \(bu 4
338
+ \fBwaitUntil\fR: When to consider navigation successful. Options:
339
+ .RS
340
+ .IP \(bu 4
341
+ \fB"load"\fR - Wait for all resources to load (default)
342
+ .IP \(bu 4
343
+ \fB"domcontentloaded"\fR - Wait for DOM only, faster loading
344
+ .IP \(bu 4
345
+ \fB"networkidle0"\fR - Wait until 0 network requests for 500ms
346
+ .IP \(bu 4
347
+ \fB"networkidle2"\fR - Wait until ≤2 network requests for 500ms
348
+ .RE
349
+ .IP \(bu 4
350
+ \fBtimeout\fR: Maximum navigation time in milliseconds (overrides site timeout)
351
+ .IP \(bu 4
352
+ \fBreferer\fR: Referer header to send with navigation request
353
+ .RE
354
+ Example: \fB{"waitUntil": "networkidle2", "timeout": 60000}\fR
355
+
356
+ .TP
357
+ .B forcereload
358
+ Boolean. Force an additional reload with cache disabled after normal reloads.
359
+
360
+ .TP
361
+ .B clear_sitedata
362
+ Boolean. Clear all cookies, cache, and storage before each page load (default: false).
363
+
364
+ .TP
365
+ .B isBrave
366
+ Boolean. Spoof Brave browser detection.
367
+
368
+ .TP
369
+ .B evaluateOnNewDocument
370
+ Boolean. Inject Fetch/XHR interceptor scripts into page context.
371
+
372
+ .TP
373
+ .B cdp
374
+ Boolean. Enable Chrome DevTools Protocol logging for this specific site.
375
+
376
+ .TP
377
+ .B source
378
+ Boolean. Save page source HTML after loading.
379
+
380
+ .TP
381
+ .B screenshot
382
+ Boolean. Capture screenshot on page load failure.
383
+
384
+ .TP
385
+ .B headful
386
+ Boolean. Launch browser with GUI for this specific site.
387
+
388
+ .TP
389
+ .B adblock_rules
390
+ Boolean. Generate adblock filter rules with resource types for this site.
391
+
392
+ .TP
393
+ .B cloudflare_phish
394
+ Boolean. Auto-click through Cloudflare phishing warnings (default: false).
395
+
396
+ .TP
397
+ .B cloudflare_bypass
398
+ Boolean. Auto-solve Cloudflare "Verify you are human" challenges (default: false).
399
+
400
+ .TP
401
+ .B flowproxy_detection
402
+ Boolean. Enable flowProxy protection detection and handling (default: false).
403
+
404
+ .TP
405
+ .B flowproxy_page_timeout
406
+ Milliseconds. Page timeout for flowProxy sites (default: 45000).
407
+
408
+ .TP
409
+ .B flowproxy_nav_timeout
410
+ Milliseconds. Navigation timeout for flowProxy sites (default: 45000).
411
+
412
+ .TP
413
+ .B flowproxy_js_timeout
414
+ Milliseconds. JavaScript challenge timeout (default: 15000).
415
+
416
+ .TP
417
+ .B flowproxy_delay
418
+ Milliseconds. Delay for rate limiting (default: 30000).
419
+
420
+ .TP
421
+ .B flowproxy_additional_delay
422
+ Milliseconds. Additional processing delay (default: 5000).
423
+
424
+ .TP
425
+ .B verbose
426
+ Boolean. Enable verbose output for this specific site.
427
+
428
+ .TP
429
+ .B subDomains
430
+ Number. Output full subdomains instead of root domains (1 = enabled, 0 = disabled).
431
+
432
+ .TP
433
+ .B localhost
434
+ Boolean. Force localhost output format (127.0.0.1) for this site.
435
+
436
+ .TP
437
+ .B localhost_0_0_0_0
438
+ Boolean. Force localhost output format (0.0.0.0) for this site.
439
+
440
+ .TP
441
+ .B dnsmasq
442
+ Boolean. Force dnsmasq output format for this site.
443
+
444
+ .TP
445
+ .B dnsmasq_old
446
+ Boolean. Force dnsmasq old format for this site.
447
+
448
+ .TP
449
+ .B unbound
450
+ Boolean. Force unbound output format for this site.
451
+
452
+ .TP
453
+ .B privoxy
454
+ Boolean. Force Privoxy output format for this site.
455
+
456
+ .TP
457
+ .B pihole
458
+ Boolean. Force Pi-hole regex output format for this site.
459
+
460
+ .TP
461
+ .B plain
462
+ Boolean. Force plain domain output for this site.
463
+
464
+ .SH SIMILARITY FILTERING
465
+
466
+ The scanner includes advanced similarity filtering to reduce noise and improve detection accuracy by automatically ignoring domains that are very similar to ones already found or explicitly ignored.
467
+
468
+ .SS Two-Layer Similarity Protection
469
+
470
+ .TP
471
+ .B Standard Similarity Filtering
472
+ Ignores domains similar to already-found domains during scanning. For example, if \fBanimerco.com\fR is found, \fBanimerco.org\fR and \fBanimerco.net\fR will be automatically ignored (100% base domain similarity).
473
+
474
+ .TP
475
+ .B Ignored Domains Similarity Filtering
476
+ Ignores domains similar to those in the \fBignoreDomains\fR list. For example, if \fBgoogle.com\fR is in ignoreDomains, then \fBgoogle.co.uk\fR, \fBgoogle.com.au\fR, and \fBgooglee.com\fR will be automatically ignored.
477
+
478
+ .SS Multi-Part TLD Support
479
+
480
+ The similarity engine correctly handles 70+ international multi-part TLDs including:
481
+ .RS
482
+ .IP \(bu 4
483
+ \fBEurope\fR: .co.uk, .org.uk, .com.de, .com.fr, .com.es, .com.it, .com.pl, .com.ru
484
+ .IP \(bu 4
485
+ \fBAsia-Pacific\fR: .co.jp, .or.jp, .com.au, .org.au, .co.nz, .org.nz, .com.cn, .org.cn
486
+ .IP \(bu 4
487
+ \fBAmericas\fR: .com.br, .org.br, .com.ar, .org.ar, .com.mx, .org.mx, .com.co
488
+ .IP \(bu 4
489
+ \fBOthers\fR: .co.za, .org.za, .co.il, .org.il, .com.eg, .org.eg
490
+ .RE
491
+
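Handling multi-part TLDs matters because the comparison is made on the base domain label, so `google.co.uk` has to reduce to `google` rather than `co`. The sketch below shows one possible way to do that; `MULTI_PART_TLDS` holds only a handful of the suffixes listed above and `baseLabel` is a hypothetical helper, not the scanner's own code.

```javascript
// A small sample of the multi-part TLDs listed above; the real table is larger.
const MULTI_PART_TLDS = ["co.uk", "org.uk", "com.au", "co.jp", "com.br", "co.za"];

// Hypothetical helper: return the base label, e.g. "google" for "google.co.uk".
function baseLabel(hostname) {
  const host = hostname.toLowerCase().replace(/\.$/, "");
  const multi = MULTI_PART_TLDS.find((tld) => host.endsWith("." + tld));
  let withoutTld;
  if (multi) {
    // Strip the multi-part suffix together with its leading dot.
    withoutTld = host.slice(0, -(multi.length + 1));
  } else if (host.includes(".")) {
    // Otherwise strip the final single-label TLD.
    withoutTld = host.slice(0, host.lastIndexOf("."));
  } else {
    withoutTld = host; // no TLD present
  }
  // Keep only the right-most remaining label so subdomains are ignored.
  return withoutTld.split(".").pop();
}

console.log(baseLabel("google.co.uk"));      // google
console.log(baseLabel("www.google.com.au")); // google
console.log(baseLabel("animerco.org"));      // animerco
```

Two hosts whose base labels are equal correspond to the "100% base domain similarity" case described earlier.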
492
+ .SS Similarity Configuration
493
+
494
+ .TP
495
+ .B ignore_similar
496
+ Global and per-site boolean to enable/disable similarity filtering (default: true).
497
+
498
+ .TP
499
+ .B ignore_similar_threshold
500
+ Similarity threshold as a percentage (0-100). A higher value requires a closer match before a domain is ignored, so fewer domains are filtered out (default: 80).
501
+
502
+ .TP
503
+ .B ignore_similar_ignored_domains
504
+ Global and per-site boolean to enable similarity filtering against ignoreDomains (default: true).
505
+
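Because each of these settings exists both globally and per site, the effective value for a site can be thought of as "site value, else global value, else documented default". The sketch below illustrates that precedence; `resolveSimilaritySettings` is a hypothetical helper and the lookup order is an assumption based on the per-site options being described as overrides.

```javascript
// Defaults as documented in this man page.
const SIMILARITY_DEFAULTS = {
  ignore_similar: true,
  ignore_similar_threshold: 80,
  ignore_similar_ignored_domains: true,
};

// Hypothetical helper: per-site value wins, then global, then the default.
function resolveSimilaritySettings(globalConfig = {}, siteConfig = {}) {
  const resolved = {};
  for (const key of Object.keys(SIMILARITY_DEFAULTS)) {
    // "??" only falls through on null/undefined, so an explicit false is kept.
    resolved[key] = siteConfig[key] ?? globalConfig[key] ?? SIMILARITY_DEFAULTS[key];
  }
  return resolved;
}

const settings = resolveSimilaritySettings(
  { ignore_similar_threshold: 75 },
  { ignore_similar_threshold: 90 }
);
console.log(settings.ignore_similar_threshold); // 90
```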
506
+ .SS Similarity Examples
507
+
508
+ With default settings (\fBignore_similar_threshold: 80\fR):
509
+ .RS
510
+ .IP \(bu 4
511
+ \fBanimerco.com\fR vs \fBanimerco.org\fR → 100% similar → Ignored
512
+ .IP \(bu 4
513
+ \fBgoogle.com\fR vs \fBgoogle.co.uk\fR → 100% similar → Ignored
514
+ .IP \(bu 4
515
+ \fBamazon.com\fR vs \fBamazon2.org\fR → 89% similar → Ignored
516
+ .IP \(bu 4
517
+ \fBfacebook.com\fR vs \fBfaceboook.com\fR → 91% similar → Ignored
518
+ .IP \(bu 4
519
+ \fBapple.com\fR vs \fBmicrosoft.com\fR → 0% similar → Kept
520
+ .RE
521
+
522
+ .SH EXAMPLES
523
+
524
+ .SS Basic malware domain detection:
525
+ .EX
526
+ {
527
+ "url": "https://suspicious-site.com",
528
+ "filterRegex": "\\\\.(space|website|tech|buzz)\\\\b",
529
+ "resourceTypes": ["script", "xhr", "fetch"]
530
+ }
531
+ .EE
532
+
533
+ .SS Configuration with similarity filtering:
534
+ .EX
535
+ {
536
+ "ignoreDomains": ["google.com", "facebook.com", "amazon.com"],
537
+ "ignore_similar": true,
538
+ "ignore_similar_threshold": 80,
539
+ "ignore_similar_ignored_domains": true,
540
+ "sites": [
541
+ {
542
+ "url": "https://ad-network.com",
543
+ "filterRegex": "\\\\.(top|click|buzz)\\\\b",
544
+ "ignore_similar": true,
545
+ "ignore_similar_threshold": 85,
546
+ "resourceTypes": ["script", "fetch"]
547
+ }
548
+ ]
549
+ }
550
+ .EE
551
+
552
+ .SS Content analysis with OR logic search:
553
+ .EX
554
+ {
555
+ "url": "https://ad-network.com",
556
+ "filterRegex": "\\\\.(top|click|buzz)\\\\b",
557
+ "searchstring": ["tracking", "analytics", "pixel"],
558
+ "curl": true,
559
+ "resourceTypes": ["script", "fetch"]
560
+ }
561
+ .EE
562
+
563
+ .SS Content analysis with AND logic (all terms required):
564
+ .EX
565
+ {
566
+ "url": "https://crypto-site.com",
567
+ "filterRegex": "\\\\.(space|website)\\\\b",
568
+ "searchstring_and": ["mining", "crypto", "wallet"],
569
+ "curl": true,
570
+ "grep": true
571
+ }
572
+ .EE
573
+
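The difference between \fBsearchstring\fR (OR logic) and \fBsearchstring_and\fR (AND logic) comes down to "any term present" versus "every term present". A minimal sketch, assuming plain case-sensitive substring checks (the man page does not specify case handling); `matchesAny` and `matchesAll` are hypothetical names.

```javascript
// Accept either a single string or an array of strings, as the config allows.
function toTermList(value) {
  return Array.isArray(value) ? value : [value];
}

// OR logic: at least one term occurs in the content.
function matchesAny(content, terms) {
  return toTermList(terms).some((term) => content.includes(term));
}

// AND logic: every term occurs in the content.
function matchesAll(content, terms) {
  return toTermList(terms).every((term) => content.includes(term));
}

const body = "free crypto mining pool";
console.log(matchesAny(body, ["mining", "crypto", "wallet"])); // true
console.log(matchesAll(body, ["mining", "crypto", "wallet"])); // false ("wallet" is missing)
```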
574
+ .SS WHOIS-based malicious domain detection:
575
+ .EX
576
+ {
577
+ "url": "https://phishing-target.com",
578
+ "filterRegex": "\\\\.(top|click|buzz|space)\\\\b",
579
+ "whois": ["privacy", "protection"],
580
+ "whois_server": "whois.verisign-grs.com",
581
+ "resourceTypes": ["script", "image", "fetch"]
582
+ }
583
+ .EE
584
+
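The WHOIS retry options (\fBwhois_max_retries\fR, \fBwhois_timeout_multiplier\fR) suggest a timeout that grows with each retry. The sketch below shows one plausible schedule; it is an assumption that the multiplier compounds per retry and that retries are counted in addition to the first attempt, and the 10000 ms base timeout is a made-up figure for illustration. `whoisTimeoutSchedule` is a hypothetical helper.

```javascript
// Hypothetical helper: timeout (ms) for the first attempt and each retry.
// Defaults mirror the documented whois_max_retries (2) and multiplier (1.5).
function whoisTimeoutSchedule(baseTimeoutMs, maxRetries = 2, multiplier = 1.5) {
  const schedule = [];
  // Attempt 0 is the initial query; attempts 1..maxRetries are retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    schedule.push(Math.round(baseTimeoutMs * Math.pow(multiplier, attempt)));
  }
  return schedule;
}

console.log(whoisTimeoutSchedule(10000)); // [ 10000, 15000, 22500 ]
```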
585
+ .SS Combined content and network analysis with similarity filtering:
586
+ .EX
587
+ {
588
+ "ignoreDomains": ["google.com", "googlee.com"],
589
+ "ignore_similar": true,
590
+ "ignore_similar_threshold": 75,
591
+ "ignore_similar_ignored_domains": true,
592
+ "sites": [
593
+ {
594
+ "url": "https://complex-threat.com",
595
+ "filterRegex": "\\\\.(space|website|tech)\\\\b",
596
+ "searchstring_and": ["bitcoin", "mining"],
597
+ "whois": ["privacy"],
598
+ "dig-or": ["tor", "onion"],
599
+ "curl": true,
600
+ "ignore_similar_threshold": 90,
601
+ "resourceTypes": ["script", "fetch", "xhr"]
602
+ }
603
+ ]
604
+ }
605
+ .EE
606
+
607
+ .SS Configuration with documentation comments:
608
+ .EX
609
+ {
610
+ "comments": ["Testing malware sites", "Updated 2025-01-15", "https://docs.example.com/config"],
611
+ "ignore_similar": true,
612
+ "ignore_similar_threshold": 80,
613
+ "sites": [
614
+ {
615
+ "url": "https://suspicious-site.com",
616
+ "comments": "Main phishing target for Q1 testing",
617
+ "filterRegex": "\\\\.(space|website|tech|buzz)\\\\b",
618
+ "resourceTypes": ["script", "xhr", "fetch"]
619
+ },
620
+ {
621
+ "url": "https://crypto-mining.com",
622
+ "comments": ["Cryptojacking site", "Added by security team", "Ticket #12345"],
623
+ "filterRegex": "\\\\.(top|click)\\\\b",
624
+ "searchstring": ["mining", "crypto"],
625
+ "curl": true,
626
+ "ignore_similar": false
627
+ }
628
+ ]
629
+ }
630
+ .EE
631
+
632
+ .SS Command line usage examples:
633
+
634
+ .SS Run with debug mode (similarity filtering is enabled by default):
635
+ .EX
636
+ node nwss.js --debug --dry-run --verbose
637
+ .EE
638
+
639
+ .SS Run with adblock output format:
640
+ .EX
641
+ node nwss.js --output rules.txt --adblock-rules --remove-dupes
642
+ .EE
643
+
644
+ .SS Validate configuration and rules:
645
+ .EX
646
+ node nwss.js --validate-config
647
+ node nwss.js --validate-rules rules.txt
648
+ node nwss.js --clean-rules --remove-dupes --dry-run
649
+ .EE
650
+
651
+ .SS Advanced validation and cleaning:
652
+ .EX
653
+ node nwss.js --clean-rules rules.txt --remove-dupes
654
+ node nwss.js --test-validation
655
+ .EE
656
+
657
+ .SS Multiple output formats:
658
+ .EX
659
+ node nwss.js -o hosts.txt --localhost --remove-dupes
660
+ node nwss.js -o dnsmasq.conf --dnsmasq --titles
661
+ node nwss.js -o pihole_regex.txt --pihole --debug
662
+ .EE
663
+
664
+ .SS Cloudflare bypass and fingerprint spoofing:
665
+ .EX
666
+ {
667
+ "url": "https://protected-site.com",
668
+ "filterRegex": "\\\\.(top|buzz)\\\\b",
669
+ "cloudflare_bypass": true,
670
+ "cloudflare_phish": true,
671
+ "fingerprint_protection": "random",
672
+ "isBrave": true,
673
+ "userAgent": "chrome"
674
+ }
675
+ .EE
676
+
677
+ .SS FlowProxy protection handling:
678
+ .EX
679
+ {
680
+ "url": "https://flowproxy-protected.com",
681
+ "filterRegex": "\\\\.(space|website)\\\\b",
682
+ "flowproxy_detection": true,
683
+ "flowproxy_page_timeout": 45000,
684
+ "flowproxy_nav_timeout": 45000,
685
+ "flowproxy_js_timeout": 15000,
686
+ "flowproxy_delay": 30000,
687
+ "flowproxy_additional_delay": 5000
688
+ }
689
+ .EE
690
+
691
+ .SH OUTPUT FORMATS
692
+
693
+ The scanner supports multiple output formats for different blocking systems:
694
+
695
+ .SS Standard Adblock Format
696
+ Default format: \fB||domain.com^\fR
697
+ .br
698
+ Compatible with uBlock Origin, AdBlock Plus, and other browser ad blockers.
699
+
700
+ .SS Privoxy Format
701
+ Flag: \fB\--privoxy\fR
702
+ .br
703
+ Format: \fB{ +block } .domain.com\fR
704
+ .br
705
+ For use in Privoxy action files. The leading dot blocks domain and all subdomains.
706
+
707
+ .SS Pi-hole Regex Format
708
+ Flag: \fB\--pihole\fR
709
+ .br
710
+ Format: \fB(^|\\.)domain\\.com$\fR
711
+ .br
712
+ For Pi-hole regex filters. Blocks domain and subdomains at DNS level.
713
+
714
+ .SS Hosts File Formats
715
+ Flags: \fB\--localhost\fR, \fB\--localhost-0.0.0.0\fR
716
+ .br
717
+ Formats: \fB127.0.0.1 domain.com\fR, \fB0.0.0.0 domain.com\fR
718
+ .br
719
+ For system hosts files.
720
+
721
+ .SS DNS Server Formats
722
+ Flags: \fB\--dnsmasq\fR, \fB\--dnsmasq-old\fR, \fB\--unbound\fR
723
+ .br
724
+ For dnsmasq and Unbound DNS servers.
725
+
726
+ .SS Plain Domain Format
727
+ Flag: \fB\--plain\fR
728
+ .br
729
+ Format: \fBdomain.com\fR
730
+ .br
731
+ Simple domain list without formatting.
732
+
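The formats in this section are all simple transformations of one domain name. As an illustration only, the mapping can be sketched as a lookup table; `FORMATTERS` and `formatRule` are hypothetical names, the keys mirror the command-line flags, and the output strings are copied from the formats documented above.

```javascript
// Hypothetical lookup table: one formatter per documented output format.
const FORMATTERS = {
  adblock: (d) => `||${d}^`,
  localhost: (d) => `127.0.0.1 ${d}`,
  "localhost-0.0.0.0": (d) => `0.0.0.0 ${d}`,
  plain: (d) => d,
  dnsmasq: (d) => `local=/${d}/`,
  "dnsmasq-old": (d) => `server=/${d}/`,
  unbound: (d) => `local-zone: "${d}." always_null`,
  privoxy: (d) => `{ +block } .${d}`,
  // Pi-hole rules are regexes, so literal dots in the domain are escaped.
  pihole: (d) => `(^|\\.)${d.replace(/\./g, "\\.")}$`,
};

// Hypothetical helper: format one domain, defaulting to the adblock style.
function formatRule(domain, format = "adblock") {
  const formatter = FORMATTERS[format];
  if (!formatter) {
    throw new Error(`Unknown output format: ${format}`);
  }
  return formatter(domain);
}

console.log(formatRule("domain.com"));           // ||domain.com^
console.log(formatRule("domain.com", "pihole")); // (^|\.)domain\.com$
```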
733
+ .SH FILES
734
+
735
+ .TP
736
+ .B config.json
737
+ Default configuration file containing scan targets and rules.
738
+
739
+ .TP
740
+ .B logs/
741
+ Directory created for debug and matched URL logs when \fB\--debug\fR or \fB\--dumpurls\fR is used.
742
+
743
+ .TP
744
+ .B user.action
745
+ Privoxy action file commonly used as the destination for \fB\--privoxy\fR output.
746
+
747
+ .SH DETECTION METHODS
748
+
749
+ .SS URL Pattern Matching
750
+ Uses regex patterns to identify suspicious domains and request URLs.
751
+
752
+ .SS Content Analysis
753
+ Downloads page content with curl and searches for malicious strings using JavaScript or grep.
754
+
755
+ .SS Network Tools Integration
756
+ Performs WHOIS and DNS lookups to identify suspicious domain registrations.
757
+
758
+ .SS Browser-Based Analysis
759
+ Uses Puppeteer to monitor network requests, analyze frames, and detect dynamic threats.
760
+
761
+ .SS Resource Type Filtering
762
+ Filters analysis by HTTP resource type (script, xhr, fetch, image, etc.).
763
+
764
+ .SS Similarity-Based Filtering
765
+ Automatically filters out domains similar to already-found domains or those in the ignore list, supporting 70+ international TLD formats.
766
+
767
+ .SH SECURITY FEATURES
768
+
769
+ .SS Fingerprint Spoofing
770
+ Randomizes browser fingerprints to avoid detection by malicious sites.
771
+
772
+ .SS Request Blocking
773
+ Blocks suspicious requests during scanning to prevent malware execution.
774
+
775
+ .SS Frame Isolation
776
+ Safely analyzes iframe content without executing malicious scripts.
777
+
778
+ .SS Cloudflare Bypass
779
+ Automatically handles Cloudflare protection challenges.
780
+
781
+ .SS FlowProxy Protection
782
+ Detects and handles FlowProxy protection systems.
783
+
784
+ .SS Intelligent Domain Filtering
785
+ Similarity filtering avoids reporting the same base domain again under international TLDs or near-identical spellings.
786
+
787
+ .SH EXIT STATUS
788
+ .TP
789
+ .B 0
790
+ Success. All URLs processed successfully.
791
+ .TP
792
+ .B 1
793
+ Error in configuration, file access, or critical failure.
794
+
795
+ .SH BUGS
796
+ Frame navigation errors may appear in debug output but do not affect detection functionality.
797
+
798
+ Report bugs to the project repository or maintainer.
799
+
800
+ .SH SEE ALSO
801
+ .BR curl (1),
802
+ .BR grep (1),
803
+ .BR whois (1),
804
+ .BR dig (1),
805
+ .BR dnsmasq (8),
806
+ .BR unbound (8),
807
+ .BR privoxy (8)
808
+
809
+ .SH AUTHORS
810
+ Written for malware research and network security analysis.
811
+
812
+ .SH COPYRIGHT
813
+ Copyright (C) 2025 Free Software Foundation, Inc.
814
+ This is free software; you can redistribute it and/or modify it under the
815
+ terms of the GNU General Public License as published by the Free Software
816
+ Foundation; either version 3 of the License, or (at your option) any later
817
+ version.
818
+
819
+ This program is distributed in the hope that it will be useful, but WITHOUT
820
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
821
+ FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
822
+
823
+ You should have received a copy of the GNU General Public License along with
824
+ this program. If not, see <https://www.gnu.org/licenses/>.