@fanboynz/network-scanner 1.0.35
- package/.github/workflows/npm-publish.yml +33 -0
- package/JSONMANUAL.md +121 -0
- package/LICENSE +674 -0
- package/README.md +357 -0
- package/config.json +74 -0
- package/lib/browserexit.js +522 -0
- package/lib/browserhealth.js +308 -0
- package/lib/cloudflare.js +660 -0
- package/lib/colorize.js +168 -0
- package/lib/compare.js +159 -0
- package/lib/compress.js +129 -0
- package/lib/fingerprint.js +613 -0
- package/lib/flowproxy.js +274 -0
- package/lib/grep.js +348 -0
- package/lib/ignore_similar.js +237 -0
- package/lib/nettools.js +1200 -0
- package/lib/output.js +633 -0
- package/lib/redirect.js +384 -0
- package/lib/searchstring.js +561 -0
- package/lib/validate_rules.js +1107 -0
- package/nwss.1 +824 -0
- package/nwss.js +2488 -0
- package/package.json +45 -0
- package/regex-samples.md +27 -0
- package/scanner-script-org.js +588 -0
package/nwss.1
ADDED
@@ -0,0 +1,824 @@
+.TH NWSS-SCRIPT 1 "2025" "scanner-script v1.0.32" "User Commands"
+.SH NAME
+NWSS scanner-script \- Network scanner for malware detection and domain analysis with advanced similarity filtering
+
+.SH SYNOPSIS
+.B node nwss.js
+[\fIOPTIONS\fR]
+
+.SH DESCRIPTION
+.B nwss.js
+is a comprehensive network scanner that uses Puppeteer to analyze web pages for malicious content, tracking scripts, and suspicious domains. It can detect threats through URL pattern matching, content analysis, DNS/WHOIS lookups, and behavioral analysis.
+
+The scanner supports multiple detection methods including regex filtering, content searching with curl/grep, network tools integration, and advanced browser-based analysis with frame monitoring and fingerprint spoofing. It includes intelligent domain similarity filtering to reduce noise and improve detection accuracy.
+
+.SH OPTIONS
+
+.SS Output Options
+.TP
+.BR \-o ", " \--output " \fIFILE\fR"
+Write rules to \fIFILE\fR instead of standard output.
+
+.TP
+.BR \--compare " \fIFILE\fR"
+Remove rules that already exist in \fIFILE\fR before output (requires \fB\-o\fR).
+
+.TP
+.B \--append
+Append new rules to output file instead of overwriting (requires \fB\-o\fR).
+
+.SS Output Format Options
+.TP
+.B \--localhost
+Output rules as \fB127.0.0.1 domain.com\fR format for hosts file.
+
+.TP
+.B \--localhost-0.0.0.0
+Output rules as \fB0.0.0.0 domain.com\fR format for hosts file.
+
+.TP
+.B \--plain
+Output just domain names without any formatting.
+
+.TP
+.B \--dnsmasq
+Output as \fBlocal=/domain.com/\fR format for dnsmasq.
+
+.TP
+.B \--dnsmasq-old
+Output as \fBserver=/domain.com/\fR format for older dnsmasq versions.
+
+.TP
+.B \--unbound
+Output as \fBlocal-zone: "domain.com." always_null\fR format for Unbound DNS.
+
+.TP
+.B \--privoxy
+Output as \fB{ +block } .domain.com\fR format for Privoxy action files.
+
+.TP
+.B \--pihole
+Output as \fB(^|\\.)domain\\.com$\fR format for Pi-hole regex filters.
+
+.TP
+.B \--adblock-rules
+Generate adblock filter rules with resource type modifiers (requires \fB\-o\fR).
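
The output format options above each map one domain to one rule string. The mapping can be sketched as a small lookup table. This is an illustration only: `FORMATTERS`, `escapeRegex`, and `formatRule` are hypothetical names, not functions taken from the package, and the `--adblock-rules` resource type modifiers are not modelled.

```javascript
// Hypothetical helper: escape regex metacharacters so dots match literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// One formatter per documented output format; the keys mirror the flag names.
const FORMATTERS = {
  adblock: (d) => `||${d}^`,
  localhost: (d) => `127.0.0.1 ${d}`,
  'localhost-0.0.0.0': (d) => `0.0.0.0 ${d}`,
  plain: (d) => d,
  dnsmasq: (d) => `local=/${d}/`,
  'dnsmasq-old': (d) => `server=/${d}/`,
  unbound: (d) => `local-zone: "${d}." always_null`,
  privoxy: (d) => `{ +block } .${d}`,
  pihole: (d) => `(^|\\.)${escapeRegex(d)}$`,
};

// Render a domain in the requested format (adblock is the documented default).
function formatRule(domain, format = 'adblock') {
  const formatter = FORMATTERS[format];
  if (!formatter) {
    throw new Error(`Unknown output format: ${format}`);
  }
  return formatter(domain);
}

console.log(formatRule('domain.com', 'pihole'));
console.log(formatRule('domain.com', 'unbound'));
```

Keeping the formats in one table makes it easy to compare them side by side; how the scanner resolves several format flags given at once is not covered here.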
+
+.SS General Options
+.TP
+.B \--verbose
+Enable verbose output globally for all sites.
+
+.TP
+.B \--debug
+Enable debug mode with detailed logging of all network requests.
+
+.TP
+.B \--silent
+Suppress normal console output (errors and warnings still shown).
+
+.TP
+.B \--titles
+Add comment lines with site URLs before each rule group.
+
+.TP
+.B \--dumpurls
+Log all matched URLs to timestamped log files in \fBlogs/\fR directory.
+
+.TP
+.B \--compress-logs
+Compress log files with gzip after completion (requires \fB\--dumpurls\fR).
+
+.TP
+.B \--sub-domains
+Output full subdomains instead of collapsing to root domains.
+
+.TP
+.B \--no-interact
+Disable mouse simulation and page interaction globally.
+
+.TP
+.BR \--custom-json " \fIFILE\fR"
+Use \fIFILE\fR instead of \fBconfig.json\fR for configuration.
+
+.TP
+.B \--headful
+Launch browser with GUI instead of headless mode.
+
+.TP
+.B \--cdp
+Enable Chrome DevTools Protocol logging for network analysis.
+
+.TP
+.B \--remove-dupes
+Remove duplicate domains from output (only with \fB\-o\fR).
+
+.TP
+.B \--eval-on-doc
+Globally enable JavaScript injection for Fetch/XHR interception.
+
+.TP
+.B \--dry-run
+Console output only: show matching regex, titles, whois/dig/searchstring results, and adblock rules without writing files.
+
+.TP
+.B \--remove-tempfiles
+Remove Chrome/Puppeteer temporary files before exit.
+
+.TP
+.BR \-h ", " \--help
+Show help message and exit.
+
+.TP
+.B \--version
+Show version information and exit.
+
+.SS Validation Options
+.TP
+.B \--validate-config
+Validate config.json file and exit.
+
+.TP
+.B \--validate-rules [\fIFILE\fR]
+Validate rule file format (uses \fB\--output\fR/\fB\--compare\fR files if no file specified).
+
+.TP
+.B \--clean-rules [\fIFILE\fR]
+Clean rule files by removing invalid lines and optionally duplicates (uses \fB\--output\fR/\fB\--compare\fR files if no file specified).
+
+.TP
+.B \--test-validation
+Run domain validation tests and exit.
+
+.SH CONFIGURATION
+
+Configuration is provided via JSON files. The default configuration file is \fBconfig.json\fR.
+
+.SS Global Configuration Options
+
+.TP
+.B ignoreDomains
+Array of domains to completely ignore. Supports wildcards (e.g., \fB"*.ads.com"\fR).
+
+.TP
+.B blocked
+Array of global regex patterns to block requests.
+
+.TP
+.B whois_delay
+Default delay between whois requests in milliseconds (default: 3000).
+
+.TP
+.B whois_server_mode
+Default server selection mode for all sites: \fB"random"\fR or \fB"cycle"\fR (default: "random").
+
+.TP
+.B ignore_similar
+Boolean. Ignore domains similar to already found domains (default: true).
+
+.TP
+.B ignore_similar_threshold
+Number. Similarity threshold percentage for ignore_similar (default: 80).
+
+.TP
+.B ignore_similar_ignored_domains
+Boolean. Ignore domains similar to ignoreDomains list (default: true).
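
The wildcard support mentioned for ignoreDomains can be pictured as a glob-to-regex conversion. The sketch below is one plausible reading, not the scanner's actual matcher: `wildcardToRegex` and `isIgnored` are hypothetical names, and the choices that matching is case-insensitive and that `*.ads.com` does not match the bare `ads.com` are assumptions.

```javascript
// Hypothetical: turn an ignoreDomains entry such as "*.ads.com" into an
// anchored, case-insensitive regular expression.
function wildcardToRegex(pattern) {
  const escaped = pattern
    .split('*')
    // Escape regex metacharacters in the literal parts between wildcards.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`, 'i');
}

// Assumption: a hostname is ignored when any entry matches it in full.
function isIgnored(hostname, ignoreDomains) {
  return ignoreDomains.some((entry) => wildcardToRegex(entry).test(hostname));
}

const ignoreDomains = ['*.ads.com', 'google.com'];
console.log(isIgnored('cdn.ads.com', ignoreDomains));
console.log(isIgnored('example.com', ignoreDomains));
```

Under these assumptions a bare entry matches only that exact hostname; whether the real scanner also treats bare entries as covering subdomains is not stated in this page.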
+
+.SS Per-Site Configuration Options
+
+.TP
+.B url
+Single URL string or array of URLs to scan.
+
+.TP
+.B filterRegex
+Regex pattern(s) to match suspicious requests.
+
+.TP
+.B comments
+Documentation strings or notes - completely ignored by the scanner. Can be a single string or array of strings. Used for adding context, URLs, timestamps, or any documentation notes to configuration files.
+
+.TP
+.B searchstring
+Text string(s) to search for in response content (OR logic).
+
+.TP
+.B searchstring_and
+Text string(s) that must ALL be present in content (AND logic).
+
+.TP
+.B curl
+Boolean. Use curl to download and analyze content.
+
+.TP
+.B grep
+Boolean. Use system grep for faster pattern matching (requires \fBcurl=true\fR).
+
+.TP
+.B resourceTypes
+Array of resource types to process (e.g., \fB["script", "xhr", "fetch"]\fR).
+
+.TP
+.B blocked
+Array of regex patterns to block requests for this site.
+
+.TP
+.B css_blocked
+Array of CSS selectors to hide elements on the page.
+
+.TP
+.B userAgent
+Spoof User-Agent: \fB"chrome"\fR, \fB"firefox"\fR, or \fB"safari"\fR.
+
+.TP
+.B interact
+Boolean. Simulate mouse movements and clicks.
+
+.TP
+.B delay
+Milliseconds to wait after page load (default: 4000).
+
+.TP
+.B reload
+Number of times to reload the page (default: 1).
+
+.TP
+.B timeout
+Request timeout in milliseconds (default: 30000).
+
+.TP
+.B firstParty
+Boolean. Allow first-party request matching (default: false).
+
+.TP
+.B thirdParty
+Boolean. Allow third-party request matching (default: true).
+
+.TP
+.B fingerprint_protection
+Boolean or \fB"random"\fR. Enable browser fingerprint spoofing.
+
+.TP
+.B ignore_similar
+Boolean. Override global ignore_similar setting for this site.
+
+.TP
+.B ignore_similar_threshold
+Number. Override global similarity threshold for this site.
+
+.TP
+.B ignore_similar_ignored_domains
+Boolean. Override global ignore_similar_ignored_domains for this site.
+
+.TP
+.B even_blocked
+Boolean. Add matching rules even if requests are blocked (default: false).
+
+.TP
+.B whois
+Array of terms that must ALL be found in WHOIS data (AND logic).
+
+.TP
+.B whois-or
+Array of terms where ANY must be found in WHOIS data (OR logic).
+
+.TP
+.B whois_server
+Custom WHOIS server(s) to use for lookups.
+
+.TP
+.B whois_server_mode
+Server selection mode: \fB"random"\fR (default) or \fB"cycle"\fR through list.
+
+.TP
+.B whois_max_retries
+Number. Maximum retry attempts per domain for WHOIS queries (default: 2).
+
+.TP
+.B whois_timeout_multiplier
+Number. Timeout increase multiplier per retry (default: 1.5).
+
+.TP
+.B whois_use_fallback
+Boolean. Add TLD-specific fallback servers for WHOIS (default: true).
+
+.TP
+.B whois_retry_on_timeout
+Boolean. Retry on timeout errors (default: true).
+
+.TP
+.B whois_retry_on_error
+Boolean. Retry on connection/other errors (default: false).
+
+.TP
+.B whois_delay
+Milliseconds. Delay between whois requests for this site (default: global whois_delay).
+
+.TP
+.B dig
+Array of terms that must ALL be found in DNS records (AND logic).
+
+.TP
+.B dig-or
+Array of terms where ANY must be found in DNS records (OR logic).
+
+.TP
+.B digRecordType
+DNS record type for dig queries (default: "A").
+
+.TP
+.B dig_subdomain
+Boolean. Use subdomain for dig lookup instead of root domain (default: false).
+
+.TP
+.B goto_options
+Object. Custom page.goto() options for Puppeteer navigation. Available options:
+.RS
+.IP \(bu 4
+\fBwaitUntil\fR: When to consider navigation successful. Options:
+.RS
+.IP \(bu 4
+\fB"load"\fR - Wait for all resources to load (default)
+.IP \(bu 4
+\fB"domcontentloaded"\fR - Wait for DOM only, faster loading
+.IP \(bu 4
+\fB"networkidle0"\fR - Wait until 0 network requests for 500ms
+.IP \(bu 4
+\fB"networkidle2"\fR - Wait until ≤2 network requests for 500ms
+.RE
+.IP \(bu 4
+\fBtimeout\fR: Maximum navigation time in milliseconds (overrides site timeout)
+.IP \(bu 4
+\fBreferer\fR: Referer header to send with navigation request
+.RE
+Example: \fB{"waitUntil": "networkidle2", "timeout": 60000}\fR
+
+.TP
+.B forcereload
+Boolean. Force an additional reload with cache disabled after normal reloads.
+
+.TP
+.B clear_sitedata
+Boolean. Clear all cookies, cache, and storage before each page load (default: false).
+
+.TP
+.B isBrave
+Boolean. Spoof Brave browser detection.
+
+.TP
+.B evaluateOnNewDocument
+Boolean. Inject Fetch/XHR interceptor scripts into page context.
+
+.TP
+.B cdp
+Boolean. Enable Chrome DevTools Protocol logging for this specific site.
+
+.TP
+.B source
+Boolean. Save page source HTML after loading.
+
+.TP
+.B screenshot
+Boolean. Capture screenshot on page load failure.
+
+.TP
+.B headful
+Boolean. Launch browser with GUI for this specific site.
+
+.TP
+.B adblock_rules
+Boolean. Generate adblock filter rules with resource types for this site.
+
+.TP
+.B cloudflare_phish
+Boolean. Auto-click through Cloudflare phishing warnings (default: false).
+
+.TP
+.B cloudflare_bypass
+Boolean. Auto-solve Cloudflare "Verify you are human" challenges (default: false).
+
+.TP
+.B flowproxy_detection
+Boolean. Enable flowProxy protection detection and handling (default: false).
+
+.TP
+.B flowproxy_page_timeout
+Milliseconds. Page timeout for flowProxy sites (default: 45000).
+
+.TP
+.B flowproxy_nav_timeout
+Milliseconds. Navigation timeout for flowProxy sites (default: 45000).
+
+.TP
+.B flowproxy_js_timeout
+Milliseconds. JavaScript challenge timeout (default: 15000).
+
+.TP
+.B flowproxy_delay
+Milliseconds. Delay for rate limiting (default: 30000).
+
+.TP
+.B flowproxy_additional_delay
+Milliseconds. Additional processing delay (default: 5000).
+
+.TP
+.B verbose
+Boolean. Enable verbose output for this specific site.
+
+.TP
+.B subDomains
+Number. Output full subdomains instead of root domains (1/0).
+
+.TP
+.B localhost
+Boolean. Force localhost output format (127.0.0.1) for this site.
+
+.TP
+.B localhost_0_0_0_0
+Boolean. Force localhost output format (0.0.0.0) for this site.
+
+.TP
+.B dnsmasq
+Boolean. Force dnsmasq output format for this site.
+
+.TP
+.B dnsmasq_old
+Boolean. Force dnsmasq old format for this site.
+
+.TP
+.B unbound
+Boolean. Force unbound output format for this site.
+
+.TP
+.B privoxy
+Boolean. Force Privoxy output format for this site.
+
+.TP
+.B pihole
+Boolean. Force Pi-hole regex output format for this site.
+
+.TP
+.B plain
+Boolean. Force plain domain output for this site.
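
Among the per-site options, whois_max_retries and whois_timeout_multiplier together imply a growing timeout per attempt. A minimal sketch of that arithmetic follows. It rests on assumptions: `whoisTimeouts` is a hypothetical helper, the 10000 ms base timeout is an invented value, and the reading that "2 retries" means one initial attempt plus two retries is a guess, since the page does not say.

```javascript
// Hypothetical: list the timeout used for each WHOIS attempt.
// Assumes attempt 0 is the initial query and each retry multiplies the
// previous timeout by whois_timeout_multiplier.
function whoisTimeouts(baseTimeoutMs, site = {}) {
  const maxRetries = site.whois_max_retries ?? 2; // documented default
  const multiplier = site.whois_timeout_multiplier ?? 1.5; // documented default
  const timeouts = [];
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    timeouts.push(Math.round(baseTimeoutMs * multiplier ** attempt));
  }
  return timeouts;
}

// With the documented defaults and an assumed 10 s base timeout.
console.log(whoisTimeouts(10000)); // [ 10000, 15000, 22500 ]
```

Whether a given failure triggers a retry at all would additionally depend on whois_retry_on_timeout and whois_retry_on_error, which this sketch leaves out.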
+
+.SH SIMILARITY FILTERING
+
+The scanner includes advanced similarity filtering to reduce noise and improve detection accuracy by automatically ignoring domains that are very similar to ones already found or explicitly ignored.
+
+.SS Two-Layer Similarity Protection
+
+.TP
+.B Standard Similarity Filtering
+Ignores domains similar to already-found domains during scanning. For example, if \fBanimerco.com\fR is found, \fBanimerco.org\fR and \fBanimerco.net\fR will be automatically ignored (100% base domain similarity).
+
+.TP
+.B Ignored Domains Similarity Filtering
+Ignores domains similar to those in the \fBignoreDomains\fR list. For example, if \fBgoogle.com\fR is in ignoreDomains, then \fBgoogle.co.uk\fR, \fBgoogle.com.au\fR, and \fBgooglee.com\fR will be automatically ignored.
+
+.SS Multi-Part TLD Support
+
+The similarity engine correctly handles 70+ international multi-part TLDs including:
+.RS
+.IP \(bu 4
+\fBEurope\fR: .co.uk, .org.uk, .com.de, .com.fr, .com.es, .com.it, .com.pl, .com.ru
+.IP \(bu 4
+\fBAsia-Pacific\fR: .co.jp, .or.jp, .com.au, .org.au, .co.nz, .org.nz, .com.cn, .org.cn
+.IP \(bu 4
+\fBAmericas\fR: .com.br, .org.br, .com.ar, .org.ar, .com.mx, .org.mx, .com.co
+.IP \(bu 4
+\fBOthers\fR: .co.za, .org.za, .co.il, .org.il, .com.eg, .org.eg
+.RE
+
+.SS Similarity Configuration
+
+.TP
+.B ignore_similar
+Global and per-site boolean to enable/disable similarity filtering (default: true).
+
+.TP
+.B ignore_similar_threshold
+Similarity threshold percentage 0-100. Higher values require a closer match before a domain is ignored, so fewer domains are filtered (default: 80).
+
+.TP
+.B ignore_similar_ignored_domains
+Global and per-site boolean to enable similarity filtering against ignoreDomains (default: true).
+
+.SS Similarity Examples
+
+With default settings (\fBignore_similar_threshold: 80\fR):
+.RS
+.IP \(bu 4
+\fBanimerco.com\fR vs \fBanimerco.org\fR → 100% similar → Ignored
+.IP \(bu 4
+\fBgoogle.com\fR vs \fBgoogle.co.uk\fR → 100% similar → Ignored
+.IP \(bu 4
+\fBamazon.com\fR vs \fBamazon2.org\fR → 89% similar → Ignored
+.IP \(bu 4
+\fBfacebook.com\fR vs \fBfaceboook.com\fR → 91% similar → Ignored
+.IP \(bu 4
+\fBapple.com\fR vs \fBmicrosoft.com\fR → 0% similar → Kept
+.RE
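
The page does not say which metric produces these percentages. As a rough illustration of the idea only, the sketch below strips the TLD (including a few of the multi-part TLDs listed earlier) and compares the remaining base names with a normalized Levenshtein distance. The metric, the short TLD list, and all function names are assumptions, so the scores it yields can differ from the figures above.

```javascript
// Small, illustrative subset of the multi-part TLDs named in this page.
const MULTI_PART_TLDS = ['co.uk', 'org.uk', 'com.au', 'co.jp', 'com.br', 'co.nz', 'co.za'];

// Hypothetical: reduce "www.google.co.uk" to its base name "google".
function baseName(domain) {
  const host = domain.toLowerCase();
  if (!host.includes('.')) return host;
  const suffix = MULTI_PART_TLDS.find((tld) => host.endsWith(`.${tld}`));
  const withoutTld = suffix
    ? host.slice(0, -(suffix.length + 1))
    : host.slice(0, host.lastIndexOf('.'));
  return withoutTld.split('.').pop();
}

// Classic edit distance, computed row by row.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i += 1) {
    const curr = [i];
    for (let j = 1; j <= b.length; j += 1) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Similarity of the two base names as a percentage from 0 to 100.
function similarityPercent(domainA, domainB) {
  const a = baseName(domainA);
  const b = baseName(domainB);
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 100;
  return Math.round((1 - levenshtein(a, b) / longest) * 100);
}

// Assumption: a candidate is ignored when it reaches the threshold
// against any already-known domain.
function isTooSimilar(candidate, knownDomains, threshold = 80) {
  return knownDomains.some((d) => similarityPercent(candidate, d) >= threshold);
}

console.log(similarityPercent('google.com', 'google.co.uk')); // 100
console.log(isTooSimilar('faceboook.com', ['facebook.com'])); // true
console.log(isTooSimilar('microsoft.com', ['apple.com'])); // false
```

Comparing base names rather than full hostnames is what lets `google.com` and `google.co.uk` score as identical under this reading.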
+
+.SH EXAMPLES
+
+.SS Basic malware domain detection:
+.EX
+{
+  "url": "https://suspicious-site.com",
+  "filterRegex": "\\\\.(space|website|tech|buzz)\\\\b",
+  "resourceTypes": ["script", "xhr", "fetch"]
+}
+.EE
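
Once roff renders the doubled backslashes, the JSON file itself carries `\\.` and `\\b`, which JSON parsing turns into the regex `\.(space|website|tech|buzz)\b`. The sketch below walks through that step. The request URLs are invented for illustration, `matchesFilter` is a hypothetical helper, and building the pattern with `new RegExp` and no flags is an assumption about how the scanner consumes it.

```javascript
// The site entry as it would appear in config.json.
// String.raw keeps the backslashes exactly as they are written in the file.
const siteJson = String.raw`{
  "url": "https://suspicious-site.com",
  "filterRegex": "\\.(space|website|tech|buzz)\\b",
  "resourceTypes": ["script", "xhr", "fetch"]
}`;

const site = JSON.parse(siteJson);
// After parsing, the pattern has single backslashes.
const filter = new RegExp(site.filterRegex);

// Hypothetical check: a request counts only if its resource type is listed
// and its URL matches the pattern.
function matchesFilter(requestUrl, resourceType) {
  return site.resourceTypes.includes(resourceType) && filter.test(requestUrl);
}

console.log(site.filterRegex);
console.log(matchesFilter('https://cdn.evil.space/track.js', 'script')); // true
console.log(matchesFilter('https://cdn.example.com/app.js', 'script')); // false
console.log(matchesFilter('https://cdn.evil.space/pixel.gif', 'image')); // false
```

The trailing `\b` keeps `.tech` from matching inside a longer label such as `.technology`.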
+
+.SS Configuration with similarity filtering:
+.EX
+{
+  "ignoreDomains": ["google.com", "facebook.com", "amazon.com"],
+  "ignore_similar": true,
+  "ignore_similar_threshold": 80,
+  "ignore_similar_ignored_domains": true,
+  "sites": [
+    {
+      "url": "https://ad-network.com",
+      "filterRegex": "\\\\.(top|click|buzz)\\\\b",
+      "ignore_similar": true,
+      "ignore_similar_threshold": 85,
+      "resourceTypes": ["script", "fetch"]
+    }
+  ]
+}
+.EE
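
In this configuration the site sets its own threshold of 85 while the global value is 80. Since the per-site options are documented as overriding the global ones, the effective settings can be resolved roughly as follows. `effectiveSimilaritySettings` and `DEFAULTS` are hypothetical names, and the exact precedence code in the scanner may differ.

```javascript
// Documented defaults for the three similarity options.
const DEFAULTS = {
  ignore_similar: true,
  ignore_similar_threshold: 80,
  ignore_similar_ignored_domains: true,
};

// Hypothetical resolver: per-site value, else global value, else default.
function effectiveSimilaritySettings(globalConfig, site) {
  const settings = {};
  for (const key of Object.keys(DEFAULTS)) {
    settings[key] = site[key] ?? globalConfig[key] ?? DEFAULTS[key];
  }
  return settings;
}

const config = {
  ignore_similar: true,
  ignore_similar_threshold: 80,
  ignore_similar_ignored_domains: true,
  sites: [{ url: 'https://ad-network.com', ignore_similar: true, ignore_similar_threshold: 85 }],
};

console.log(effectiveSimilaritySettings(config, config.sites[0]));
```

Using `??` rather than `||` matters here: a site that sets `ignore_similar` to `false` must keep that value instead of falling back to the global `true`.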
+
+.SS Content analysis with OR logic search:
+.EX
+{
+  "url": "https://ad-network.com",
+  "filterRegex": "\\\\.(top|click|buzz)\\\\b",
+  "searchstring": ["tracking", "analytics", "pixel"],
+  "curl": true,
+  "resourceTypes": ["script", "fetch"]
+}
+.EE
+
+.SS Content analysis with AND logic (all terms required):
+.EX
+{
+  "url": "https://crypto-site.com",
+  "filterRegex": "\\\\.(space|website)\\\\b",
+  "searchstring_and": ["mining", "crypto", "wallet"],
+  "curl": true,
+  "grep": true
+}
+.EE
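
The difference between the two examples above is only how the search terms are combined. A minimal sketch of that logic is below. `contentMatches` and `toList` are hypothetical names, and two details are assumptions because the page does not state them: matching is case-sensitive, and when both options are present both conditions must hold.

```javascript
// Accept a single string or an array, as the option descriptions allow.
function toList(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Hypothetical: decide whether downloaded content satisfies a site's
// searchstring (any term) and searchstring_and (every term) options.
function contentMatches(content, site) {
  const anyOf = toList(site.searchstring);
  const allOf = toList(site.searchstring_and);
  const orOk = anyOf.length === 0 || anyOf.some((term) => content.includes(term));
  const andOk = allOf.length === 0 || allOf.every((term) => content.includes(term));
  return orOk && andOk;
}

const body = 'loads a crypto mining script';
console.log(contentMatches(body, { searchstring: ['tracking', 'mining'] })); // true
console.log(contentMatches(body, { searchstring_and: ['mining', 'crypto', 'wallet'] })); // false
```

With OR logic one hit is enough, which suits broad keyword lists; AND logic is the stricter choice when a single keyword would produce too many matches.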
+
+.SS WHOIS-based malicious domain detection:
+.EX
+{
+  "url": "https://phishing-target.com",
+  "filterRegex": "\\\\.(top|click|buzz|space)\\\\b",
+  "whois": ["privacy", "protection"],
+  "whois_server": "whois.verisign-grs.com",
+  "resourceTypes": ["script", "image", "fetch"]
+}
+.EE
+
+.SS Combined content and network analysis with similarity filtering:
+.EX
+{
+  "ignoreDomains": ["google.com", "googlee.com"],
+  "ignore_similar": true,
+  "ignore_similar_threshold": 75,
+  "ignore_similar_ignored_domains": true,
+  "sites": [
+    {
+      "url": "https://complex-threat.com",
+      "filterRegex": "\\\\.(space|website|tech)\\\\b",
+      "searchstring_and": ["bitcoin", "mining"],
+      "whois": ["privacy"],
+      "dig-or": ["tor", "onion"],
+      "curl": true,
+      "ignore_similar_threshold": 90,
+      "resourceTypes": ["script", "fetch", "xhr"]
+    }
+  ]
+}
+.EE
+
+.SS Configuration with documentation comments:
+.EX
+{
+  "comments": ["Testing malware sites", "Updated 2025-01-15", "https://docs.example.com/config"],
+  "ignore_similar": true,
+  "ignore_similar_threshold": 80,
+  "sites": [
+    {
+      "url": "https://suspicious-site.com",
+      "comments": "Main phishing target for Q1 testing",
+      "filterRegex": "\\\\.(space|website|tech|buzz)\\\\b",
+      "resourceTypes": ["script", "xhr", "fetch"]
+    },
+    {
+      "url": "https://crypto-mining.com",
+      "comments": ["Cryptojacking site", "Added by security team", "Ticket #12345"],
+      "filterRegex": "\\\\.(top|click)\\\\b",
+      "searchstring": ["mining", "crypto"],
+      "curl": true,
+      "ignore_similar": false
+    }
+  ]
+}
+.EE
+
+.SS Command line usage examples:
+
+.SS Run with debug mode and similarity filtering:
+.EX
+node nwss.js --debug --dry-run --verbose
+.EE
+
+.SS Run with adblock output format:
+.EX
+node nwss.js --output rules.txt --adblock-rules --remove-dupes
+.EE
+
+.SS Validate configuration and rules:
+.EX
+node nwss.js --validate-config
+node nwss.js --validate-rules rules.txt
+node nwss.js --clean-rules --remove-dupes --dry-run
+.EE
+
+.SS Advanced validation and cleaning:
+.EX
+node nwss.js --clean-rules rules.txt --remove-dupes
+node nwss.js --test-validation
+.EE
+
+.SS Multiple output formats:
+.EX
+node nwss.js -o hosts.txt --localhost --remove-dupes
+node nwss.js -o dnsmasq.conf --dnsmasq --titles
+node nwss.js -o pihole_regex.txt --pihole --debug
+.EE
+
+.SS Cloudflare bypass and fingerprint spoofing:
+.EX
+{
+  "url": "https://protected-site.com",
+  "filterRegex": "\\\\.(top|buzz)\\\\b",
+  "cloudflare_bypass": true,
+  "cloudflare_phish": true,
+  "fingerprint_protection": "random",
+  "isBrave": true,
+  "userAgent": "chrome"
+}
+.EE
+
+.SS FlowProxy protection handling:
+.EX
+{
+  "url": "https://flowproxy-protected.com",
+  "filterRegex": "\\\\.(space|website)\\\\b",
+  "flowproxy_detection": true,
+  "flowproxy_page_timeout": 45000,
+  "flowproxy_nav_timeout": 45000,
+  "flowproxy_js_timeout": 15000,
+  "flowproxy_delay": 30000,
+  "flowproxy_additional_delay": 5000
+}
+.EE
+
+.SH OUTPUT FORMATS
+
+The scanner supports multiple output formats for different blocking systems:
+
+.SS Standard Adblock Format
+Default format: \fB||domain.com^\fR
+.br
+Compatible with uBlock Origin, AdBlock Plus, and other browser ad blockers.
+
+.SS Privoxy Format
+Flag: \fB\--privoxy\fR
+.br
+Format: \fB{ +block } .domain.com\fR
+.br
+For use in Privoxy action files. The leading dot blocks domain and all subdomains.
+
+.SS Pi-hole Regex Format
+Flag: \fB\--pihole\fR
+.br
+Format: \fB(^|\\.)domain\\.com$\fR
+.br
+For Pi-hole regex filters. Blocks domain and subdomains at DNS level.
+
+.SS Hosts File Formats
+Flags: \fB\--localhost\fR, \fB\--localhost-0.0.0.0\fR
+.br
+Formats: \fB127.0.0.1 domain.com\fR, \fB0.0.0.0 domain.com\fR
+.br
+For system hosts files.
+
+.SS DNS Server Formats
+Flags: \fB\--dnsmasq\fR, \fB\--dnsmasq-old\fR, \fB\--unbound\fR
+.br
+For dnsmasq and Unbound DNS servers.
+
+.SS Plain Domain Format
+Flag: \fB\--plain\fR
+.br
+Format: \fBdomain.com\fR
+.br
+Simple domain list without formatting.
+
+.SH FILES
+
+.TP
+.B config.json
+Default configuration file containing scan targets and rules.
+
+.TP
+.B logs/
+Directory created for debug and matched URL logs when \fB\--debug\fR or \fB\--dumpurls\fR is used.
+
+.TP
+.B user.action
+Common Privoxy action file when using \fB\--privoxy\fR output.
+
+.SH DETECTION METHODS
+
+.SS URL Pattern Matching
+Uses regex patterns to identify suspicious domains and request URLs.
+
+.SS Content Analysis
+Downloads page content with curl and searches for malicious strings using JavaScript or grep.
+
+.SS Network Tools Integration
+Performs WHOIS and DNS lookups to identify suspicious domain registrations.
+
+.SS Browser-Based Analysis
+Uses Puppeteer to monitor network requests, analyze frames, and detect dynamic threats.
+
+.SS Resource Type Filtering
+Filters analysis by HTTP resource type (script, xhr, fetch, image, etc.).
+
+.SS Similarity-Based Filtering
+Automatically filters out domains similar to already-found domains or those in the ignore list, supporting 70+ international TLD formats.
+
+.SH SECURITY FEATURES
+
+.SS Fingerprint Spoofing
+Randomizes browser fingerprints to avoid detection by malicious sites.
+
+.SS Request Blocking
+Blocks suspicious requests during scanning to prevent malware execution.
+
+.SS Frame Isolation
+Safely analyzes iframe content without executing malicious scripts.
+
+.SS Cloudflare Bypass
+Automatically handles Cloudflare protection challenges.
+
+.SS FlowProxy Protection
+Detects and handles FlowProxy protection systems.
+
+.SS Intelligent Domain Filtering
+Advanced similarity algorithms prevent duplicate detection across international domains and variations.
+
+.SH EXIT STATUS
+.TP
+.B 0
+Success. All URLs processed successfully.
+.TP
+.B 1
+Error in configuration, file access, or critical failure.
+
+.SH BUGS
+Frame navigation errors may appear in debug output but do not affect detection functionality.
+
+Report bugs to the project repository or maintainer.
+
+.SH SEE ALSO
+.BR curl (1),
+.BR grep (1),
+.BR whois (1),
+.BR dig (1),
+.BR dnsmasq (8),
+.BR unbound (8),
+.BR privoxy (8)
+
+.SH AUTHORS
+Written for malware research and network security analysis.
+
+.SH COPYRIGHT
+Copyright (C) 2025 Free Software Foundation, Inc.
+This is free software; you can redistribute it and/or modify it under the
+terms of the GNU General Public License as published by the Free Software
+Foundation; either version 3 of the License, or (at your option) any later
+version.
+
+This program is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License along with
+this program. If not, see <https://www.gnu.org/licenses/>.