@akotliar/sitemap-qa 1.0.0-alpha.4 → 1.0.0-alpha.5

package/README.md CHANGED
@@ -11,6 +11,29 @@ Sitemap-QA is a command-line tool that automatically discovers, parses, and anal
11
11
 
12
12
  ---
13
13
 
14
+ ## 📑 Table of Contents
15
+
16
+ - [Why Sitemap-QA?](#-why-sitemap-qa)
17
+ - [Quick Start](#-quick-start)
18
+ - [Installation](#installation)
19
+ - [Basic Usage](#basic-usage)
20
+ - [Features](#-features)
21
+ - [Automatic Sitemap Discovery](#automatic-sitemap-discovery)
22
+ - [Risk Detection Patterns](#risk-detection-patterns)
23
+ - [Customizing Risks](#customizing-risks)
24
+ - [Output Formats](#output-formats)
25
+ - [CLI Commands](#-cli-commands)
26
+ - [analyze](#analyze-command)
27
+ - [init](#init-command)
28
+ - [Configuration](#-configuration)
29
+ - [Configuration Options](#configuration-options)
30
+
31
+ - [License](#-license)
32
+ - [Acknowledgments](#-acknowledgments)
33
+ - [Support](#-support)
34
+
35
+ ---
36
+
14
37
  ## 🎯 Why Sitemap-QA?
15
38
 
16
39
  Unlike SEO-focused sitemap validators, Sitemap-QA is designed specifically for **QA validation and risk detection** using a **Policy-as-Code** approach:
@@ -18,6 +41,8 @@ Unlike SEO-focused sitemap validators, Sitemap-QA is designed specifically for *
18
41
  - ✅ **Detect environment leakage** - Find staging, dev, or test URLs that shouldn't be in production sitemaps
19
42
  - ✅ **Identify exposed admin paths** - Catch `/admin`, `/dashboard`, and internal routes in public indexes
20
43
  - ✅ **Flag sensitive files** - Detect database backups, environment files, and archives
44
+ - ✅ **Domain Consistency** - Automatically flag URLs that point to external or incorrect domains (handles `www.` normalization)
45
+ - ✅ **Acceptable Patterns (Allowlist)** - Exclude known safe URLs from being flagged as risks
21
46
  - ✅ **Fully Customizable** - Define your own risk categories and patterns using Literal, Glob, or Regex matching
22
47
  - ✅ **Fast and automated** - Analyze thousands of URLs in seconds with detailed reports
23
48
 
@@ -37,11 +62,17 @@ npm install -g @akotliar/sitemap-qa@alpha
37
62
  ### Basic Usage
38
63
 
39
64
  ```bash
40
- # Analyze a website's sitemap
65
+ # Step 1: Initialize a configuration file (optional but recommended)
66
+ sitemap-qa init
67
+
68
+ # Step 2: Analyze a website's sitemap
41
69
  sitemap-qa analyze https://example.com
42
70
 
43
- # Generate JSON output for CI/CD
44
- sitemap-qa analyze https://example.com --output json > report.json
71
+ # Generate JSON output only for CI/CD
72
+ sitemap-qa analyze https://example.com --output json
73
+
74
+ # Use a custom configuration file
75
+ sitemap-qa analyze https://example.com --config ./custom-config.yaml
45
76
  ```
46
77
 
47
78
  ---
@@ -63,19 +94,11 @@ The tool comes with a set of default policies, but you can fully customize them
63
94
  | **Security & Admin** | Detects exposed administrative interfaces and sensitive configuration files. | `**/admin/**`, `**/.env*`, `/wp-admin` |
64
95
  | **Environment Leakage** | Finds staging or development URLs that shouldn't be in production sitemaps. | `**/staging.**`, `**/dev.**` |
65
96
  | **Sensitive Files** | Flags database backups, archives, and other sensitive file types. | `**/*.{sql,bak,zip,tar}`, `**/*.tar.gz` |
97
+ | **Domain Consistency** | Detects URLs that don't match the target domain (ignoring `www.` differences). | `example.com` vs `other.com` |
66
98
 
67
99
  ### Customizing Risks
68
100
 
69
- You can add your own categories and patterns to the `sitemap-qa.yaml` file. Patterns support `literal`, `glob`, and `regex` matching.
70
-
71
- ```yaml
72
- policies:
73
- - category: "Internal API"
74
- patterns:
75
- - type: "glob"
76
- value: "**/api/v1/internal/**"
77
- reason: "Internal API version 1 should not be exposed."
78
- ```
101
+ You can add your own categories and patterns to the `sitemap-qa.yaml` file. Patterns support `literal`, `glob`, and `regex` matching. See the [Configuration](#-configuration) section for details.
79
102
 
80
103
 
81
104
  ### Output Formats
@@ -97,7 +120,8 @@ The HTML report provides an interactive, visually appealing view with:
97
120
  "summary": {
98
121
  "totalUrls": 895,
99
122
  "totalRisks": 2,
100
- "urlsWithRisksCount": 1
123
+ "urlsWithRisksCount": 1,
124
+ "ignoredUrlsCount": 5
101
125
  },
102
126
  "findings": [
103
127
  {
@@ -117,7 +141,44 @@ The HTML report provides an interactive, visually appealing view with:
117
141
 
118
142
  ---
119
143
 
120
- ## 🛠️ CLI Options
144
+ ## 🛠️ CLI Commands
145
+
146
+ Sitemap-QA provides two main commands: `init` and `analyze`.
147
+
148
+
149
+ ### init Command
150
+
151
+ Initialize a default `sitemap-qa.yaml` configuration file in the current directory.
152
+
153
+ ```
154
+ Usage: sitemap-qa init [options]
155
+
156
+ Initialize a default sitemap-qa.yaml configuration file
157
+
158
+ Options:
159
+ -h, --help Display help for command
160
+ ```
161
+
162
+ #### Example
163
+
164
+ ```bash
165
+ # Create a default configuration file
166
+ sitemap-qa init
167
+
168
+ # This creates sitemap-qa.yaml with:
169
+ # - Default risk policies (Security & Admin, Environment Leakage, Sensitive Files)
170
+ # - Example acceptable patterns
171
+ # - Default output settings
172
+ ```
173
+
174
+ **Note:** The `init` command will fail if `sitemap-qa.yaml` already exists in the current directory to prevent accidental overwrites.
175
+
176
+ ---
177
+
178
+
179
+ ### analyze Command
180
+
181
+ Analyze a website's sitemap for quality issues and security risks.
121
182
 
122
183
  ```
123
184
  Usage: sitemap-qa analyze <url> [options]
@@ -128,13 +189,13 @@ Arguments:
128
189
  url Base URL of the website to analyze
129
190
 
130
191
  Options:
131
- -c, --config <path> Path to sitemap-qa.yaml
192
+ -c, --config <path> Path to sitemap-qa.yaml configuration file
132
193
  -o, --output <format> Output format: json, html, or all (default: "all")
133
194
  -d, --out-dir <path> Output directory for reports (default: ".")
134
195
  -h, --help Display help for command
135
196
  ```
136
197
 
137
- ### Examples
198
+ #### Examples
138
199
 
139
200
  ```bash
140
201
  # Basic analysis with both HTML and JSON reports (default)
@@ -143,14 +204,18 @@ sitemap-qa analyze https://example.com
143
204
  # JSON output only
144
205
  sitemap-qa analyze https://example.com --output json
145
206
 
207
+ # HTML output only
208
+ sitemap-qa analyze https://example.com --output html
209
+
146
210
  # Custom output directory
147
211
  sitemap-qa analyze https://example.com --out-dir ./reports
148
212
 
149
213
  # Use a specific configuration file
150
214
  sitemap-qa analyze https://example.com --config ./custom-config.yaml
151
- ```
152
215
 
153
- ---
216
+ # Combine options
217
+ sitemap-qa analyze https://example.com --config ./custom-config.yaml --output json --out-dir ./reports
218
+ ```
154
219
 
155
220
  ## 🔧 Configuration
156
221
 
@@ -161,8 +226,17 @@ Create a `sitemap-qa.yaml` file in your project root to define your monitoring p
161
226
  # Default outDir is "."; this example uses a custom reports directory
162
227
  outDir: "./sitemap-qa/report" # custom output directory
163
228
  outputFormat: "all" # Options: json, html, all
229
+ enforceDomainConsistency: true # Flag URLs from other domains
164
230
 
165
231
  # Monitoring Policies
232
+ acceptable_patterns:
233
+ - type: "literal"
234
+ value: "/acceptable-path"
235
+ reason: "Example of an acceptable path that should not be flagged."
236
+ - type: "glob"
237
+ value: "**/public-docs/**"
238
+ reason: "Public documentation is always acceptable."
239
+
166
240
  policies:
167
241
  - category: "Security & Admin"
168
242
  patterns:
@@ -183,35 +257,16 @@ policies:
183
257
  |--------|------|---------|-------------|
184
258
  | `outDir` | string | `"."` | Directory for generated reports (current working directory by default) |
185
259
  | `outputFormat` | string | `"all"` | Report types to generate: `json`, `html`, or `all` |
260
+ | `enforceDomainConsistency` | boolean | `true` | If true, flags URLs that don't match the root sitemap domain (ignoring `www.`) |
261
+ | `acceptable_patterns` | array | `[]` | List of patterns to exclude from risk analysis |
186
262
  | `policies` | array | `[]` | List of monitoring policies with patterns |
187
263
 
188
- > Note: The earlier `sitemap-qa.yaml` example sets `outDir: "./sitemap-qa/report"` as a recommended path. If you omit `outDir`, the default is `"."` (the current working directory).
189
- ### Policy Patterns
190
264
 
191
- Define patterns to detect risks in your sitemaps:
265
+ **Priority:** CLI options > Project config (`sitemap-qa.yaml`) > Defaults
192
266
 
193
- ```yaml
194
- policies:
195
- - category: "Custom Rules"
196
- patterns:
197
- - type: "literal"
198
- value: "test"
199
- reason: "Test URL found"
200
- - type: "glob"
201
- value: "**/internal/*"
202
- reason: "Internal path exposed"
203
- - type: "regex"
204
- value: "api/v[0-9]/"
205
- reason: "API versioning detected"
206
- ```
207
267
 
208
- **Rule Types:**
209
- - `literal`: Exact string match
210
- - `glob`: Wildcard patterns (e.g., `**/admin/**`)
211
- - `regex`: Regular expression matching (patterns are YAML strings and must use proper escaping)
212
- - When defining regex patterns in `sitemap-qa.yaml`, remember they are YAML strings, so you must escape backslashes (for example, `".*\\\\.php$"` in YAML corresponds to the regex `.*\.php$`).
213
268
 
214
- **Priority:** CLI options > Project config (`sitemap-qa.yaml`) > Defaults
269
+ ---
215
270
 
216
271
  ## 📝 License
217
272
 
@@ -235,7 +290,7 @@ Built with:
235
290
 
236
291
  ## 📧 Support
237
292
 
238
- - **Issues**: [GitHub Issues](https://github.com/akotliar/sitemap-qa/issues)-
293
+ - **Issues**: [GitHub Issues](https://github.com/akotliar/sitemap-qa/issues)
239
294
 
240
295
  ---
241
296
 
package/dist/index.js CHANGED
@@ -27,13 +27,18 @@ var PolicySchema = z.object({
27
27
  patterns: z.array(PatternSchema).min(1, "At least one pattern is required per category")
28
28
  });
29
29
  var ConfigSchema = z.object({
30
+ acceptable_patterns: z.array(PatternSchema).default([]),
30
31
  policies: z.array(PolicySchema).default([]),
31
32
  outDir: z.string().optional(),
32
- outputFormat: z.enum(["json", "html", "all"]).default("all")
33
+ outputFormat: z.enum(["json", "html", "all"]).default("all"),
34
+ enforceDomainConsistency: z.boolean().default(true)
33
35
  });
34
36
 
35
37
  // src/config/defaults.ts
36
38
  var DEFAULT_POLICIES = {
39
+ acceptable_patterns: [],
40
+ outputFormat: "all",
41
+ enforceDomainConsistency: true,
37
42
  policies: [
38
43
  {
39
44
  category: "Security & Admin",
@@ -125,6 +130,7 @@ var ConfigLoader = class {
125
130
  });
126
131
  const merged = {
127
132
  ...defaults,
133
+ acceptable_patterns: [...defaults.acceptable_patterns || [], ...user.acceptable_patterns || []],
128
134
  policies: mergedPolicies
129
135
  };
130
136
  if (user.outDir !== void 0) {
@@ -133,6 +139,9 @@ var ConfigLoader = class {
133
139
  if (user.outputFormat !== void 0) {
134
140
  merged.outputFormat = user.outputFormat;
135
141
  }
142
+ if (user.enforceDomainConsistency !== void 0) {
143
+ merged.enforceDomainConsistency = user.enforceDomainConsistency;
144
+ }
136
145
  return merged;
137
146
  }
138
147
  };
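Reviewer note: the two `ConfigLoader` hunks above change the merge in two different ways. `acceptable_patterns` is concatenated (defaults first, then user entries), while `enforceDomainConsistency` overrides the default only when the user sets it, so an explicit `false` is honoured. The following is a minimal sketch of that behaviour, assuming a simplified config shape; `mergeConfigs` here is a standalone stand-in, not the package's exported API.

```javascript
// Simplified stand-in for the merge shown in the diff; not the package's exported API.
function mergeConfigs(defaults, user) {
  // Policies: a user category with the same name replaces the default one.
  const mergedPolicies = [...defaults.policies];
  for (const userPolicy of user.policies ?? []) {
    const index = mergedPolicies.findIndex((p) => p.category === userPolicy.category);
    if (index !== -1) {
      mergedPolicies[index] = userPolicy;
    } else {
      mergedPolicies.push(userPolicy);
    }
  }

  const merged = {
    ...defaults,
    // Allowlist entries are concatenated: defaults first, then user entries.
    acceptable_patterns: [
      ...(defaults.acceptable_patterns ?? []),
      ...(user.acceptable_patterns ?? []),
    ],
    policies: mergedPolicies,
  };

  // Scalar options replace the default only when explicitly provided,
  // which is why the check is "!== undefined" rather than a truthiness test.
  for (const key of ["outDir", "outputFormat", "enforceDomainConsistency"]) {
    if (user[key] !== undefined) {
      merged[key] = user[key];
    }
  }
  return merged;
}

const defaults = {
  acceptable_patterns: [],
  outputFormat: "all",
  enforceDomainConsistency: true,
  policies: [
    {
      category: "Security & Admin",
      patterns: [
        { type: "glob", value: "**/admin/**", reason: "Administrative interfaces should not be publicly indexed." },
      ],
    },
  ],
};

const user = {
  acceptable_patterns: [
    { type: "glob", value: "**/public-docs/**", reason: "Public documentation is always acceptable." },
  ],
  enforceDomainConsistency: false,
  policies: [],
};

const merged = mergeConfigs(defaults, user);
console.log(merged.enforceDomainConsistency, merged.acceptable_patterns.length, merged.outputFormat);
// prints: false 1 all
```

A truthiness check such as `if (user.enforceDomainConsistency)` would silently ignore a user-supplied `false`, which is presumably why the diff compares against `void 0`.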
@@ -191,6 +200,7 @@ var DiscoveryService = class {
191
200
  }
192
201
  /**
193
202
  * Recursively discovers all leaf sitemaps from a root URL.
203
+ * Returns both the sitemap URL and its XML data to avoid duplicate fetches.
194
204
  */
195
205
  async *discover(rootUrl) {
196
206
  const queue = [rootUrl];
@@ -211,7 +221,7 @@ var DiscoveryService = class {
211
221
  }
212
222
  }
213
223
  } else if (jsonObj.urlset) {
214
- yield currentUrl;
224
+ yield { url: currentUrl, xmlData };
215
225
  }
216
226
  } catch (error) {
217
227
  console.error(`Failed to fetch or parse sitemap at ${currentUrl}:`, error);
@@ -233,15 +243,22 @@ var SitemapParser = class {
233
243
  }
234
244
  /**
235
245
  * Parses a leaf sitemap and yields SitemapUrl objects.
246
+ * Can accept either a URL to fetch or pre-fetched XML data with the source URL.
236
247
  * Note: For true streaming of massive files, we'd use a SAX-like approach.
237
248
  * fast-xml-parser's parse() is fast but loads the whole string.
238
249
  * Given the 50k URL requirement, we'll use a more memory-efficient approach if needed,
239
250
  * but let's start with a clean AsyncGenerator interface.
240
251
  */
241
- async *parse(sitemapUrl) {
252
+ async *parse(sitemapUrlOrData) {
253
+ let sitemapUrl = typeof sitemapUrlOrData === "string" ? sitemapUrlOrData : sitemapUrlOrData.url;
242
254
  try {
243
- const response = await fetch2(sitemapUrl);
244
- const xmlData = await response.text();
255
+ let xmlData;
256
+ if (typeof sitemapUrlOrData === "string") {
257
+ const response = await fetch2(sitemapUrl);
258
+ xmlData = await response.text();
259
+ } else {
260
+ xmlData = sitemapUrlOrData.xmlData;
261
+ }
245
262
  const jsonObj = this.parser.parse(xmlData);
246
263
  if (jsonObj.urlset && jsonObj.urlset.url) {
247
264
  const urls = Array.isArray(jsonObj.urlset.url) ? jsonObj.urlset.url : [jsonObj.urlset.url];
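Reviewer note: `parse()` now accepts either a URL string or the `{ url, xmlData }` object yielded by `discover()`, so a leaf sitemap that discovery already downloaded is not fetched a second time. A minimal sketch of that input handling follows. The helper name `resolveSitemapInput` and the injected `fetchText` function are hypothetical and exist only so the example runs offline; the package itself calls `fetch2` inline.

```javascript
// Hypothetical helper mirroring the string-or-object branch in parse().
// fetchText is an injected function (url) => Promise<string>, so the sketch needs no network.
async function resolveSitemapInput(sitemapUrlOrData, fetchText) {
  if (typeof sitemapUrlOrData === "string") {
    // Only a URL was given: the XML has to be fetched now.
    return { url: sitemapUrlOrData, xmlData: await fetchText(sitemapUrlOrData) };
  }
  // Discovery already fetched the XML: reuse it and skip the network call.
  return { url: sitemapUrlOrData.url, xmlData: sitemapUrlOrData.xmlData };
}

// Fake fetcher that counts how often it is called.
let fetchCount = 0;
const fakeFetchText = async (url) => {
  fetchCount += 1;
  return `<urlset><url><loc>${url}/page</loc></url></urlset>`;
};

const demo = (async () => {
  const fromUrl = await resolveSitemapInput("https://example.com/sitemap.xml", fakeFetchText);
  const fromData = await resolveSitemapInput(
    { url: "https://example.com/sitemap.xml", xmlData: "<urlset></urlset>" },
    fakeFetchText
  );
  return { fromUrl, fromData, fetchCount };
})();

// Only the string input triggers a fetch.
demo.then((result) => console.log(`fetches: ${result.fetchCount}`));
```

Keeping the string form working means existing callers that pass a plain URL are unaffected by the signature change.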
@@ -308,9 +325,9 @@ var ExtractorService = class {
308
325
  }
309
326
  }
310
327
  for (const startUrl of startUrls) {
311
- for await (const sitemapUrl of this.discovery.discover(startUrl)) {
312
- this.discoveredSitemaps.add(sitemapUrl);
313
- for await (const urlObj of this.parser.parse(sitemapUrl)) {
328
+ for await (const discovered of this.discovery.discover(startUrl)) {
329
+ this.discoveredSitemaps.add(discovered.url);
330
+ for await (const urlObj of this.parser.parse(discovered)) {
314
331
  const normalized = this.normalizeUrl(urlObj.loc);
315
332
  if (!this.seenUrls.has(normalized)) {
316
333
  this.seenUrls.add(normalized);
@@ -326,14 +343,42 @@ var ExtractorService = class {
326
343
  import micromatch from "micromatch";
327
344
  var MatcherService = class {
328
345
  config;
329
- constructor(config) {
346
+ rootDomain;
347
+ constructor(config, rootUrl) {
330
348
  this.config = config;
349
+ if (rootUrl) {
350
+ try {
351
+ this.rootDomain = new URL(rootUrl).hostname.replace(/^www\./, "");
352
+ } catch {
353
+ }
354
+ }
331
355
  }
332
356
  /**
333
357
  * Matches a URL against all policies and returns detected risks.
334
358
  */
335
359
  match(urlObj) {
336
360
  const risks = [];
361
+ if (this.config.enforceDomainConsistency && this.rootDomain) {
362
+ try {
363
+ const currentDomain = new URL(urlObj.loc).hostname.replace(/^www\./, "");
364
+ if (currentDomain !== this.rootDomain) {
365
+ risks.push({
366
+ category: "Domain Consistency",
367
+ pattern: this.rootDomain,
368
+ type: "literal",
369
+ reason: `URL domain mismatch: expected ${this.rootDomain} (or www.${this.rootDomain}), but found ${currentDomain}.`
370
+ });
371
+ }
372
+ } catch {
373
+ }
374
+ }
375
+ for (const pattern of this.config.acceptable_patterns) {
376
+ if (this.isMatch(urlObj.loc, pattern)) {
377
+ urlObj.ignored = true;
378
+ urlObj.ignoredBy = pattern.reason;
379
+ return risks;
380
+ }
381
+ }
337
382
  for (const policy of this.config.policies) {
338
383
  for (const pattern of policy.patterns) {
339
384
  if (this.isMatch(urlObj.loc, pattern)) {
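Reviewer note: the new domain check compares hostnames after stripping a leading `www.`, and silently skips URLs that fail to parse. A minimal standalone sketch of that comparison follows. The function names `normalizeHost` and `checkDomain` are hypothetical; the risk object fields are copied from the hunk above.

```javascript
// Strip a leading "www." so example.com and www.example.com compare equal.
function normalizeHost(url) {
  return new URL(url).hostname.replace(/^www\./, "");
}

// Hypothetical standalone version of the domain check.
// Returns a risk object on mismatch, or null when the domains agree or a URL is unparseable.
function checkDomain(rootUrl, loc) {
  let rootDomain;
  let currentDomain;
  try {
    rootDomain = normalizeHost(rootUrl);
    currentDomain = normalizeHost(loc);
  } catch {
    return null; // invalid URLs are skipped rather than flagged, as in the diff
  }
  if (currentDomain === rootDomain) {
    return null;
  }
  return {
    category: "Domain Consistency",
    pattern: rootDomain,
    type: "literal",
    reason: `URL domain mismatch: expected ${rootDomain} (or www.${rootDomain}), but found ${currentDomain}.`,
  };
}

console.log(checkDomain("https://www.example.com", "https://example.com/about"));
console.log(checkDomain("https://example.com", "https://staging.example.com/about"));
```

Two consequences are worth checking in review. Only `www.` is normalized, so any other subdomain counts as a mismatch. Also, in the hunk above the domain check runs before the `acceptable_patterns` loop, and that loop returns the risks collected so far, so an allowlisted URL on a foreign domain still appears to carry a Domain Consistency risk.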
@@ -375,6 +420,7 @@ var ConsoleReporter = class {
375
420
  console.log(`Total URLs Scanned: ${data.totalUrls}`);
376
421
  console.log(`Total Risks Found: ${data.totalRisks > 0 ? chalk2.red(data.totalRisks) : chalk2.green(0)}`);
377
422
  console.log(`URLs with Risks: ${data.urlsWithRisks.length}`);
423
+ console.log(`URLs Ignored: ${data.ignoredUrls.length > 0 ? chalk2.yellow(data.ignoredUrls.length) : 0}`);
378
424
  console.log(`Duration: ${((data.endTime.getTime() - data.startTime.getTime()) / 1e3).toFixed(2)}s`);
379
425
  if (data.urlsWithRisks.length > 0) {
380
426
  console.log("\n" + chalk2.bold.yellow("Top Findings:"));
@@ -410,9 +456,11 @@ var JsonReporter = class {
410
456
  summary: {
411
457
  totalUrls: data.totalUrls,
412
458
  totalRisks: data.totalRisks,
413
- urlsWithRisksCount: data.urlsWithRisks.length
459
+ urlsWithRisksCount: data.urlsWithRisks.length,
460
+ ignoredUrlsCount: data.ignoredUrls.length
414
461
  },
415
- findings: data.urlsWithRisks
462
+ findings: data.urlsWithRisks,
463
+ ignored: data.ignoredUrls
416
464
  };
417
465
  await fs2.writeFile(this.outputPath, JSON.stringify(report, null, 2), "utf8");
418
466
  console.log(`JSON report generated at ${this.outputPath}`);
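Reviewer note: the JSON report gains `summary.ignoredUrlsCount` and a top-level `ignored` array. Consumers that also read reports written by alpha.4 may want to treat the new field as optional. A minimal sketch of such a consumer follows, assuming only the field names visible in this diff; `summarizeReport` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical CI helper that reads the report object written by JsonReporter.
function summarizeReport(report) {
  // ignoredUrlsCount is new in alpha.5, so default it for older reports.
  const { totalUrls, totalRisks, urlsWithRisksCount, ignoredUrlsCount = 0 } = report.summary;
  return {
    ok: totalRisks === 0,
    line: `${totalUrls} URLs scanned, ${totalRisks} risks on ${urlsWithRisksCount} URLs, ${ignoredUrlsCount} ignored`,
  };
}

// Sample values taken from the README example in this diff.
const sample = {
  summary: { totalUrls: 895, totalRisks: 2, urlsWithRisksCount: 1, ignoredUrlsCount: 5 },
  findings: [],
  ignored: [],
};

const result = summarizeReport(sample);
console.log(result.line);
// prints: 895 URLs scanned, 2 risks on 1 URLs, 5 ignored
```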
@@ -453,13 +501,14 @@ var HtmlReporter = class {
453
501
  generateHtml(data, categories) {
454
502
  const duration = ((data.endTime.getTime() - data.startTime.getTime()) / 1e3).toFixed(1);
455
503
  const timestamp = data.endTime.toLocaleString();
504
+ const esc = this.escapeHtml.bind(this);
456
505
  return `
457
506
  <!DOCTYPE html>
458
507
  <html lang="en">
459
508
  <head>
460
509
  <meta charset="UTF-8">
461
510
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
462
- <title>Sitemap Analysis - ${data.rootUrl}</title>
511
+ <title>Sitemap Analysis - ${esc(data.rootUrl)}</title>
463
512
  <style>
464
513
  :root {
465
514
  --bg-dark: #0f172a;
@@ -495,7 +544,7 @@ var HtmlReporter = class {
495
544
 
496
545
  .summary-grid {
497
546
  display: grid;
498
- grid-template-columns: repeat(4, 1fr);
547
+ grid-template-columns: repeat(5, 1fr);
499
548
  border-bottom: 1px solid var(--border);
500
549
  margin-bottom: 40px;
501
550
  }
@@ -647,8 +696,8 @@ var HtmlReporter = class {
647
696
  <div class="container">
648
697
  <h1>Sitemap Analysis</h1>
649
698
  <div class="meta">
650
- <div>${data.rootUrl}</div>
651
- <div>${timestamp}</div>
699
+ <div>${esc(data.rootUrl)}</div>
700
+ <div>${esc(timestamp)}</div>
652
701
  </div>
653
702
  </div>
654
703
  </header>
@@ -666,6 +715,10 @@ var HtmlReporter = class {
666
715
  <h3>Issues Found</h3>
667
716
  <p>${data.totalRisks}</p>
668
717
  </div>
718
+ <div class="summary-card">
719
+ <h3>URLs Ignored</h3>
720
+ <p>${data.ignoredUrls.length}</p>
721
+ </div>
669
722
  <div class="summary-card">
670
723
  <h3>Scan Time</h3>
671
724
  <p>${duration}s</p>
@@ -676,37 +729,50 @@ var HtmlReporter = class {
676
729
  <details>
677
730
  <summary>Sitemaps Discovered (${data.discoveredSitemaps.length})</summary>
678
731
  <div style="padding: 20px; background: var(--bg-light);">
679
- ${data.discoveredSitemaps.map((s) => `<div class="url-item">${s}</div>`).join("")}
732
+ ${data.discoveredSitemaps.map((s) => `<div class="url-item">${esc(s)}</div>`).join("")}
733
+ </div>
734
+ </details>
735
+
736
+ ${data.ignoredUrls.length > 0 ? `
737
+ <details>
738
+ <summary>Ignored URLs (${data.ignoredUrls.length})</summary>
739
+ <div style="padding: 20px; background: var(--bg-light);">
740
+ ${data.ignoredUrls.map((u) => {
741
+ const suppressedRisks = u.risks.length > 0 ? ` <span style="color: var(--danger); font-size: 11px; font-weight: bold;">[Suppressed Risks: ${[...new Set(u.risks.map((r) => r.category))].map(esc).join(", ")}]</span>` : "";
742
+ const ignoredBy = u.ignoredBy ?? "Unknown";
743
+ return `<div class="url-item" title="Ignored by: ${esc(ignoredBy)}">${esc(u.loc)} <span style="color: var(--text-muted); font-size: 11px;">(by ${esc(ignoredBy)})</span>${suppressedRisks}</div>`;
744
+ }).join("")}
680
745
  </div>
681
746
  </details>
747
+ ` : ""}
682
748
 
683
749
  ${Object.entries(categories).map(([category, findings]) => {
684
750
  const totalCategoryUrls = Object.values(findings).reduce((acc, f) => acc + f.urls.length, 0);
685
751
  return `
686
752
  <div class="category-section">
687
753
  <div class="category-header">
688
- <span>${category} (${totalCategoryUrls} URLs)</span>
754
+ <span>${esc(category)} (${totalCategoryUrls} URLs)</span>
689
755
  <span>\u25BC</span>
690
756
  </div>
691
757
  <div class="category-content">
692
758
  ${Object.entries(findings).map(([pattern, finding]) => `
693
759
  <div class="finding-group">
694
760
  <div class="finding-header">
695
- <h4>${pattern}</h4>
761
+ <h4>${esc(pattern)}</h4>
696
762
  <span class="badge">${finding.urls.length} URLs</span>
697
763
  </div>
698
764
  <div class="finding-description">
699
- ${finding.reason}
765
+ ${esc(finding.reason)}
700
766
  </div>
701
767
  <div class="url-list">
702
768
  ${finding.urls.slice(0, 3).map((url) => `
703
- <div class="url-item">${url}</div>
769
+ <div class="url-item">${esc(url)}</div>
704
770
  `).join("")}
705
771
  </div>
706
772
  ${finding.urls.length > 3 ? `
707
773
  <div class="more-count">... and ${finding.urls.length - 3} more</div>
708
774
  ` : ""}
709
- <a href="#" class="btn" onclick="downloadUrls('${pattern}', ${JSON.stringify(finding.urls).replace(/"/g, "&quot;")})">
775
+ <a href="#" class="btn" onclick="downloadUrls(${JSON.stringify(pattern).replace(/"/g, "&quot;")}, ${JSON.stringify(finding.urls).replace(/"/g, "&quot;")})">
710
776
  \u{1F4E5} Download All ${finding.urls.length} URLs
711
777
  </a>
712
778
  </div>
@@ -738,6 +804,9 @@ var HtmlReporter = class {
738
804
  </html>
739
805
  `;
740
806
  }
807
+ escapeHtml(str) {
808
+ return str.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;").replace(/'/g, "&#039;");
809
+ }
741
810
  };
742
811
 
743
812
  // src/commands/analyze.ts
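Reviewer note: the `HtmlReporter` hunks route every interpolated value (root URL, sitemap URLs, categories, patterns, reasons) through the new `escapeHtml` method. Since sitemap URLs and config strings are untrusted input to the report, this closes an HTML injection path. The sketch below reproduces the method as a standalone function to show the effect; the sample `pattern` value is invented for illustration.

```javascript
// Standalone copy of the escapeHtml logic added in the diff.
function escapeHtml(str) {
  return str
    .replace(/&/g, "&amp;") // must run first so the entities added below are not escaped again
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

// Invented example of a hostile value that could arrive via a sitemap or config file.
const pattern = '**/<script>alert("x")</script>';
const html = `<h4>${escapeHtml(pattern)}</h4>`;
console.log(html);
// prints: <h4>**/&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;</h4>
```

Note that the method calls `str.replace` directly, so it assumes a string argument; the diff guards the one nullable value it passes (`u.ignoredBy ?? "Unknown"`) before escaping.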
@@ -747,12 +816,13 @@ var analyzeCommand = new Command("analyze").description("Analyze a sitemap for p
747
816
  const outDir = options.outDir || config.outDir || ".";
748
817
  const outputFormat = options.output || config.outputFormat || "all";
749
818
  const extractor = new ExtractorService();
750
- const matcher = new MatcherService(config);
819
+ const matcher = new MatcherService(config, url);
751
820
  const urlsWithRisks = [];
821
+ const ignoredUrls = [];
752
822
  let totalUrls = 0;
753
823
  let totalRisks = 0;
754
824
  console.log(chalk3.blue(`
755
- \uFFFD\uFFFD\uFFFD Starting analysis of ${url}...`));
825
+ \u{1F680} Starting analysis of ${url}...`));
756
826
  try {
757
827
  for await (const urlObj of extractor.extract(url)) {
758
828
  totalUrls++;
@@ -761,6 +831,8 @@ var analyzeCommand = new Command("analyze").description("Analyze a sitemap for p
761
831
  urlObj.risks = risks;
762
832
  urlsWithRisks.push(urlObj);
763
833
  totalRisks += risks.length;
834
+ } else if (urlObj.ignored) {
835
+ ignoredUrls.push(urlObj);
764
836
  }
765
837
  if (totalUrls % 100 === 0) {
766
838
  process.stdout.write(chalk3.gray(`\rProcessed ${totalUrls} URLs...`));
@@ -774,6 +846,7 @@ var analyzeCommand = new Command("analyze").description("Analyze a sitemap for p
774
846
  totalUrls,
775
847
  totalRisks,
776
848
  urlsWithRisks,
849
+ ignoredUrls,
777
850
  startTime,
778
851
  endTime
779
852
  };
@@ -809,6 +882,11 @@ import chalk4 from "chalk";
809
882
  var DEFAULT_CONFIG = `# sitemap-qa configuration
810
883
  # This file defines the risk categories and patterns to monitor.
811
884
 
885
+ # Tool Settings
886
+ outDir: "./sitemap-qa/report"
887
+ outputFormat: "all" # Options: json, html, all
888
+ enforceDomainConsistency: true
889
+
812
890
  # Risk Categories
813
891
  # Each category contains a list of patterns to match against URLs found in sitemaps.
814
892
  # Patterns can be:
@@ -816,6 +894,16 @@ var DEFAULT_CONFIG = `# sitemap-qa configuration
816
894
  # - glob: Glob pattern (e.g., **/admin/**)
817
895
  # - regex: Regular expression (e.g., /\\/v[0-9]+\\//)
818
896
 
897
+ # Acceptable Patterns
898
+ # URLs matching these patterns will be ignored and not flagged as risks.
899
+ acceptable_patterns:
900
+ - type: "literal"
901
+ value: "/acceptable-path"
902
+ reason: "Example of an acceptable path that should not be flagged."
903
+ - type: "glob"
904
+ value: "**/public-docs/**"
905
+ reason: "Public documentation is always acceptable."
906
+
819
907
  policies:
820
908
  - category: "Security & Admin"
821
909
  patterns:
package/dist/index.js.map CHANGED
@@ -1 +1 @@
1
- {"version":3,"sources":["../src/index.ts","../src/commands/analyze.ts","../src/config/loader.ts","../src/config/schema.ts","../src/config/defaults.ts","../src/core/discovery.ts","../src/core/parser.ts","../src/core/extractor.ts","../src/core/matcher.ts","../src/reporters/console-reporter.ts","../src/reporters/json-reporter.ts","../src/reporters/html-reporter.ts","../src/commands/init.ts"],"sourcesContent":["#!/usr/bin/env node\r\nimport { Command } from 'commander';\r\nimport { analyzeCommand } from '@/commands/analyze';\r\nimport { initCommand } from '@/commands/init';\r\n\r\nconst program = new Command();\r\n\r\nprogram\r\n .name('sitemap-qa')\r\n .version('1.0.0')\r\n .description('sitemap analysis for QA teams');\r\n\r\nprogram.addCommand(analyzeCommand);\r\nprogram.addCommand(initCommand);\r\n\r\n// Global error handler\r\nprocess.on('unhandledRejection', (reason, promise) => {\r\n console.error('Unhandled Rejection at:', promise, 'reason:', reason);\r\n process.exit(1);\r\n});\r\n\r\n// Graceful shutdown handlers\r\nprocess.on('SIGINT', () => {\r\n console.log('\\nGracefully shutting down...');\r\n process.exit(0);\r\n});\r\n\r\nprocess.on('SIGTERM', () => {\r\n console.log('\\nGracefully shutting down...');\r\n process.exit(0);\r\n});\r\n\r\nprogram.parse();\r\n","import { Command } from 'commander';\r\nimport chalk from 'chalk';\r\nimport path from 'node:path';\r\nimport fs from 'node:fs/promises';\r\nimport { ConfigLoader } from '../config/loader';\r\nimport { ExtractorService } from '../core/extractor';\r\nimport { MatcherService } from '../core/matcher';\r\nimport { ConsoleReporter } from '../reporters/console-reporter';\r\nimport { JsonReporter } from '../reporters/json-reporter';\r\nimport { HtmlReporter } from '../reporters/html-reporter';\r\nimport { ReportData, Reporter } from '../reporters/base';\r\nimport { SitemapUrl } from '../types/sitemap';\r\n\r\nexport const analyzeCommand = new Command('analyze')\r\n .description('Analyze a sitemap for 
potential risks')\r\n .argument('<url>', 'Root sitemap URL')\r\n .option('-c, --config <path>', 'Path to sitemap-qa.yaml')\r\n .option('-o, --output <format>', 'Output format (json, html, all)')\r\n .option('-d, --out-dir <path>', 'Directory to save reports')\r\n .action(async (url: string, options: { config?: string; output?: string; outDir?: string }) => {\r\n const startTime = new Date();\r\n \r\n // 1. Load Config\r\n const config = ConfigLoader.load(options.config);\r\n const outDir = options.outDir || config.outDir || '.';\r\n const outputFormat = options.output || config.outputFormat || 'all';\r\n \r\n // 2. Initialize Services\r\n const extractor = new ExtractorService();\r\n const matcher = new MatcherService(config);\r\n \r\n const urlsWithRisks: SitemapUrl[] = [];\r\n let totalUrls = 0;\r\n let totalRisks = 0;\r\n\r\n console.log(chalk.blue(`\\n��� Starting analysis of ${url}...`));\r\n\r\n try {\r\n // 3. Pipeline: Extract -> Match\r\n for await (const urlObj of extractor.extract(url)) {\r\n totalUrls++;\r\n const risks = matcher.match(urlObj);\r\n \r\n if (risks.length > 0) {\r\n urlObj.risks = risks;\r\n urlsWithRisks.push(urlObj);\r\n totalRisks += risks.length;\r\n }\r\n\r\n if (totalUrls % 100 === 0) {\r\n process.stdout.write(chalk.gray(`\\rProcessed ${totalUrls} URLs...`));\r\n }\r\n }\r\n process.stdout.write('\\n');\r\n\r\n const endTime = new Date();\r\n const reportData: ReportData = {\r\n rootUrl: url,\r\n discoveredSitemaps: extractor.getDiscoveredSitemaps(),\r\n totalUrls,\r\n totalRisks,\r\n urlsWithRisks,\r\n startTime,\r\n endTime,\r\n };\r\n\r\n // 4. 
Reporting\r\n const reporters: Reporter[] = [new ConsoleReporter()];\r\n \r\n await fs.mkdir(outDir, { recursive: true });\r\n\r\n if (outputFormat === 'json' || outputFormat === 'all') {\r\n const jsonPath = path.join(outDir, 'sitemap-qa-report.json');\r\n reporters.push(new JsonReporter(jsonPath));\r\n }\r\n if (outputFormat === 'html' || outputFormat === 'all') {\r\n const htmlPath = path.join(outDir, 'sitemap-qa-report.html');\r\n reporters.push(new HtmlReporter(htmlPath));\r\n }\r\n\r\n for (const reporter of reporters) {\r\n await reporter.generate(reportData);\r\n }\r\n\r\n // 5. Exit Code\r\n if (totalRisks > 0) {\r\n process.exit(1);\r\n } else {\r\n process.exit(0);\r\n }\r\n\r\n } catch (error) {\r\n console.error(chalk.red('\\nAnalysis failed:'), error);\r\n process.exit(1);\r\n }\r\n });\r\n","import fs from 'node:fs';\r\nimport path from 'node:path';\r\nimport yaml from 'js-yaml';\r\nimport { ConfigSchema, type Config } from './schema';\r\nimport { DEFAULT_POLICIES } from './defaults';\r\nimport chalk from 'chalk';\r\n\r\nexport class ConfigLoader {\r\n private static readonly DEFAULT_CONFIG_PATH = 'sitemap-qa.yaml';\r\n\r\n static load(configPath?: string): Config {\r\n const targetPath = configPath || path.join(process.cwd(), this.DEFAULT_CONFIG_PATH);\r\n let userConfig: Config = { policies: [] };\r\n\r\n // Load YAML config\r\n if (fs.existsSync(targetPath)) {\r\n try {\r\n const fileContent = fs.readFileSync(targetPath, 'utf8');\r\n const parsedYaml = yaml.load(fileContent);\r\n \r\n const result = ConfigSchema.safeParse(parsedYaml);\r\n \r\n if (!result.success) {\r\n console.error(chalk.red('Configuration Validation Error:'));\r\n result.error.issues.forEach((issue) => {\r\n console.error(chalk.yellow(` - ${issue.path.join('.')}: ${issue.message}`));\r\n });\r\n process.exit(2);\r\n }\r\n\r\n userConfig = result.data;\r\n } catch (error) {\r\n console.error(chalk.red('Failed to load configuration:'), error);\r\n process.exit(2);\r\n }\r\n } 
else if (configPath) {\r\n console.error(chalk.red(`Error: Configuration file not found at ${targetPath}`));\r\n process.exit(2);\r\n }\r\n\r\n return this.mergeConfigs(DEFAULT_POLICIES, userConfig);\r\n }\r\n\r\n private static mergeConfigs(defaults: Config, user: Config): Config {\r\n const mergedPolicies = [...defaults.policies];\r\n\r\n user.policies.forEach((userPolicy) => {\r\n const index = mergedPolicies.findIndex(p => p.category === userPolicy.category);\r\n if (index !== -1) {\r\n // Replace default category with user category (precedence)\r\n mergedPolicies[index] = userPolicy;\r\n } else {\r\n // Add new user category\r\n mergedPolicies.push(userPolicy);\r\n }\r\n });\r\n\r\n // Start from defaults, then apply merged policies and any user-specified top-level options\r\n const merged: Config = {\r\n ...defaults,\r\n policies: mergedPolicies,\r\n };\r\n\r\n if (user.outDir !== undefined) {\r\n merged.outDir = user.outDir;\r\n }\r\n\r\n if (user.outputFormat !== undefined) {\r\n merged.outputFormat = user.outputFormat;\r\n }\r\n\r\n return merged;\r\n }\r\n}\r\n","import { z } from 'zod';\r\n\r\nexport const PatternTypeSchema = z.enum(['literal', 'glob', 'regex']);\r\n\r\nexport const PatternSchema = z.object({\r\n type: PatternTypeSchema,\r\n value: z.string().min(1, \"Pattern value cannot be empty\"),\r\n reason: z.string().min(1, \"Reason is mandatory for each pattern\"),\r\n});\r\n\r\nexport const PolicySchema = z.object({\r\n category: z.string().min(1, \"Category name is mandatory\"),\r\n patterns: z.array(PatternSchema).min(1, \"At least one pattern is required per category\"),\r\n});\r\n\r\nexport const ConfigSchema = z.object({\r\n policies: z.array(PolicySchema).default([]),\r\n outDir: z.string().optional(),\r\n outputFormat: z.enum(['json', 'html', 'all']).default('all'),\r\n});\r\n\r\nexport type Config = z.infer<typeof ConfigSchema>;\r\nexport type Policy = z.infer<typeof PolicySchema>;\r\nexport type Pattern = z.infer<typeof 
PatternSchema>;\r\nexport type PatternType = z.infer<typeof PatternTypeSchema>;\r\n","import { type Config } from './schema';\r\n\r\nexport const DEFAULT_POLICIES: Config = {\r\n policies: [\r\n {\r\n category: \"Security & Admin\",\r\n patterns: [\r\n {\r\n type: \"glob\",\r\n value: \"**/admin/**\",\r\n reason: \"Administrative interfaces should not be publicly indexed.\"\r\n },\r\n {\r\n type: \"glob\",\r\n value: \"**/.env*\",\r\n reason: \"Environment files contain sensitive secrets.\"\r\n },\r\n {\r\n type: \"literal\",\r\n value: \"/wp-admin\",\r\n reason: \"WordPress admin paths are common attack vectors.\"\r\n }\r\n ]\r\n },\r\n {\r\n category: \"Environment Leakage\",\r\n patterns: [\r\n {\r\n type: \"glob\",\r\n value: \"**/staging.**\",\r\n reason: \"Staging environments should be restricted.\"\r\n },\r\n {\r\n type: \"glob\",\r\n value: \"**/dev.**\",\r\n reason: \"Development subdomains detected in production sitemap.\"\r\n }\r\n ]\r\n },\r\n {\r\n category: \"Sensitive Files\",\r\n patterns: [\r\n {\r\n type: \"glob\",\r\n value: \"**/*.{sql,bak,zip,tar.gz}\",\r\n reason: \"Archive or database backup files exposed.\"\r\n }\r\n ]\r\n }\r\n ]\r\n};\r\n","import { fetch } from 'undici';\r\nimport { XMLParser } from 'fast-xml-parser';\r\n\r\nexport class DiscoveryService {\r\n private readonly parser: XMLParser;\r\n private readonly visited = new Set<string>();\r\n private readonly STANDARD_PATHS = [\r\n '/sitemap.xml',\r\n '/sitemap_index.xml',\r\n '/sitemap-index.xml',\r\n '/sitemap.php',\r\n '/sitemap.xml.gz'\r\n ];\r\n\r\n constructor() {\r\n this.parser = new XMLParser({\r\n ignoreAttributes: false,\r\n attributeNamePrefix: \"@_\",\r\n });\r\n }\r\n\r\n /**\r\n * Attempts to find sitemaps for a given base website URL.\r\n */\r\n async findSitemaps(baseUrl: string): Promise<string[]> {\r\n const sitemaps = new Set<string>();\r\n const url = new URL(baseUrl);\r\n const origin = url.origin;\r\n\r\n // 1. 
Try robots.txt\r\n try {\r\n const robotsUrl = `${origin}/robots.txt`;\r\n const response = await fetch(robotsUrl);\r\n if (response.status === 200) {\r\n const text = await response.text();\r\n const matches = text.matchAll(/^Sitemap:\\s*(.+)$/gim);\r\n for (const match of matches) {\r\n if (match[1]) sitemaps.add(match[1].trim());\r\n }\r\n }\r\n } catch (e) {\r\n // Ignore robots.txt errors\r\n }\r\n\r\n // 2. Try standard paths if none found in robots.txt\r\n if (sitemaps.size === 0) {\r\n for (const path of this.STANDARD_PATHS) {\r\n try {\r\n const sitemapUrl = `${origin}${path}`;\r\n const response = await fetch(sitemapUrl, { method: 'HEAD' });\r\n if (response.status === 200) {\r\n sitemaps.add(sitemapUrl);\r\n }\r\n } catch (e) {\r\n // Ignore path errors\r\n }\r\n }\r\n }\r\n\r\n return Array.from(sitemaps);\r\n }\r\n\r\n /**\r\n * Recursively discovers all leaf sitemaps from a root URL.\r\n */\r\n async *discover(rootUrl: string): AsyncGenerator<string> {\r\n const queue: string[] = [rootUrl];\r\n\r\n while (queue.length > 0) {\r\n const currentUrl = queue.shift()!;\r\n if (this.visited.has(currentUrl)) continue;\r\n this.visited.add(currentUrl);\r\n\r\n try {\r\n const response = await fetch(currentUrl);\r\n if (response.status !== 200) continue;\r\n \r\n const xmlData = await response.text();\r\n const jsonObj = this.parser.parse(xmlData);\r\n\r\n if (jsonObj.sitemapindex) {\r\n const sitemaps = Array.isArray(jsonObj.sitemapindex.sitemap)\r\n ? 
jsonObj.sitemapindex.sitemap\r\n : [jsonObj.sitemapindex.sitemap];\r\n\r\n for (const sitemap of sitemaps) {\r\n if (sitemap?.loc) {\r\n queue.push(sitemap.loc);\r\n }\r\n }\r\n } else if (jsonObj.urlset) {\r\n // This is a leaf sitemap\r\n yield currentUrl;\r\n }\r\n } catch (error) {\r\n console.error(`Failed to fetch or parse sitemap at ${currentUrl}:`, error);\r\n }\r\n }\r\n }\r\n}\r\n","import { XMLParser } from 'fast-xml-parser';\r\nimport { fetch } from 'undici';\r\nimport { SitemapUrl } from '../types/sitemap';\r\n\r\nexport class SitemapParser {\r\n private readonly parser: XMLParser;\r\n\r\n constructor() {\r\n this.parser = new XMLParser({\r\n ignoreAttributes: false,\r\n attributeNamePrefix: \"@_\",\r\n });\r\n }\r\n\r\n /**\r\n * Parses a leaf sitemap and yields SitemapUrl objects.\r\n * Note: For true streaming of massive files, we'd use a SAX-like approach.\r\n * fast-xml-parser's parse() is fast but loads the whole string.\r\n * Given the 50k URL requirement, we'll use a more memory-efficient approach if needed,\r\n * but let's start with a clean AsyncGenerator interface.\r\n */\r\n async *parse(sitemapUrl: string): AsyncGenerator<SitemapUrl> {\r\n try {\r\n const response = await fetch(sitemapUrl);\r\n const xmlData = await response.text();\r\n const jsonObj = this.parser.parse(xmlData);\r\n\r\n if (jsonObj.urlset && jsonObj.urlset.url) {\r\n const urls = Array.isArray(jsonObj.urlset.url)\r\n ? 
jsonObj.urlset.url\r\n : [jsonObj.urlset.url];\r\n\r\n for (const url of urls) {\r\n if (url.loc) {\r\n yield {\r\n loc: url.loc,\r\n source: sitemapUrl,\r\n lastmod: url.lastmod,\r\n changefreq: url.changefreq,\r\n priority: url.priority,\r\n risks: [],\r\n };\r\n }\r\n }\r\n }\r\n } catch (error) {\r\n console.error(`Failed to parse sitemap at ${sitemapUrl}:`, error);\r\n }\r\n }\r\n}\r\n","import { DiscoveryService } from './discovery';\r\nimport { SitemapParser } from './parser';\r\nimport { SitemapUrl } from '../types/sitemap';\r\n\r\nexport class ExtractorService {\r\n private readonly discovery: DiscoveryService;\r\n private readonly parser: SitemapParser;\r\n private readonly seenUrls = new Set<string>();\r\n private readonly discoveredSitemaps = new Set<string>();\r\n\r\n constructor() {\r\n this.discovery = new DiscoveryService();\r\n this.parser = new SitemapParser();\r\n }\r\n\r\n /**\r\n * Returns the list of sitemaps discovered during the extraction process.\r\n */\r\n getDiscoveredSitemaps(): string[] {\r\n return Array.from(this.discoveredSitemaps);\r\n }\r\n\r\n /**\r\n * Normalizes a URL by removing trailing slashes and converting to lowercase.\r\n */\r\n private normalizeUrl(url: string): string {\r\n try {\r\n const parsed = new URL(url);\r\n let normalized = parsed.origin + parsed.pathname.replace(/\\/$/, '');\r\n if (parsed.search) normalized += parsed.search;\r\n return normalized.toLowerCase();\r\n } catch {\r\n return url.toLowerCase().replace(/\\/$/, '');\r\n }\r\n }\r\n\r\n /**\r\n * Extracts all unique URLs from a root sitemap URL or website base URL.\r\n */\r\n async *extract(inputUrl: string): AsyncGenerator<SitemapUrl> {\r\n let startUrls = [inputUrl];\r\n\r\n // If the URL doesn't end in .xml or .gz, it might be a website root\r\n if (!inputUrl.endsWith('.xml') && !inputUrl.endsWith('.gz')) {\r\n const discovered = await this.discovery.findSitemaps(inputUrl);\r\n if (discovered.length > 0) {\r\n console.log(`✅ Discovered 
${discovered.length} sitemap(s): ${discovered.join(', ')}`);\r\n startUrls = discovered;\r\n } else {\r\n console.log(`⚠️ No sitemaps discovered via robots.txt or standard paths. Proceeding with input URL.`);\r\n }\r\n }\r\n\r\n for (const startUrl of startUrls) {\r\n for await (const sitemapUrl of this.discovery.discover(startUrl)) {\r\n this.discoveredSitemaps.add(sitemapUrl);\r\n for await (const urlObj of this.parser.parse(sitemapUrl)) {\r\n const normalized = this.normalizeUrl(urlObj.loc);\r\n if (!this.seenUrls.has(normalized)) {\r\n this.seenUrls.add(normalized);\r\n yield urlObj;\r\n }\r\n }\r\n }\r\n }\r\n }\r\n}\r\n","import micromatch from 'micromatch';\r\nimport { type Config, type Pattern } from '../config/schema';\r\nimport { type SitemapUrl, type Risk } from '../types/sitemap';\r\n\r\nexport class MatcherService {\r\n private readonly config: Config;\r\n\r\n constructor(config: Config) {\r\n this.config = config;\r\n }\r\n\r\n /**\r\n * Matches a URL against all policies and returns detected risks.\r\n */\r\n match(urlObj: SitemapUrl): Risk[] {\r\n const risks: Risk[] = [];\r\n\r\n for (const policy of this.config.policies) {\r\n for (const pattern of policy.patterns) {\r\n if (this.isMatch(urlObj.loc, pattern)) {\r\n risks.push({\r\n category: policy.category,\r\n pattern: pattern.value,\r\n type: pattern.type,\r\n reason: pattern.reason,\r\n });\r\n }\r\n }\r\n }\r\n\r\n return risks;\r\n }\r\n\r\n private isMatch(url: string, pattern: Pattern): boolean {\r\n switch (pattern.type) {\r\n case 'literal':\r\n return url.includes(pattern.value);\r\n case 'glob':\r\n return micromatch.isMatch(url, pattern.value, { contains: true });\r\n case 'regex':\r\n try {\r\n const regex = new RegExp(pattern.value, 'i');\r\n return regex.test(url);\r\n } catch {\r\n return false;\r\n }\r\n default:\r\n return false;\r\n }\r\n }\r\n}\r\n","import chalk from 'chalk';\r\nimport { Reporter, ReportData } from './base';\r\n\r\nexport class ConsoleReporter implements 
Reporter {\r\n async generate(data: ReportData): Promise<void> {\r\n console.log('\\n' + chalk.bold.blue('=== sitemap-qa Analysis Summary ==='));\r\n console.log(`Total URLs Scanned: ${data.totalUrls}`);\r\n console.log(`Total Risks Found: ${data.totalRisks > 0 ? chalk.red(data.totalRisks) : chalk.green(0)}`);\r\n console.log(`URLs with Risks: ${data.urlsWithRisks.length}`);\r\n console.log(`Duration: ${((data.endTime.getTime() - data.startTime.getTime()) / 1000).toFixed(2)}s`);\r\n\r\n if (data.urlsWithRisks.length > 0) {\r\n console.log('\\n' + chalk.bold.yellow('Top Findings:'));\r\n data.urlsWithRisks.slice(0, 10).forEach((url) => {\r\n console.log(`\\n${chalk.cyan(url.loc)}`);\r\n url.risks.forEach((risk) => {\r\n console.log(` - [${chalk.red(risk.category)}] ${risk.reason} (${chalk.gray(risk.pattern)})`);\r\n });\r\n });\r\n\r\n if (data.urlsWithRisks.length > 10) {\r\n console.log(`\\n... and ${data.urlsWithRisks.length - 10} more. See JSON/HTML report for full details.`);\r\n }\r\n }\r\n\r\n console.log('\\n' + chalk.bold.blue('==================================='));\r\n }\r\n}\r\n","import fs from 'node:fs/promises';\r\nimport { Reporter, ReportData } from './base';\r\n\r\nexport class JsonReporter implements Reporter {\r\n private readonly outputPath: string;\r\n\r\n constructor(outputPath: string = 'sitemap-qa-report.json') {\r\n this.outputPath = outputPath;\r\n }\r\n\r\n async generate(data: ReportData): Promise<void> {\r\n const report = {\r\n metadata: {\r\n generatedAt: new Date().toISOString(),\r\n durationMs: data.endTime.getTime() - data.startTime.getTime(),\r\n },\r\n summary: {\r\n totalUrls: data.totalUrls,\r\n totalRisks: data.totalRisks,\r\n urlsWithRisksCount: data.urlsWithRisks.length,\r\n },\r\n findings: data.urlsWithRisks,\r\n };\r\n\r\n await fs.writeFile(this.outputPath, JSON.stringify(report, null, 2), 'utf8');\r\n console.log(`JSON report generated at ${this.outputPath}`);\r\n }\r\n}\r\n","import fs from 
'node:fs/promises';\r\nimport { Reporter, ReportData } from './base';\r\n\r\nexport class HtmlReporter implements Reporter {\r\n private readonly outputPath: string;\r\n\r\n constructor(outputPath: string = 'sitemap-qa-report.html') {\r\n this.outputPath = outputPath;\r\n }\r\n\r\n async generate(data: ReportData): Promise<void> {\r\n const categories = this.groupRisks(data);\r\n const html = this.generateHtml(data, categories);\r\n\r\n await fs.writeFile(this.outputPath, html, 'utf8');\r\n console.log(`HTML report generated at ${this.outputPath}`);\r\n }\r\n\r\n private groupRisks(data: ReportData) {\r\n const categories: Record<string, Record<string, { reason: string, urls: string[] }>> = {};\r\n\r\n for (const urlObj of data.urlsWithRisks) {\r\n for (const risk of urlObj.risks) {\r\n if (!categories[risk.category]) {\r\n categories[risk.category] = {};\r\n }\r\n if (!categories[risk.category][risk.pattern]) {\r\n categories[risk.category][risk.pattern] = {\r\n reason: risk.reason,\r\n urls: []\r\n };\r\n }\r\n categories[risk.category][risk.pattern].urls.push(urlObj.loc);\r\n }\r\n }\r\n\r\n return categories;\r\n }\r\n\r\n private generateHtml(data: ReportData, categories: any): string {\r\n const duration = ((data.endTime.getTime() - data.startTime.getTime()) / 1000).toFixed(1);\r\n const timestamp = data.endTime.toLocaleString();\r\n\r\n return `\r\n<!DOCTYPE html>\r\n<html lang=\"en\">\r\n<head>\r\n <meta charset=\"UTF-8\">\r\n <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\r\n <title>Sitemap Analysis - ${data.rootUrl}</title>\r\n <style>\r\n :root {\r\n --bg-dark: #0f172a;\r\n --bg-light: #f8fafc;\r\n --text-main: #1e293b;\r\n --text-muted: #64748b;\r\n --primary: #3b82f6;\r\n --danger: #ef4444;\r\n --warning: #f59e0b;\r\n --border: #e2e8f0;\r\n }\r\n body { \r\n font-family: -apple-system, BlinkMacSystemFont, \"Segoe UI\", Roboto, Helvetica, Arial, sans-serif;\r\n line-height: 1.5;\r\n color: var(--text-main);\r\n 
background-color: #fff;\r\n margin: 0;\r\n padding: 0;\r\n }\r\n header {\r\n background-color: var(--bg-dark);\r\n color: white;\r\n padding: 40px 20px;\r\n text-align: left;\r\n }\r\n .container {\r\n max-width: 1200px;\r\n margin: 0 auto;\r\n padding: 0 20px;\r\n }\r\n header h1 { margin: 0; font-size: 24px; }\r\n header .meta { margin-top: 10px; color: #94a3b8; font-size: 14px; }\r\n \r\n .summary-grid {\r\n display: grid;\r\n grid-template-columns: repeat(4, 1fr);\r\n border-bottom: 1px solid var(--border);\r\n margin-bottom: 40px;\r\n }\r\n .summary-card {\r\n padding: 30px 20px;\r\n text-align: center;\r\n border-right: 1px solid var(--border);\r\n }\r\n .summary-card:last-child { border-right: none; }\r\n .summary-card h3 { \r\n margin: 0; \r\n font-size: 12px; \r\n text-transform: uppercase; \r\n color: var(--text-muted);\r\n letter-spacing: 0.05em;\r\n }\r\n .summary-card p { \r\n margin: 10px 0 0; \r\n font-size: 32px; \r\n font-weight: 700; \r\n color: var(--text-main);\r\n }\r\n .summary-card.highlight p { color: var(--danger); }\r\n\r\n details {\r\n margin-bottom: 20px;\r\n border: 1px solid var(--border);\r\n border-radius: 8px;\r\n overflow: hidden;\r\n }\r\n summary {\r\n padding: 15px 20px;\r\n background-color: #fff;\r\n cursor: pointer;\r\n font-weight: 600;\r\n display: flex;\r\n justify-content: space-between;\r\n align-items: center;\r\n list-style: none;\r\n }\r\n summary::-webkit-details-marker { display: none; }\r\n summary::after {\r\n content: '▶';\r\n font-size: 12px;\r\n color: var(--text-muted);\r\n transition: transform 0.2s;\r\n }\r\n details[open] summary::after { transform: rotate(90deg); }\r\n \r\n .category-section {\r\n border: 1px solid var(--warning);\r\n border-radius: 8px;\r\n margin-bottom: 20px;\r\n }\r\n .category-header {\r\n padding: 15px 20px;\r\n background-color: #fffbeb;\r\n color: var(--warning);\r\n font-weight: 600;\r\n cursor: pointer;\r\n display: flex;\r\n justify-content: space-between;\r\n align-items: 
center;\r\n }\r\n .category-content {\r\n padding: 20px;\r\n background-color: #fff;\r\n }\r\n\r\n .finding-group {\r\n border: 1px solid var(--border);\r\n border-radius: 8px;\r\n padding: 20px;\r\n margin-bottom: 20px;\r\n }\r\n .finding-header {\r\n display: flex;\r\n align-items: center;\r\n gap: 10px;\r\n margin-bottom: 10px;\r\n }\r\n .finding-header h4 { margin: 0; font-size: 16px; }\r\n .badge {\r\n background-color: var(--primary);\r\n color: white;\r\n padding: 2px 8px;\r\n border-radius: 12px;\r\n font-size: 12px;\r\n }\r\n .finding-description {\r\n color: var(--text-muted);\r\n font-size: 14px;\r\n margin-bottom: 20px;\r\n }\r\n \r\n .url-list {\r\n background-color: var(--bg-light);\r\n border-radius: 4px;\r\n padding: 15px;\r\n margin-bottom: 15px;\r\n }\r\n .url-item {\r\n font-family: monospace;\r\n font-size: 13px;\r\n padding: 8px 12px;\r\n background: white;\r\n border: 1px solid var(--border);\r\n border-radius: 4px;\r\n margin-bottom: 8px;\r\n white-space: nowrap;\r\n overflow: hidden;\r\n text-overflow: ellipsis;\r\n }\r\n .url-item:last-child { margin-bottom: 0; }\r\n \r\n .more-count {\r\n font-size: 12px;\r\n color: var(--text-muted);\r\n font-style: italic;\r\n margin-bottom: 15px;\r\n }\r\n\r\n .btn {\r\n display: inline-flex;\r\n align-items: center;\r\n gap: 8px;\r\n background-color: var(--primary);\r\n color: white;\r\n padding: 8px 16px;\r\n border-radius: 6px;\r\n text-decoration: none;\r\n font-size: 13px;\r\n font-weight: 500;\r\n }\r\n .btn:hover { opacity: 0.9; }\r\n\r\n footer {\r\n text-align: center;\r\n padding: 40px;\r\n color: var(--text-muted);\r\n font-size: 12px;\r\n border-top: 1px solid var(--border);\r\n margin-top: 40px;\r\n }\r\n </style>\r\n</head>\r\n<body>\r\n <header>\r\n <div class=\"container\">\r\n <h1>Sitemap Analysis</h1>\r\n <div class=\"meta\">\r\n <div>${data.rootUrl}</div>\r\n <div>${timestamp}</div>\r\n </div>\r\n </div>\r\n </header>\r\n\r\n <div class=\"summary-grid\">\r\n <div 
class=\"summary-card\">\r\n <h3>Sitemaps</h3>\r\n <p>${data.discoveredSitemaps.length}</p>\r\n </div>\r\n <div class=\"summary-card\">\r\n <h3>URLs Analyzed</h3>\r\n <p>${data.totalUrls.toLocaleString()}</p>\r\n </div>\r\n <div class=\"summary-card highlight\">\r\n <h3>Issues Found</h3>\r\n <p>${data.totalRisks}</p>\r\n </div>\r\n <div class=\"summary-card\">\r\n <h3>Scan Time</h3>\r\n <p>${duration}s</p>\r\n </div>\r\n </div>\r\n\r\n <div class=\"container\">\r\n <details>\r\n <summary>Sitemaps Discovered (${data.discoveredSitemaps.length})</summary>\r\n <div style=\"padding: 20px; background: var(--bg-light);\">\r\n ${data.discoveredSitemaps.map(s => `<div class=\"url-item\">${s}</div>`).join('')}\r\n </div>\r\n </details>\r\n\r\n ${Object.entries(categories).map(([category, findings]: [string, any]) => {\r\n const totalCategoryUrls = Object.values(findings).reduce((acc: number, f: any) => acc + f.urls.length, 0);\r\n return `\r\n <div class=\"category-section\">\r\n <div class=\"category-header\">\r\n <span>${category} (${totalCategoryUrls} URLs)</span>\r\n <span>▼</span>\r\n </div>\r\n <div class=\"category-content\">\r\n ${Object.entries(findings).map(([pattern, finding]: [string, any]) => `\r\n <div class=\"finding-group\">\r\n <div class=\"finding-header\">\r\n <h4>${pattern}</h4>\r\n <span class=\"badge\">${finding.urls.length} URLs</span>\r\n </div>\r\n <div class=\"finding-description\">\r\n ${finding.reason}\r\n </div>\r\n <div class=\"url-list\">\r\n ${finding.urls.slice(0, 3).map((url: string) => `\r\n <div class=\"url-item\">${url}</div>\r\n `).join('')}\r\n </div>\r\n ${finding.urls.length > 3 ? `\r\n <div class=\"more-count\">... 
and ${finding.urls.length - 3} more</div>\r\n ` : ''}\r\n <a href=\"#\" class=\"btn\" onclick=\"downloadUrls('${pattern}', ${JSON.stringify(finding.urls).replace(/\"/g, '&quot;')})\">\r\n 📥 Download All ${finding.urls.length} URLs\r\n </a>\r\n </div>\r\n `).join('')}\r\n </div>\r\n </div>\r\n `;\r\n }).join('')}\r\n </div>\r\n\r\n <footer>\r\n Generated by sitemap-qa v1.0.0\r\n </footer>\r\n\r\n <script>\r\n function downloadUrls(name, urls) {\r\n const blob = new Blob([urls.join('\\\\n')], { type: 'text/plain' });\r\n const url = window.URL.createObjectURL(blob);\r\n const a = document.createElement('a');\r\n a.href = url;\r\n a.download = \\`\\${name.replace(/[^a-z0-9]/gi, '_').toLowerCase()}_urls.txt\\`;\r\n document.body.appendChild(a);\r\n a.click();\r\n window.URL.revokeObjectURL(url);\r\n document.body.removeChild(a);\r\n }\r\n </script>\r\n</body>\r\n</html>\r\n`;\r\n }\r\n}\r\n","import { Command } from 'commander';\r\nimport fs from 'node:fs';\r\nimport path from 'node:path';\r\nimport chalk from 'chalk';\r\n\r\nconst DEFAULT_CONFIG = `# sitemap-qa configuration\r\n# This file defines the risk categories and patterns to monitor.\r\n\r\n# Risk Categories\r\n# Each category contains a list of patterns to match against URLs found in sitemaps.\r\n# Patterns can be:\r\n# - literal: Exact string match\r\n# - glob: Glob pattern (e.g., **/admin/**)\r\n# - regex: Regular expression (e.g., /\\\\/v[0-9]+\\\\//)\r\n\r\npolicies:\r\n - category: \"Security & Admin\"\r\n patterns:\r\n - type: \"glob\"\r\n value: \"**/admin/**\"\r\n reason: \"Administrative interfaces should not be publicly indexed.\"\r\n - type: \"glob\"\r\n value: \"**/.env*\"\r\n reason: \"Environment files contain sensitive secrets.\"\r\n - type: \"literal\"\r\n value: \"/wp-admin\"\r\n reason: \"WordPress admin paths are common attack vectors.\"\r\n\r\n - category: \"Environment Leakage\"\r\n patterns:\r\n - type: \"glob\"\r\n value: \"**/staging.**\"\r\n reason: \"Staging environments should be 
restricted.\"\r\n - type: \"glob\"\r\n value: \"**/dev.**\"\r\n reason: \"Development subdomains detected in production sitemap.\"\r\n\r\n - category: \"Sensitive Files\"\r\n patterns:\r\n - type: \"glob\"\r\n value: \"**/*.{sql,bak,zip,tar.gz}\"\r\n reason: \"Archive or database backup files exposed.\"\r\n`;\r\n\r\nexport const initCommand = new Command('init')\r\n .description('Initialize a default sitemap-qa.yaml configuration file')\r\n .action(() => {\r\n const configPath = path.join(process.cwd(), 'sitemap-qa.yaml');\r\n\r\n if (fs.existsSync(configPath)) {\r\n console.error(chalk.red(`Error: ${configPath} already exists.`));\r\n process.exit(1);\r\n }\r\n\r\n try {\r\n fs.writeFileSync(configPath, DEFAULT_CONFIG, 'utf8');\r\n console.log(chalk.green(`Successfully created ${configPath}`));\r\n } catch (error) {\r\n console.error(chalk.red('Failed to create configuration file:'), error);\r\n process.exit(1);\r\n }\r\n });\r\n"],"mappings":";;;AACA,SAAS,WAAAA,gBAAe;;;ACDxB,SAAS,eAAe;AACxB,OAAOC,YAAW;AAClB,OAAOC,WAAU;AACjB,OAAOC,SAAQ;;;ACHf,OAAO,QAAQ;AACf,OAAO,UAAU;AACjB,OAAO,UAAU;;;ACFjB,SAAS,SAAS;AAEX,IAAM,oBAAoB,EAAE,KAAK,CAAC,WAAW,QAAQ,OAAO,CAAC;AAE7D,IAAM,gBAAgB,EAAE,OAAO;AAAA,EACpC,MAAM;AAAA,EACN,OAAO,EAAE,OAAO,EAAE,IAAI,GAAG,+BAA+B;AAAA,EACxD,QAAQ,EAAE,OAAO,EAAE,IAAI,GAAG,sCAAsC;AAClE,CAAC;AAEM,IAAM,eAAe,EAAE,OAAO;AAAA,EACnC,UAAU,EAAE,OAAO,EAAE,IAAI,GAAG,4BAA4B;AAAA,EACxD,UAAU,EAAE,MAAM,aAAa,EAAE,IAAI,GAAG,+CAA+C;AACzF,CAAC;AAEM,IAAM,eAAe,EAAE,OAAO;AAAA,EACnC,UAAU,EAAE,MAAM,YAAY,EAAE,QAAQ,CAAC,CAAC;AAAA,EAC1C,QAAQ,EAAE,OAAO,EAAE,SAAS;AAAA,EAC5B,cAAc,EAAE,KAAK,CAAC,QAAQ,QAAQ,KAAK,CAAC,EAAE,QAAQ,KAAK;AAC7D,CAAC;;;ACjBM,IAAM,mBAA2B;AAAA,EACtC,UAAU;AAAA,IACR;AAAA,MACE,UAAU;AAAA,MACV,UAAU;AAAA,QACR;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,QACA;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,QACA;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,MACF;AAAA,IACF;AAAA,IACA;AAAA,MACE,UAAU;AAAA,MACV,UAAU;AAAA,QACR;AAAA,UACE,MA
AM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,QACA;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,MACF;AAAA,IACF;AAAA,IACA;AAAA,MACE,UAAU;AAAA,MACV,UAAU;AAAA,QACR;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,MACF;AAAA,IACF;AAAA,EACF;AACF;;;AF7CA,OAAO,WAAW;AAEX,IAAM,eAAN,MAAmB;AAAA,EACxB,OAAwB,sBAAsB;AAAA,EAE9C,OAAO,KAAK,YAA6B;AACvC,UAAM,aAAa,cAAc,KAAK,KAAK,QAAQ,IAAI,GAAG,KAAK,mBAAmB;AAClF,QAAI,aAAqB,EAAE,UAAU,CAAC,EAAE;AAGxC,QAAI,GAAG,WAAW,UAAU,GAAG;AAC7B,UAAI;AACF,cAAM,cAAc,GAAG,aAAa,YAAY,MAAM;AACtD,cAAM,aAAa,KAAK,KAAK,WAAW;AAExC,cAAM,SAAS,aAAa,UAAU,UAAU;AAEhD,YAAI,CAAC,OAAO,SAAS;AACnB,kBAAQ,MAAM,MAAM,IAAI,iCAAiC,CAAC;AAC1D,iBAAO,MAAM,OAAO,QAAQ,CAAC,UAAU;AACrC,oBAAQ,MAAM,MAAM,OAAO,OAAO,MAAM,KAAK,KAAK,GAAG,CAAC,KAAK,MAAM,OAAO,EAAE,CAAC;AAAA,UAC7E,CAAC;AACD,kBAAQ,KAAK,CAAC;AAAA,QAChB;AAEA,qBAAa,OAAO;AAAA,MACtB,SAAS,OAAO;AACd,gBAAQ,MAAM,MAAM,IAAI,+BAA+B,GAAG,KAAK;AAC/D,gBAAQ,KAAK,CAAC;AAAA,MAChB;AAAA,IACF,WAAW,YAAY;AACrB,cAAQ,MAAM,MAAM,IAAI,0CAA0C,UAAU,EAAE,CAAC;AAC/E,cAAQ,KAAK,CAAC;AAAA,IAChB;AAEA,WAAO,KAAK,aAAa,kBAAkB,UAAU;AAAA,EACvD;AAAA,EAEA,OAAe,aAAa,UAAkB,MAAsB;AAClE,UAAM,iBAAiB,CAAC,GAAG,SAAS,QAAQ;AAE5C,SAAK,SAAS,QAAQ,CAAC,eAAe;AACpC,YAAM,QAAQ,eAAe,UAAU,OAAK,EAAE,aAAa,WAAW,QAAQ;AAC9E,UAAI,UAAU,IAAI;AAEhB,uBAAe,KAAK,IAAI;AAAA,MAC1B,OAAO;AAEL,uBAAe,KAAK,UAAU;AAAA,MAChC;AAAA,IACF,CAAC;AAGD,UAAM,SAAiB;AAAA,MACrB,GAAG;AAAA,MACH,UAAU;AAAA,IACZ;AAEA,QAAI,KAAK,WAAW,QAAW;AAC7B,aAAO,SAAS,KAAK;AAAA,IACvB;AAEA,QAAI,KAAK,iBAAiB,QAAW;AACnC,aAAO,eAAe,KAAK;AAAA,IAC7B;AAEA,WAAO;AAAA,EACT;AACF;;;AGzEA,SAAS,aAAa;AACtB,SAAS,iBAAiB;AAEnB,IAAM,mBAAN,MAAuB;AAAA,EACX;AAAA,EACA,UAAU,oBAAI,IAAY;AAAA,EAC1B,iBAAiB;AAAA,IAChC;AAAA,IACA;AAAA,IACA;AAAA,IACA;AAAA,IACA;AAAA,EACF;AAAA,EAEA,cAAc;AACZ,SAAK,SAAS,IAAI,UAAU;AAAA,MAC1B,kBAAkB;AAAA,MAClB,qBAAqB;AAAA,IACvB,CAAC;AAAA,EACH;AAAA;AAAA;AAAA;AAAA,EAKA,MAAM,aAAa,SAAoC;AACrD,UAAM,WAAW,oBAAI,IAAY;AACjC,UAAM,MAAM,IAAI,IAAI,OAAO;AAC3B,UAAM,SAAS,IAAI;AAGnB,QAAI;AACF,YAAM,YAAY,GAAG,MAAM;AAC3B,YAAM,WAAW,MAAM,MAAM,SAAS;A
ACtC,UAAI,SAAS,WAAW,KAAK;AAC3B,cAAM,OAAO,MAAM,SAAS,KAAK;AACjC,cAAM,UAAU,KAAK,SAAS,sBAAsB;AACpD,mBAAW,SAAS,SAAS;AAC3B,cAAI,MAAM,CAAC,EAAG,UAAS,IAAI,MAAM,CAAC,EAAE,KAAK,CAAC;AAAA,QAC5C;AAAA,MACF;AAAA,IACF,SAAS,GAAG;AAAA,IAEZ;AAGA,QAAI,SAAS,SAAS,GAAG;AACvB,iBAAWC,SAAQ,KAAK,gBAAgB;AACtC,YAAI;AACF,gBAAM,aAAa,GAAG,MAAM,GAAGA,KAAI;AACnC,gBAAM,WAAW,MAAM,MAAM,YAAY,EAAE,QAAQ,OAAO,CAAC;AAC3D,cAAI,SAAS,WAAW,KAAK;AAC3B,qBAAS,IAAI,UAAU;AAAA,UACzB;AAAA,QACF,SAAS,GAAG;AAAA,QAEZ;AAAA,MACF;AAAA,IACF;AAEA,WAAO,MAAM,KAAK,QAAQ;AAAA,EAC5B;AAAA;AAAA;AAAA;AAAA,EAKA,OAAO,SAAS,SAAyC;AACvD,UAAM,QAAkB,CAAC,OAAO;AAEhC,WAAO,MAAM,SAAS,GAAG;AACvB,YAAM,aAAa,MAAM,MAAM;AAC/B,UAAI,KAAK,QAAQ,IAAI,UAAU,EAAG;AAClC,WAAK,QAAQ,IAAI,UAAU;AAE3B,UAAI;AACF,cAAM,WAAW,MAAM,MAAM,UAAU;AACvC,YAAI,SAAS,WAAW,IAAK;AAE7B,cAAM,UAAU,MAAM,SAAS,KAAK;AACpC,cAAM,UAAU,KAAK,OAAO,MAAM,OAAO;AAEzC,YAAI,QAAQ,cAAc;AACxB,gBAAM,WAAW,MAAM,QAAQ,QAAQ,aAAa,OAAO,IACvD,QAAQ,aAAa,UACrB,CAAC,QAAQ,aAAa,OAAO;AAEjC,qBAAW,WAAW,UAAU;AAC9B,gBAAI,SAAS,KAAK;AAChB,oBAAM,KAAK,QAAQ,GAAG;AAAA,YACxB;AAAA,UACF;AAAA,QACF,WAAW,QAAQ,QAAQ;AAEzB,gBAAM;AAAA,QACR;AAAA,MACF,SAAS,OAAO;AACd,gBAAQ,MAAM,uCAAuC,UAAU,KAAK,KAAK;AAAA,MAC3E;AAAA,IACF;AAAA,EACF;AACF;;;ACnGA,SAAS,aAAAC,kBAAiB;AAC1B,SAAS,SAAAC,cAAa;AAGf,IAAM,gBAAN,MAAoB;AAAA,EACR;AAAA,EAEjB,cAAc;AACZ,SAAK,SAAS,IAAID,WAAU;AAAA,MAC1B,kBAAkB;AAAA,MAClB,qBAAqB;AAAA,IACvB,CAAC;AAAA,EACH;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EASA,OAAO,MAAM,YAAgD;AAC3D,QAAI;AACF,YAAM,WAAW,MAAMC,OAAM,UAAU;AACvC,YAAM,UAAU,MAAM,SAAS,KAAK;AACpC,YAAM,UAAU,KAAK,OAAO,MAAM,OAAO;AAEzC,UAAI,QAAQ,UAAU,QAAQ,OAAO,KAAK;AACxC,cAAM,OAAO,MAAM,QAAQ,QAAQ,OAAO,GAAG,IACzC,QAAQ,OAAO,MACf,CAAC,QAAQ,OAAO,GAAG;AAEvB,mBAAW,OAAO,MAAM;AACtB,cAAI,IAAI,KAAK;AACX,kBAAM;AAAA,cACJ,KAAK,IAAI;AAAA,cACT,QAAQ;AAAA,cACR,SAAS,IAAI;AAAA,cACb,YAAY,IAAI;AAAA,cAChB,UAAU,IAAI;AAAA,cACd,OAAO,CAAC;AAAA,YACV;AAAA,UACF;AAAA,QACF;AAAA,MACF;AAAA,IACF,SAAS,OAAO;AACd,cAAQ,MAAM,8BAA8B,UAAU,KAAK,KAAK;AAAA,IAClE;AAAA,EACF;AACF;;;AC7CO,IAAM,mBAAN,MAAuB;AAAA,EACX;AAAA,EACA;AAAA,EACA,WAA
W,oBAAI,IAAY;AAAA,EAC3B,qBAAqB,oBAAI,IAAY;AAAA,EAEtD,cAAc;AACZ,SAAK,YAAY,IAAI,iBAAiB;AACtC,SAAK,SAAS,IAAI,cAAc;AAAA,EAClC;AAAA;AAAA;AAAA;AAAA,EAKA,wBAAkC;AAChC,WAAO,MAAM,KAAK,KAAK,kBAAkB;AAAA,EAC3C;AAAA;AAAA;AAAA;AAAA,EAKQ,aAAa,KAAqB;AACxC,QAAI;AACF,YAAM,SAAS,IAAI,IAAI,GAAG;AAC1B,UAAI,aAAa,OAAO,SAAS,OAAO,SAAS,QAAQ,OAAO,EAAE;AAClE,UAAI,OAAO,OAAQ,eAAc,OAAO;AACxC,aAAO,WAAW,YAAY;AAAA,IAChC,QAAQ;AACN,aAAO,IAAI,YAAY,EAAE,QAAQ,OAAO,EAAE;AAAA,IAC5C;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,OAAO,QAAQ,UAA8C;AAC3D,QAAI,YAAY,CAAC,QAAQ;AAGzB,QAAI,CAAC,SAAS,SAAS,MAAM,KAAK,CAAC,SAAS,SAAS,KAAK,GAAG;AAC3D,YAAM,aAAa,MAAM,KAAK,UAAU,aAAa,QAAQ;AAC7D,UAAI,WAAW,SAAS,GAAG;AACzB,gBAAQ,IAAI,qBAAgB,WAAW,MAAM,gBAAgB,WAAW,KAAK,IAAI,CAAC,EAAE;AACpF,oBAAY;AAAA,MACd,OAAO;AACL,gBAAQ,IAAI,kGAAwF;AAAA,MACtG;AAAA,IACF;AAEA,eAAW,YAAY,WAAW;AAChC,uBAAiB,cAAc,KAAK,UAAU,SAAS,QAAQ,GAAG;AAChE,aAAK,mBAAmB,IAAI,UAAU;AACtC,yBAAiB,UAAU,KAAK,OAAO,MAAM,UAAU,GAAG;AACxD,gBAAM,aAAa,KAAK,aAAa,OAAO,GAAG;AAC/C,cAAI,CAAC,KAAK,SAAS,IAAI,UAAU,GAAG;AAClC,iBAAK,SAAS,IAAI,UAAU;AAC5B,kBAAM;AAAA,UACR;AAAA,QACF;AAAA,MACF;AAAA,IACF;AAAA,EACF;AACF;;;AClEA,OAAO,gBAAgB;AAIhB,IAAM,iBAAN,MAAqB;AAAA,EACT;AAAA,EAEjB,YAAY,QAAgB;AAC1B,SAAK,SAAS;AAAA,EAChB;AAAA;AAAA;AAAA;AAAA,EAKA,MAAM,QAA4B;AAChC,UAAM,QAAgB,CAAC;AAEvB,eAAW,UAAU,KAAK,OAAO,UAAU;AACzC,iBAAW,WAAW,OAAO,UAAU;AACrC,YAAI,KAAK,QAAQ,OAAO,KAAK,OAAO,GAAG;AACrC,gBAAM,KAAK;AAAA,YACT,UAAU,OAAO;AAAA,YACjB,SAAS,QAAQ;AAAA,YACjB,MAAM,QAAQ;AAAA,YACd,QAAQ,QAAQ;AAAA,UAClB,CAAC;AAAA,QACH;AAAA,MACF;AAAA,IACF;AAEA,WAAO;AAAA,EACT;AAAA,EAEQ,QAAQ,KAAa,SAA2B;AACtD,YAAQ,QAAQ,MAAM;AAAA,MACpB,KAAK;AACH,eAAO,IAAI,SAAS,QAAQ,KAAK;AAAA,MACnC,KAAK;AACH,eAAO,WAAW,QAAQ,KAAK,QAAQ,OAAO,EAAE,UAAU,KAAK,CAAC;AAAA,MAClE,KAAK;AACH,YAAI;AACF,gBAAM,QAAQ,IAAI,OAAO,QAAQ,OAAO,GAAG;AAC3C,iBAAO,MAAM,KAAK,GAAG;AAAA,QACvB,QAAQ;AACN,iBAAO;AAAA,QACT;AAAA,MACF;AACE,eAAO;AAAA,IACX;AAAA,EACF;AACF;;;AClDA,OAAOC,YAAW;AAGX,IAAM,kBAAN,MAA0C;AAAA,EAC/C,MAAM,SAAS,MAAiC;AAC9C,YAAQ,IAAI,OAAOA,OAAM,KAAK,KAAK,qCAAqC,CAAC;AACzE,YAAQ,IAAI,uBAAuB,KAA
K,SAAS,EAAE;AACnD,YAAQ,IAAI,uBAAuB,KAAK,aAAa,IAAIA,OAAM,IAAI,KAAK,UAAU,IAAIA,OAAM,MAAM,CAAC,CAAC,EAAE;AACtG,YAAQ,IAAI,uBAAuB,KAAK,cAAc,MAAM,EAAE;AAC9D,YAAQ,IAAI,yBAAyB,KAAK,QAAQ,QAAQ,IAAI,KAAK,UAAU,QAAQ,KAAK,KAAM,QAAQ,CAAC,CAAC,GAAG;AAE7G,QAAI,KAAK,cAAc,SAAS,GAAG;AACjC,cAAQ,IAAI,OAAOA,OAAM,KAAK,OAAO,eAAe,CAAC;AACrD,WAAK,cAAc,MAAM,GAAG,EAAE,EAAE,QAAQ,CAAC,QAAQ;AAC/C,gBAAQ,IAAI;AAAA,EAAKA,OAAM,KAAK,IAAI,GAAG,CAAC,EAAE;AACtC,YAAI,MAAM,QAAQ,CAAC,SAAS;AAC1B,kBAAQ,IAAI,QAAQA,OAAM,IAAI,KAAK,QAAQ,CAAC,KAAK,KAAK,MAAM,KAAKA,OAAM,KAAK,KAAK,OAAO,CAAC,GAAG;AAAA,QAC9F,CAAC;AAAA,MACH,CAAC;AAED,UAAI,KAAK,cAAc,SAAS,IAAI;AAClC,gBAAQ,IAAI;AAAA,UAAa,KAAK,cAAc,SAAS,EAAE,+CAA+C;AAAA,MACxG;AAAA,IACF;AAEA,YAAQ,IAAI,OAAOA,OAAM,KAAK,KAAK,qCAAqC,CAAC;AAAA,EAC3E;AACF;;;AC3BA,OAAOC,SAAQ;AAGR,IAAM,eAAN,MAAuC;AAAA,EAC3B;AAAA,EAEjB,YAAY,aAAqB,0BAA0B;AACzD,SAAK,aAAa;AAAA,EACpB;AAAA,EAEA,MAAM,SAAS,MAAiC;AAC9C,UAAM,SAAS;AAAA,MACb,UAAU;AAAA,QACR,cAAa,oBAAI,KAAK,GAAE,YAAY;AAAA,QACpC,YAAY,KAAK,QAAQ,QAAQ,IAAI,KAAK,UAAU,QAAQ;AAAA,MAC9D;AAAA,MACA,SAAS;AAAA,QACP,WAAW,KAAK;AAAA,QAChB,YAAY,KAAK;AAAA,QACjB,oBAAoB,KAAK,cAAc;AAAA,MACzC;AAAA,MACA,UAAU,KAAK;AAAA,IACjB;AAEA,UAAMA,IAAG,UAAU,KAAK,YAAY,KAAK,UAAU,QAAQ,MAAM,CAAC,GAAG,MAAM;AAC3E,YAAQ,IAAI,4BAA4B,KAAK,UAAU,EAAE;AAAA,EAC3D;AACF;;;AC3BA,OAAOC,SAAQ;AAGR,IAAM,eAAN,MAAuC;AAAA,EAC3B;AAAA,EAEjB,YAAY,aAAqB,0BAA0B;AACzD,SAAK,aAAa;AAAA,EACpB;AAAA,EAEA,MAAM,SAAS,MAAiC;AAC9C,UAAM,aAAa,KAAK,WAAW,IAAI;AACvC,UAAM,OAAO,KAAK,aAAa,MAAM,UAAU;AAE/C,UAAMA,IAAG,UAAU,KAAK,YAAY,MAAM,MAAM;AAChD,YAAQ,IAAI,4BAA4B,KAAK,UAAU,EAAE;AAAA,EAC3D;AAAA,EAEQ,WAAW,MAAkB;AACnC,UAAM,aAAiF,CAAC;AAExF,eAAW,UAAU,KAAK,eAAe;AACvC,iBAAW,QAAQ,OAAO,OAAO;AAC/B,YAAI,CAAC,WAAW,KAAK,QAAQ,GAAG;AAC9B,qBAAW,KAAK,QAAQ,IAAI,CAAC;AAAA,QAC/B;AACA,YAAI,CAAC,WAAW,KAAK,QAAQ,EAAE,KAAK,OAAO,GAAG;AAC5C,qBAAW,KAAK,QAAQ,EAAE,KAAK,OAAO,IAAI;AAAA,YACxC,QAAQ,KAAK;AAAA,YACb,MAAM,CAAC;AAAA,UACT;AAAA,QACF;AACA,mBAAW,KAAK,QAAQ,EAAE,KAAK,OAAO,EAAE,KAAK,KAAK,OAAO,GAAG;AAAA,MAC9D;AAAA,IACF;AAEA,WAAO;AAAA,EACT;AAAA,EAEQ,aAAa,MAAkB,
YAAyB;AAC9D,UAAM,aAAa,KAAK,QAAQ,QAAQ,IAAI,KAAK,UAAU,QAAQ,KAAK,KAAM,QAAQ,CAAC;AACvF,UAAM,YAAY,KAAK,QAAQ,eAAe;AAE9C,WAAO;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,gCAMquBA4LrB,KAAK,OAAO;AAAA,uBACZ,SAAS;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,iBAQf,KAAK,mBAAmB,MAAM;AAAA;AAAA;AAAA;AAAA,iBAI9B,KAAK,UAAU,eAAe,CAAC;AAAA;AAAA;AAAA;AAAA,iBAI/B,KAAK,UAAU;AAAA;AAAA;AAAA;AAAA,iBAIf,QAAQ;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,4CAMmB,KAAK,mBAAmB,MAAM;AAAA;AAAA,kBAExD,KAAK,mBAAmB,IAAI,OAAK,yBAAyB,CAAC,QAAQ,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA;AAAA;AAAA,UAIrF,OAAO,QAAQ,UAAU,EAAE,IAAI,CAAC,CAAC,UAAU,QAAQ,MAAqB;AACtE,YAAM,oBAAoB,OAAO,OAAO,QAAQ,EAAE,OAAO,CAAC,KAAa,MAAW,MAAM,EAAE,KAAK,QAAQ,CAAC;AACxG,aAAO;AAAA;AAAA;AAAA,4BAGS,QAAQ,KAAK,iBAAiB;AAAA;AAAA;AAAA;AAAA,sBAIpC,OAAO,QAAQ,QAAQ,EAAE,IAAI,CAAC,CAAC,SAAS,OAAO,MAAqB;AAAA;AAAA;AAAA,sCAGpD,OAAO;AAAA,sDACS,QAAQ,KAAK,MAAM;AAAA;AAAA;AAAA,kCAGvC,QAAQ,MAAM;AAAA;AAAA;AAAA,kCAGd,QAAQ,KAAK,MAAM,GAAG,CAAC,EAAE,IAAI,CAAC,QAAgB;AAAA,4DACpB,GAAG;AAAA,iCAC9B,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA,8BAEb,QAAQ,KAAK,SAAS,IAAI;AAAA,kEACU,QAAQ,KAAK,SAAS,CAAC;AAAA,gCACzD,EAAE;AAAA,6EAC2C,OAAO,MAAM,KAAK,UAAU,QAAQ,IAAI,EAAE,QAAQ,MAAM,QAAQ,CAAC;AAAA,yDAC5F,QAAQ,KAAK,MAAM;AAAA;AAAA;AAAA,qBAGhD,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA;AAAA;AAAA,IAIvB,CAAC,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EAuBjB;AACF;;;AV3TO,IAAM,iBAAiB,IAAI,QAAQ,SAAS,EAChD,YAAY,uCAAuC,EACnD,SAAS,SAAS,kBAAkB,EACpC,OAAO,uBAAuB,yBAAyB,EACvD,OAAO,yBAAyB,iCAAiC,EACjE,OAAO,wBAAwB,2BAA2B,EAC1D,OAAO,OAAO,KAAa,YAAmE;AAC7F,QAAM,YAAY,oBAAI,KAAK;AAG3B,QAAM,SAAS,aAAa,KAAK,QAAQ,MAAM;AAC/C,QAAM,SAAS,QAAQ,UAAU,OAAO,UAAU;AAClD,QAAM,eAAe,QAAQ,UAAU,OAAO,gBAAgB;AAG9D,QAAM,YAAY,IAAI,iBAAiB;AACvC,QAAM,UAAU,IAAI,eAAe,MAAM;AAEzC,QAAM,gBAA8B,CAAC;AACrC,MAAI,YAAY;AAChB,MAAI,aAAa;AAEjB,UAAQ,IAAIC,OAAM,KAAK;AAAA,0CAA8B,GAAG,KAAK,CAAC;AAE9D,MAAI;AAEF,qBAAiB,UAAU,UAAU,QAAQ,GAAG,GAAG;AACjD;AACA,YAAM,QAAQ,QAAQ,MAAM,MAAM;AAElC,UAAI,MAAM,SAAS,GAAG;AACpB,eAAO,QAAQ;AACf,sBAA
c,KAAK,MAAM;AACzB,sBAAc,MAAM;AAAA,MACtB;AAEA,UAAI,YAAY,QAAQ,GAAG;AACzB,gBAAQ,OAAO,MAAMA,OAAM,KAAK,eAAe,SAAS,UAAU,CAAC;AAAA,MACrE;AAAA,IACF;AACA,YAAQ,OAAO,MAAM,IAAI;AAEzB,UAAM,UAAU,oBAAI,KAAK;AACzB,UAAM,aAAyB;AAAA,MAC7B,SAAS;AAAA,MACT,oBAAoB,UAAU,sBAAsB;AAAA,MACpD;AAAA,MACA;AAAA,MACA;AAAA,MACA;AAAA,MACA;AAAA,IACF;AAGA,UAAM,YAAwB,CAAC,IAAI,gBAAgB,CAAC;AAEpD,UAAMC,IAAG,MAAM,QAAQ,EAAE,WAAW,KAAK,CAAC;AAE1C,QAAI,iBAAiB,UAAU,iBAAiB,OAAO;AACrD,YAAM,WAAWC,MAAK,KAAK,QAAQ,wBAAwB;AAC3D,gBAAU,KAAK,IAAI,aAAa,QAAQ,CAAC;AAAA,IAC3C;AACA,QAAI,iBAAiB,UAAU,iBAAiB,OAAO;AACrD,YAAM,WAAWA,MAAK,KAAK,QAAQ,wBAAwB;AAC3D,gBAAU,KAAK,IAAI,aAAa,QAAQ,CAAC;AAAA,IAC3C;AAEA,eAAW,YAAY,WAAW;AAChC,YAAM,SAAS,SAAS,UAAU;AAAA,IACpC;AAGA,QAAI,aAAa,GAAG;AAClB,cAAQ,KAAK,CAAC;AAAA,IAChB,OAAO;AACL,cAAQ,KAAK,CAAC;AAAA,IAChB;AAAA,EAEF,SAAS,OAAO;AACd,YAAQ,MAAMF,OAAM,IAAI,oBAAoB,GAAG,KAAK;AACpD,YAAQ,KAAK,CAAC;AAAA,EAChB;AACF,CAAC;;;AW/FH,SAAS,WAAAG,gBAAe;AACxB,OAAOC,SAAQ;AACf,OAAOC,WAAU;AACjB,OAAOC,YAAW;AAElB,IAAM,iBAAiB;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAuChB,IAAM,cAAc,IAAIH,SAAQ,MAAM,EAC1C,YAAY,yDAAyD,EACrE,OAAO,MAAM;AACZ,QAAM,aAAaE,MAAK,KAAK,QAAQ,IAAI,GAAG,iBAAiB;AAE7D,MAAID,IAAG,WAAW,UAAU,GAAG;AAC7B,YAAQ,MAAME,OAAM,IAAI,UAAU,UAAU,kBAAkB,CAAC;AAC/D,YAAQ,KAAK,CAAC;AAAA,EAChB;AAEA,MAAI;AACF,IAAAF,IAAG,cAAc,YAAY,gBAAgB,MAAM;AACnD,YAAQ,IAAIE,OAAM,MAAM,wBAAwB,UAAU,EAAE,CAAC;AAAA,EAC/D,SAAS,OAAO;AACd,YAAQ,MAAMA,OAAM,IAAI,sCAAsC,GAAG,KAAK;AACtE,YAAQ,KAAK,CAAC;AAAA,EAChB;AACF,CAAC;;;AZxDH,IAAM,UAAU,IAAIC,SAAQ;AAE5B,QACG,KAAK,YAAY,EACjB,QAAQ,OAAO,EACf,YAAY,+BAA+B;AAE9C,QAAQ,WAAW,cAAc;AACjC,QAAQ,WAAW,WAAW;AAG9B,QAAQ,GAAG,sBAAsB,CAAC,QAAQ,YAAY;AACpD,UAAQ,MAAM,2BAA2B,SAAS,WAAW,MAAM;AACnE,UAAQ,KAAK,CAAC;AAChB,CAAC;AAGD,QAAQ,GAAG,UAAU,MAAM;AACzB,UAAQ,IAAI,+BAA+B;AAC3C,UAAQ,KAAK,CAAC;AAChB,CAAC;AAED,QAAQ,GAAG,WAAW,MAAM;AAC1B,UAAQ,IAAI,+BAA+B;AAC3C,UAAQ,KAAK,CAAC;AAChB,CAAC;AAED,Q
AAQ,MAAM;","names":["Command","chalk","path","fs","path","XMLParser","fetch","chalk","fs","fs","chalk","fs","path","Command","fs","path","chalk","Command"]}
1
+ {"version":3,"sources":["../src/index.ts","../src/commands/analyze.ts","../src/config/loader.ts","../src/config/schema.ts","../src/config/defaults.ts","../src/core/discovery.ts","../src/core/parser.ts","../src/core/extractor.ts","../src/core/matcher.ts","../src/reporters/console-reporter.ts","../src/reporters/json-reporter.ts","../src/reporters/html-reporter.ts","../src/commands/init.ts"],"sourcesContent":["#!/usr/bin/env node\r\nimport { Command } from 'commander';\r\nimport { analyzeCommand } from '@/commands/analyze';\r\nimport { initCommand } from '@/commands/init';\r\n\r\nconst program = new Command();\r\n\r\nprogram\r\n .name('sitemap-qa')\r\n .version('1.0.0')\r\n .description('sitemap analysis for QA teams');\r\n\r\nprogram.addCommand(analyzeCommand);\r\nprogram.addCommand(initCommand);\r\n\r\n// Global error handler\r\nprocess.on('unhandledRejection', (reason, promise) => {\r\n console.error('Unhandled Rejection at:', promise, 'reason:', reason);\r\n process.exit(1);\r\n});\r\n\r\n// Graceful shutdown handlers\r\nprocess.on('SIGINT', () => {\r\n console.log('\\nGracefully shutting down...');\r\n process.exit(0);\r\n});\r\n\r\nprocess.on('SIGTERM', () => {\r\n console.log('\\nGracefully shutting down...');\r\n process.exit(0);\r\n});\r\n\r\nprogram.parse();\r\n","import { Command } from 'commander';\r\nimport chalk from 'chalk';\r\nimport path from 'node:path';\r\nimport fs from 'node:fs/promises';\r\nimport { ConfigLoader } from '../config/loader';\r\nimport { ExtractorService } from '../core/extractor';\r\nimport { MatcherService } from '../core/matcher';\r\nimport { ConsoleReporter } from '../reporters/console-reporter';\r\nimport { JsonReporter } from '../reporters/json-reporter';\r\nimport { HtmlReporter } from '../reporters/html-reporter';\r\nimport { ReportData, Reporter } from '../reporters/base';\r\nimport { SitemapUrl } from '../types/sitemap';\r\n\r\nexport const analyzeCommand = new Command('analyze')\r\n .description('Analyze a sitemap for 
potential risks')\r\n .argument('<url>', 'Root sitemap URL')\r\n .option('-c, --config <path>', 'Path to sitemap-qa.yaml')\r\n .option('-o, --output <format>', 'Output format (json, html, all)')\r\n .option('-d, --out-dir <path>', 'Directory to save reports')\r\n .action(async (url: string, options: { config?: string; output?: string; outDir?: string }) => {\r\n const startTime = new Date();\r\n \r\n // 1. Load Config\r\n const config = ConfigLoader.load(options.config);\r\n const outDir = options.outDir || config.outDir || '.';\r\n const outputFormat = options.output || config.outputFormat || 'all';\r\n \r\n // 2. Initialize Services\r\n const extractor = new ExtractorService();\r\n const matcher = new MatcherService(config, url);\r\n \r\n const urlsWithRisks: SitemapUrl[] = [];\r\n const ignoredUrls: SitemapUrl[] = [];\r\n let totalUrls = 0;\r\n let totalRisks = 0;\r\n\r\n console.log(chalk.blue(`\\nšŸš€ Starting analysis of ${url}...`));\r\n\r\n try {\r\n // 3. Pipeline: Extract -> Match\r\n for await (const urlObj of extractor.extract(url)) {\r\n totalUrls++;\r\n const risks = matcher.match(urlObj);\r\n \r\n if (risks.length > 0) {\r\n urlObj.risks = risks;\r\n urlsWithRisks.push(urlObj);\r\n totalRisks += risks.length;\r\n } else if (urlObj.ignored) {\r\n ignoredUrls.push(urlObj);\r\n }\r\n\r\n if (totalUrls % 100 === 0) {\r\n process.stdout.write(chalk.gray(`\\rProcessed ${totalUrls} URLs...`));\r\n }\r\n }\r\n process.stdout.write('\\n');\r\n\r\n const endTime = new Date();\r\n const reportData: ReportData = {\r\n rootUrl: url,\r\n discoveredSitemaps: extractor.getDiscoveredSitemaps(),\r\n totalUrls,\r\n totalRisks,\r\n urlsWithRisks,\r\n ignoredUrls,\r\n startTime,\r\n endTime,\r\n };\r\n\r\n // 4. 
Reporting\r\n const reporters: Reporter[] = [new ConsoleReporter()];\r\n \r\n await fs.mkdir(outDir, { recursive: true });\r\n\r\n if (outputFormat === 'json' || outputFormat === 'all') {\r\n const jsonPath = path.join(outDir, 'sitemap-qa-report.json');\r\n reporters.push(new JsonReporter(jsonPath));\r\n }\r\n if (outputFormat === 'html' || outputFormat === 'all') {\r\n const htmlPath = path.join(outDir, 'sitemap-qa-report.html');\r\n reporters.push(new HtmlReporter(htmlPath));\r\n }\r\n\r\n for (const reporter of reporters) {\r\n await reporter.generate(reportData);\r\n }\r\n\r\n // 5. Exit Code\r\n if (totalRisks > 0) {\r\n process.exit(1);\r\n } else {\r\n process.exit(0);\r\n }\r\n\r\n } catch (error) {\r\n console.error(chalk.red('\\nAnalysis failed:'), error);\r\n process.exit(1);\r\n }\r\n });\r\n","import fs from 'node:fs';\r\nimport path from 'node:path';\r\nimport yaml from 'js-yaml';\r\nimport { ConfigSchema, type Config } from './schema';\r\nimport { DEFAULT_POLICIES } from './defaults';\r\nimport chalk from 'chalk';\r\n\r\nexport class ConfigLoader {\r\n private static readonly DEFAULT_CONFIG_PATH = 'sitemap-qa.yaml';\r\n\r\n static load(configPath?: string): Config {\r\n const targetPath = configPath || path.join(process.cwd(), this.DEFAULT_CONFIG_PATH);\r\n let userConfig: Config = { policies: [] };\r\n\r\n // Load YAML config\r\n if (fs.existsSync(targetPath)) {\r\n try {\r\n const fileContent = fs.readFileSync(targetPath, 'utf8');\r\n const parsedYaml = yaml.load(fileContent);\r\n \r\n const result = ConfigSchema.safeParse(parsedYaml);\r\n \r\n if (!result.success) {\r\n console.error(chalk.red('Configuration Validation Error:'));\r\n result.error.issues.forEach((issue) => {\r\n console.error(chalk.yellow(` - ${issue.path.join('.')}: ${issue.message}`));\r\n });\r\n process.exit(2);\r\n }\r\n\r\n userConfig = result.data;\r\n } catch (error) {\r\n console.error(chalk.red('Failed to load configuration:'), error);\r\n process.exit(2);\r\n }\r\n } 
else if (configPath) {\r\n console.error(chalk.red(`Error: Configuration file not found at ${targetPath}`));\r\n process.exit(2);\r\n }\r\n\r\n return this.mergeConfigs(DEFAULT_POLICIES, userConfig);\r\n }\r\n\r\n private static mergeConfigs(defaults: Config, user: Config): Config {\r\n const mergedPolicies = [...defaults.policies];\r\n\r\n user.policies.forEach((userPolicy) => {\r\n const index = mergedPolicies.findIndex(p => p.category === userPolicy.category);\r\n if (index !== -1) {\r\n // Replace default category with user category (precedence)\r\n mergedPolicies[index] = userPolicy;\r\n } else {\r\n // Add new user category\r\n mergedPolicies.push(userPolicy);\r\n }\r\n });\r\n\r\n // Start from defaults, then apply merged policies and any user-specified top-level options\r\n const merged: Config = {\r\n ...defaults,\r\n acceptable_patterns: [...(defaults.acceptable_patterns || []), ...(user.acceptable_patterns || [])],\r\n policies: mergedPolicies,\r\n };\r\n\r\n if (user.outDir !== undefined) {\r\n merged.outDir = user.outDir;\r\n }\r\n\r\n if (user.outputFormat !== undefined) {\r\n merged.outputFormat = user.outputFormat;\r\n }\r\n\r\n if (user.enforceDomainConsistency !== undefined) {\r\n merged.enforceDomainConsistency = user.enforceDomainConsistency;\r\n }\r\n\r\n return merged;\r\n }\r\n}\r\n","import { z } from 'zod';\r\n\r\nexport const PatternTypeSchema = z.enum(['literal', 'glob', 'regex']);\r\n\r\nexport const PatternSchema = z.object({\r\n type: PatternTypeSchema,\r\n value: z.string().min(1, \"Pattern value cannot be empty\"),\r\n reason: z.string().min(1, \"Reason is mandatory for each pattern\"),\r\n});\r\n\r\nexport const PolicySchema = z.object({\r\n category: z.string().min(1, \"Category name is mandatory\"),\r\n patterns: z.array(PatternSchema).min(1, \"At least one pattern is required per category\"),\r\n});\r\n\r\nexport const ConfigSchema = z.object({\r\n acceptable_patterns: z.array(PatternSchema).default([]),\r\n policies: 
z.array(PolicySchema).default([]),\r\n outDir: z.string().optional(),\r\n outputFormat: z.enum(['json', 'html', 'all']).default('all'),\r\n enforceDomainConsistency: z.boolean().default(true),\r\n});\r\n\r\nexport type Config = z.infer<typeof ConfigSchema>;\r\nexport type Policy = z.infer<typeof PolicySchema>;\r\nexport type Pattern = z.infer<typeof PatternSchema>;\r\nexport type PatternType = z.infer<typeof PatternTypeSchema>;\r\n","import { type Config } from './schema';\r\n\r\nexport const DEFAULT_POLICIES: Config = {\r\n acceptable_patterns: [],\r\n outputFormat: \"all\",\r\n enforceDomainConsistency: true,\r\n policies: [\r\n {\r\n category: \"Security & Admin\",\r\n patterns: [\r\n {\r\n type: \"glob\",\r\n value: \"**/admin/**\",\r\n reason: \"Administrative interfaces should not be publicly indexed.\"\r\n },\r\n {\r\n type: \"glob\",\r\n value: \"**/.env*\",\r\n reason: \"Environment files contain sensitive secrets.\"\r\n },\r\n {\r\n type: \"literal\",\r\n value: \"/wp-admin\",\r\n reason: \"WordPress admin paths are common attack vectors.\"\r\n }\r\n ]\r\n },\r\n {\r\n category: \"Environment Leakage\",\r\n patterns: [\r\n {\r\n type: \"glob\",\r\n value: \"**/staging.**\",\r\n reason: \"Staging environments should be restricted.\"\r\n },\r\n {\r\n type: \"glob\",\r\n value: \"**/dev.**\",\r\n reason: \"Development subdomains detected in production sitemap.\"\r\n }\r\n ]\r\n },\r\n {\r\n category: \"Sensitive Files\",\r\n patterns: [\r\n {\r\n type: \"glob\",\r\n value: \"**/*.{sql,bak,zip,tar.gz}\",\r\n reason: \"Archive or database backup files exposed.\"\r\n }\r\n ]\r\n }\r\n ]\r\n};\r\n","import { fetch } from 'undici';\r\nimport { XMLParser } from 'fast-xml-parser';\r\n\r\nexport interface DiscoveredSitemap {\r\n url: string;\r\n xmlData: string;\r\n}\r\n\r\nexport class DiscoveryService {\r\n private readonly parser: XMLParser;\r\n private readonly visited = new Set<string>();\r\n private readonly STANDARD_PATHS = [\r\n '/sitemap.xml',\r\n 
'/sitemap_index.xml',\r\n '/sitemap-index.xml',\r\n '/sitemap.php',\r\n '/sitemap.xml.gz'\r\n ];\r\n\r\n constructor() {\r\n this.parser = new XMLParser({\r\n ignoreAttributes: false,\r\n attributeNamePrefix: \"@_\",\r\n });\r\n }\r\n\r\n /**\r\n * Attempts to find sitemaps for a given base website URL.\r\n */\r\n async findSitemaps(baseUrl: string): Promise<string[]> {\r\n const sitemaps = new Set<string>();\r\n const url = new URL(baseUrl);\r\n const origin = url.origin;\r\n\r\n // 1. Try robots.txt\r\n try {\r\n const robotsUrl = `${origin}/robots.txt`;\r\n const response = await fetch(robotsUrl);\r\n if (response.status === 200) {\r\n const text = await response.text();\r\n const matches = text.matchAll(/^Sitemap:\\s*(.+)$/gim);\r\n for (const match of matches) {\r\n if (match[1]) sitemaps.add(match[1].trim());\r\n }\r\n }\r\n } catch (e) {\r\n // Ignore robots.txt errors\r\n }\r\n\r\n // 2. Try standard paths if none found in robots.txt\r\n if (sitemaps.size === 0) {\r\n for (const path of this.STANDARD_PATHS) {\r\n try {\r\n const sitemapUrl = `${origin}${path}`;\r\n const response = await fetch(sitemapUrl, { method: 'HEAD' });\r\n if (response.status === 200) {\r\n sitemaps.add(sitemapUrl);\r\n }\r\n } catch (e) {\r\n // Ignore path errors\r\n }\r\n }\r\n }\r\n\r\n return Array.from(sitemaps);\r\n }\r\n\r\n /**\r\n * Recursively discovers all leaf sitemaps from a root URL.\r\n * Returns both the sitemap URL and its XML data to avoid duplicate fetches.\r\n */\r\n async *discover(rootUrl: string): AsyncGenerator<DiscoveredSitemap> {\r\n const queue: string[] = [rootUrl];\r\n\r\n while (queue.length > 0) {\r\n const currentUrl = queue.shift()!;\r\n if (this.visited.has(currentUrl)) continue;\r\n this.visited.add(currentUrl);\r\n\r\n try {\r\n const response = await fetch(currentUrl);\r\n if (response.status !== 200) continue;\r\n \r\n const xmlData = await response.text();\r\n const jsonObj = this.parser.parse(xmlData);\r\n\r\n if (jsonObj.sitemapindex) {\r\n 
const sitemaps = Array.isArray(jsonObj.sitemapindex.sitemap)\r\n ? jsonObj.sitemapindex.sitemap\r\n : [jsonObj.sitemapindex.sitemap];\r\n\r\n for (const sitemap of sitemaps) {\r\n if (sitemap?.loc) {\r\n queue.push(sitemap.loc);\r\n }\r\n }\r\n } else if (jsonObj.urlset) {\r\n // This is a leaf sitemap - yield both URL and XML data\r\n yield { url: currentUrl, xmlData };\r\n }\r\n } catch (error) {\r\n console.error(`Failed to fetch or parse sitemap at ${currentUrl}:`, error);\r\n }\r\n }\r\n }\r\n}\r\n","import { XMLParser } from 'fast-xml-parser';\r\nimport { fetch } from 'undici';\r\nimport { SitemapUrl } from '../types/sitemap';\r\n\r\nexport class SitemapParser {\r\n private readonly parser: XMLParser;\r\n\r\n constructor() {\r\n this.parser = new XMLParser({\r\n ignoreAttributes: false,\r\n attributeNamePrefix: \"@_\",\r\n });\r\n }\r\n\r\n /**\r\n * Parses a leaf sitemap and yields SitemapUrl objects.\r\n * Can accept either a URL to fetch or pre-fetched XML data with the source URL.\r\n * Note: For true streaming of massive files, we'd use a SAX-like approach.\r\n * fast-xml-parser's parse() is fast but loads the whole string.\r\n * Given the 50k URL requirement, we'll use a more memory-efficient approach if needed,\r\n * but let's start with a clean AsyncGenerator interface.\r\n */\r\n async *parse(sitemapUrlOrData: string | { url: string; xmlData: string }): AsyncGenerator<SitemapUrl> {\r\n let sitemapUrl: string = typeof sitemapUrlOrData === 'string' ? 
sitemapUrlOrData : sitemapUrlOrData.url;\r\n try {\r\n let xmlData: string;\r\n\r\n if (typeof sitemapUrlOrData === 'string') {\r\n // Legacy behavior: fetch the sitemap\r\n const response = await fetch(sitemapUrl);\r\n xmlData = await response.text();\r\n } else {\r\n // New behavior: use pre-fetched data\r\n xmlData = sitemapUrlOrData.xmlData;\r\n }\r\n\r\n const jsonObj = this.parser.parse(xmlData);\r\n\r\n if (jsonObj.urlset && jsonObj.urlset.url) {\r\n const urls = Array.isArray(jsonObj.urlset.url)\r\n ? jsonObj.urlset.url\r\n : [jsonObj.urlset.url];\r\n\r\n for (const url of urls) {\r\n if (url.loc) {\r\n yield {\r\n loc: url.loc,\r\n source: sitemapUrl,\r\n lastmod: url.lastmod,\r\n changefreq: url.changefreq,\r\n priority: url.priority,\r\n risks: [],\r\n };\r\n }\r\n }\r\n }\r\n } catch (error) {\r\n console.error(`Failed to parse sitemap at ${sitemapUrl}:`, error);\r\n }\r\n }\r\n}\r\n","import { DiscoveryService } from './discovery';\r\nimport { SitemapParser } from './parser';\r\nimport { SitemapUrl } from '../types/sitemap';\r\n\r\nexport class ExtractorService {\r\n private readonly discovery: DiscoveryService;\r\n private readonly parser: SitemapParser;\r\n private readonly seenUrls = new Set<string>();\r\n private readonly discoveredSitemaps = new Set<string>();\r\n\r\n constructor() {\r\n this.discovery = new DiscoveryService();\r\n this.parser = new SitemapParser();\r\n }\r\n\r\n /**\r\n * Returns the list of sitemaps discovered during the extraction process.\r\n */\r\n getDiscoveredSitemaps(): string[] {\r\n return Array.from(this.discoveredSitemaps);\r\n }\r\n\r\n /**\r\n * Normalizes a URL by removing trailing slashes and converting to lowercase.\r\n */\r\n private normalizeUrl(url: string): string {\r\n try {\r\n const parsed = new URL(url);\r\n let normalized = parsed.origin + parsed.pathname.replace(/\\/$/, '');\r\n if (parsed.search) normalized += parsed.search;\r\n return normalized.toLowerCase();\r\n } catch {\r\n return 
url.toLowerCase().replace(/\\/$/, '');\r\n }\r\n }\r\n\r\n /**\r\n * Extracts all unique URLs from a root sitemap URL or website base URL.\r\n */\r\n async *extract(inputUrl: string): AsyncGenerator<SitemapUrl> {\r\n let startUrls = [inputUrl];\r\n\r\n // If the URL doesn't end in .xml or .gz, it might be a website root\r\n if (!inputUrl.endsWith('.xml') && !inputUrl.endsWith('.gz')) {\r\n const discovered = await this.discovery.findSitemaps(inputUrl);\r\n if (discovered.length > 0) {\r\n console.log(`āœ… Discovered ${discovered.length} sitemap(s): ${discovered.join(', ')}`);\r\n startUrls = discovered;\r\n } else {\r\n console.log(`āš ļø No sitemaps discovered via robots.txt or standard paths. Proceeding with input URL.`);\r\n }\r\n }\r\n\r\n for (const startUrl of startUrls) {\r\n for await (const discovered of this.discovery.discover(startUrl)) {\r\n this.discoveredSitemaps.add(discovered.url);\r\n for await (const urlObj of this.parser.parse(discovered)) {\r\n const normalized = this.normalizeUrl(urlObj.loc);\r\n if (!this.seenUrls.has(normalized)) {\r\n this.seenUrls.add(normalized);\r\n yield urlObj;\r\n }\r\n }\r\n }\r\n }\r\n }\r\n}\r\n","import micromatch from 'micromatch';\r\nimport { type Config, type Pattern } from '../config/schema';\r\nimport { type SitemapUrl, type Risk } from '../types/sitemap';\r\n\r\nexport class MatcherService {\r\n private readonly config: Config;\r\n private readonly rootDomain?: string;\r\n\r\n constructor(config: Config, rootUrl?: string) {\r\n this.config = config;\r\n if (rootUrl) {\r\n try {\r\n this.rootDomain = new URL(rootUrl).hostname.replace(/^www\\./, '');\r\n } catch {\r\n // Invalid URL, ignore\r\n }\r\n }\r\n }\r\n\r\n /**\r\n * Matches a URL against all policies and returns detected risks.\r\n */\r\n match(urlObj: SitemapUrl): Risk[] {\r\n const risks: Risk[] = [];\r\n\r\n // 1. 
Domain Consistency Check (Highest Priority)\r\n // This check always runs and its risks are never suppressed by acceptable patterns.\r\n if (this.config.enforceDomainConsistency && this.rootDomain) {\r\n try {\r\n const currentDomain = new URL(urlObj.loc).hostname.replace(/^www\\./, '');\r\n if (currentDomain !== this.rootDomain) {\r\n risks.push({\r\n category: 'Domain Consistency',\r\n pattern: this.rootDomain,\r\n type: 'literal',\r\n reason: `URL domain mismatch: expected ${this.rootDomain} (or www.${this.rootDomain}), but found ${currentDomain}.`,\r\n });\r\n }\r\n } catch {\r\n // Invalid URL in sitemap\r\n }\r\n }\r\n\r\n // 2. Check acceptable patterns (Allowlist)\r\n // If a URL matches an acceptable pattern, it is marked as ignored.\r\n // We return early, but we MUST still return any domain consistency risks.\r\n for (const pattern of this.config.acceptable_patterns) {\r\n if (this.isMatch(urlObj.loc, pattern)) {\r\n urlObj.ignored = true;\r\n urlObj.ignoredBy = pattern.reason;\r\n return risks; // Return any existing risks (e.g. Domain Consistency)\r\n }\r\n }\r\n\r\n // 3. 
Policy Checks\r\n for (const policy of this.config.policies) {\r\n for (const pattern of policy.patterns) {\r\n if (this.isMatch(urlObj.loc, pattern)) {\r\n risks.push({\r\n category: policy.category,\r\n pattern: pattern.value,\r\n type: pattern.type,\r\n reason: pattern.reason,\r\n });\r\n }\r\n }\r\n }\r\n\r\n return risks;\r\n }\r\n\r\n private isMatch(url: string, pattern: Pattern): boolean {\r\n switch (pattern.type) {\r\n case 'literal':\r\n return url.includes(pattern.value);\r\n case 'glob':\r\n return micromatch.isMatch(url, pattern.value, { contains: true });\r\n case 'regex':\r\n try {\r\n const regex = new RegExp(pattern.value, 'i');\r\n return regex.test(url);\r\n } catch {\r\n return false;\r\n }\r\n default:\r\n return false;\r\n }\r\n }\r\n}\r\n","import chalk from 'chalk';\r\nimport { Reporter, ReportData } from './base';\r\n\r\nexport class ConsoleReporter implements Reporter {\r\n async generate(data: ReportData): Promise<void> {\r\n console.log('\\n' + chalk.bold.blue('=== sitemap-qa Analysis Summary ==='));\r\n console.log(`Total URLs Scanned: ${data.totalUrls}`);\r\n console.log(`Total Risks Found: ${data.totalRisks > 0 ? chalk.red(data.totalRisks) : chalk.green(0)}`);\r\n console.log(`URLs with Risks: ${data.urlsWithRisks.length}`);\r\n console.log(`URLs Ignored: ${data.ignoredUrls.length > 0 ? chalk.yellow(data.ignoredUrls.length) : 0}`);\r\n console.log(`Duration: ${((data.endTime.getTime() - data.startTime.getTime()) / 1000).toFixed(2)}s`);\r\n\r\n if (data.urlsWithRisks.length > 0) {\r\n console.log('\\n' + chalk.bold.yellow('Top Findings:'));\r\n data.urlsWithRisks.slice(0, 10).forEach((url) => {\r\n console.log(`\\n${chalk.cyan(url.loc)}`);\r\n url.risks.forEach((risk) => {\r\n console.log(` - [${chalk.red(risk.category)}] ${risk.reason} (${chalk.gray(risk.pattern)})`);\r\n });\r\n });\r\n\r\n if (data.urlsWithRisks.length > 10) {\r\n console.log(`\\n... and ${data.urlsWithRisks.length - 10} more. 
See JSON/HTML report for full details.`);\r\n }\r\n }\r\n\r\n console.log('\\n' + chalk.bold.blue('==================================='));\r\n }\r\n}\r\n","import fs from 'node:fs/promises';\r\nimport { Reporter, ReportData } from './base';\r\n\r\nexport class JsonReporter implements Reporter {\r\n private readonly outputPath: string;\r\n\r\n constructor(outputPath: string = 'sitemap-qa-report.json') {\r\n this.outputPath = outputPath;\r\n }\r\n\r\n async generate(data: ReportData): Promise<void> {\r\n const report = {\r\n metadata: {\r\n generatedAt: new Date().toISOString(),\r\n durationMs: data.endTime.getTime() - data.startTime.getTime(),\r\n },\r\n summary: {\r\n totalUrls: data.totalUrls,\r\n totalRisks: data.totalRisks,\r\n urlsWithRisksCount: data.urlsWithRisks.length,\r\n ignoredUrlsCount: data.ignoredUrls.length,\r\n },\r\n findings: data.urlsWithRisks,\r\n ignored: data.ignoredUrls,\r\n };\r\n\r\n await fs.writeFile(this.outputPath, JSON.stringify(report, null, 2), 'utf8');\r\n console.log(`JSON report generated at ${this.outputPath}`);\r\n }\r\n}\r\n","import fs from 'node:fs/promises';\r\nimport { Reporter, ReportData } from './base';\r\n\r\nexport class HtmlReporter implements Reporter {\r\n private readonly outputPath: string;\r\n\r\n constructor(outputPath: string = 'sitemap-qa-report.html') {\r\n this.outputPath = outputPath;\r\n }\r\n\r\n async generate(data: ReportData): Promise<void> {\r\n const categories = this.groupRisks(data);\r\n const html = this.generateHtml(data, categories);\r\n\r\n await fs.writeFile(this.outputPath, html, 'utf8');\r\n console.log(`HTML report generated at ${this.outputPath}`);\r\n }\r\n\r\n private groupRisks(data: ReportData) {\r\n const categories: Record<string, Record<string, { reason: string, urls: string[] }>> = {};\r\n\r\n for (const urlObj of data.urlsWithRisks) {\r\n for (const risk of urlObj.risks) {\r\n if (!categories[risk.category]) {\r\n categories[risk.category] = {};\r\n }\r\n if 
(!categories[risk.category][risk.pattern]) {\r\n categories[risk.category][risk.pattern] = {\r\n reason: risk.reason,\r\n urls: []\r\n };\r\n }\r\n categories[risk.category][risk.pattern].urls.push(urlObj.loc);\r\n }\r\n }\r\n\r\n return categories;\r\n }\r\n\r\n private generateHtml(data: ReportData, categories: any): string {\r\n const duration = ((data.endTime.getTime() - data.startTime.getTime()) / 1000).toFixed(1);\r\n const timestamp = data.endTime.toLocaleString();\r\n const esc = this.escapeHtml.bind(this);\r\n\r\n return `\r\n<!DOCTYPE html>\r\n<html lang=\"en\">\r\n<head>\r\n <meta charset=\"UTF-8\">\r\n <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\r\n <title>Sitemap Analysis - ${esc(data.rootUrl)}</title>\r\n <style>\r\n :root {\r\n --bg-dark: #0f172a;\r\n --bg-light: #f8fafc;\r\n --text-main: #1e293b;\r\n --text-muted: #64748b;\r\n --primary: #3b82f6;\r\n --danger: #ef4444;\r\n --warning: #f59e0b;\r\n --border: #e2e8f0;\r\n }\r\n body { \r\n font-family: -apple-system, BlinkMacSystemFont, \"Segoe UI\", Roboto, Helvetica, Arial, sans-serif;\r\n line-height: 1.5;\r\n color: var(--text-main);\r\n background-color: #fff;\r\n margin: 0;\r\n padding: 0;\r\n }\r\n header {\r\n background-color: var(--bg-dark);\r\n color: white;\r\n padding: 40px 20px;\r\n text-align: left;\r\n }\r\n .container {\r\n max-width: 1200px;\r\n margin: 0 auto;\r\n padding: 0 20px;\r\n }\r\n header h1 { margin: 0; font-size: 24px; }\r\n header .meta { margin-top: 10px; color: #94a3b8; font-size: 14px; }\r\n \r\n .summary-grid {\r\n display: grid;\r\n grid-template-columns: repeat(5, 1fr);\r\n border-bottom: 1px solid var(--border);\r\n margin-bottom: 40px;\r\n }\r\n .summary-card {\r\n padding: 30px 20px;\r\n text-align: center;\r\n border-right: 1px solid var(--border);\r\n }\r\n .summary-card:last-child { border-right: none; }\r\n .summary-card h3 { \r\n margin: 0; \r\n font-size: 12px; \r\n text-transform: uppercase; \r\n color: var(--text-muted);\r\n 
letter-spacing: 0.05em;\r\n }\r\n .summary-card p { \r\n margin: 10px 0 0; \r\n font-size: 32px; \r\n font-weight: 700; \r\n color: var(--text-main);\r\n }\r\n .summary-card.highlight p { color: var(--danger); }\r\n\r\n details {\r\n margin-bottom: 20px;\r\n border: 1px solid var(--border);\r\n border-radius: 8px;\r\n overflow: hidden;\r\n }\r\n summary {\r\n padding: 15px 20px;\r\n background-color: #fff;\r\n cursor: pointer;\r\n font-weight: 600;\r\n display: flex;\r\n justify-content: space-between;\r\n align-items: center;\r\n list-style: none;\r\n }\r\n summary::-webkit-details-marker { display: none; }\r\n summary::after {\r\n content: 'ā–¶';\r\n font-size: 12px;\r\n color: var(--text-muted);\r\n transition: transform 0.2s;\r\n }\r\n details[open] summary::after { transform: rotate(90deg); }\r\n \r\n .category-section {\r\n border: 1px solid var(--warning);\r\n border-radius: 8px;\r\n margin-bottom: 20px;\r\n }\r\n .category-header {\r\n padding: 15px 20px;\r\n background-color: #fffbeb;\r\n color: var(--warning);\r\n font-weight: 600;\r\n cursor: pointer;\r\n display: flex;\r\n justify-content: space-between;\r\n align-items: center;\r\n }\r\n .category-content {\r\n padding: 20px;\r\n background-color: #fff;\r\n }\r\n\r\n .finding-group {\r\n border: 1px solid var(--border);\r\n border-radius: 8px;\r\n padding: 20px;\r\n margin-bottom: 20px;\r\n }\r\n .finding-header {\r\n display: flex;\r\n align-items: center;\r\n gap: 10px;\r\n margin-bottom: 10px;\r\n }\r\n .finding-header h4 { margin: 0; font-size: 16px; }\r\n .badge {\r\n background-color: var(--primary);\r\n color: white;\r\n padding: 2px 8px;\r\n border-radius: 12px;\r\n font-size: 12px;\r\n }\r\n .finding-description {\r\n color: var(--text-muted);\r\n font-size: 14px;\r\n margin-bottom: 20px;\r\n }\r\n \r\n .url-list {\r\n background-color: var(--bg-light);\r\n border-radius: 4px;\r\n padding: 15px;\r\n margin-bottom: 15px;\r\n }\r\n .url-item {\r\n font-family: monospace;\r\n font-size: 13px;\r\n 
padding: 8px 12px;\r\n background: white;\r\n border: 1px solid var(--border);\r\n border-radius: 4px;\r\n margin-bottom: 8px;\r\n white-space: nowrap;\r\n overflow: hidden;\r\n text-overflow: ellipsis;\r\n }\r\n .url-item:last-child { margin-bottom: 0; }\r\n \r\n .more-count {\r\n font-size: 12px;\r\n color: var(--text-muted);\r\n font-style: italic;\r\n margin-bottom: 15px;\r\n }\r\n\r\n .btn {\r\n display: inline-flex;\r\n align-items: center;\r\n gap: 8px;\r\n background-color: var(--primary);\r\n color: white;\r\n padding: 8px 16px;\r\n border-radius: 6px;\r\n text-decoration: none;\r\n font-size: 13px;\r\n font-weight: 500;\r\n }\r\n .btn:hover { opacity: 0.9; }\r\n\r\n footer {\r\n text-align: center;\r\n padding: 40px;\r\n color: var(--text-muted);\r\n font-size: 12px;\r\n border-top: 1px solid var(--border);\r\n margin-top: 40px;\r\n }\r\n </style>\r\n</head>\r\n<body>\r\n <header>\r\n <div class=\"container\">\r\n <h1>Sitemap Analysis</h1>\r\n <div class=\"meta\">\r\n <div>${esc(data.rootUrl)}</div>\r\n <div>${esc(timestamp)}</div>\r\n </div>\r\n </div>\r\n </header>\r\n\r\n <div class=\"summary-grid\">\r\n <div class=\"summary-card\">\r\n <h3>Sitemaps</h3>\r\n <p>${data.discoveredSitemaps.length}</p>\r\n </div>\r\n <div class=\"summary-card\">\r\n <h3>URLs Analyzed</h3>\r\n <p>${data.totalUrls.toLocaleString()}</p>\r\n </div>\r\n <div class=\"summary-card highlight\">\r\n <h3>Issues Found</h3>\r\n <p>${data.totalRisks}</p>\r\n </div>\r\n <div class=\"summary-card\">\r\n <h3>URLs Ignored</h3>\r\n <p>${data.ignoredUrls.length}</p>\r\n </div>\r\n <div class=\"summary-card\">\r\n <h3>Scan Time</h3>\r\n <p>${duration}s</p>\r\n </div>\r\n </div>\r\n\r\n <div class=\"container\">\r\n <details>\r\n <summary>Sitemaps Discovered (${data.discoveredSitemaps.length})</summary>\r\n <div style=\"padding: 20px; background: var(--bg-light);\">\r\n ${data.discoveredSitemaps.map(s => `<div class=\"url-item\">${esc(s)}</div>`).join('')}\r\n </div>\r\n </details>\r\n\r\n 
${data.ignoredUrls.length > 0 ? `\r\n <details>\r\n <summary>Ignored URLs (${data.ignoredUrls.length})</summary>\r\n <div style=\"padding: 20px; background: var(--bg-light);\">\r\n ${data.ignoredUrls.map(u => {\r\n const suppressedRisks = u.risks.length > 0 \r\n ? ` <span style=\"color: var(--danger); font-size: 11px; font-weight: bold;\">[Suppressed Risks: ${[...new Set(u.risks.map(r => r.category))].map(esc).join(', ')}]</span>`\r\n : '';\r\n\r\n const ignoredBy = u.ignoredBy ?? 'Unknown';\r\n return `<div class=\"url-item\" title=\"Ignored by: ${esc(ignoredBy)}\">${esc(u.loc)} <span style=\"color: var(--text-muted); font-size: 11px;\">(by ${esc(ignoredBy)})</span>${suppressedRisks}</div>`;\r\n }).join('')}\r\n </div>\r\n </details>\r\n ` : ''}\r\n\r\n ${Object.entries(categories).map(([category, findings]: [string, any]) => {\r\n const totalCategoryUrls = Object.values(findings).reduce((acc: number, f: any) => acc + f.urls.length, 0);\r\n return `\r\n <div class=\"category-section\">\r\n <div class=\"category-header\">\r\n <span>${esc(category)} (${totalCategoryUrls} URLs)</span>\r\n <span>ā–¼</span>\r\n </div>\r\n <div class=\"category-content\">\r\n ${Object.entries(findings).map(([pattern, finding]: [string, any]) => `\r\n <div class=\"finding-group\">\r\n <div class=\"finding-header\">\r\n <h4>${esc(pattern)}</h4>\r\n <span class=\"badge\">${finding.urls.length} URLs</span>\r\n </div>\r\n <div class=\"finding-description\">\r\n ${esc(finding.reason)}\r\n </div>\r\n <div class=\"url-list\">\r\n ${finding.urls.slice(0, 3).map((url: string) => `\r\n <div class=\"url-item\">${esc(url)}</div>\r\n `).join('')}\r\n </div>\r\n ${finding.urls.length > 3 ? `\r\n <div class=\"more-count\">... 
and ${finding.urls.length - 3} more</div>\r\n ` : ''}\r\n <a href=\"#\" class=\"btn\" onclick=\"downloadUrls(${JSON.stringify(pattern).replace(/\"/g, '&quot;')}, ${JSON.stringify(finding.urls).replace(/\"/g, '&quot;')})\">\r\n šŸ“„ Download All ${finding.urls.length} URLs\r\n </a>\r\n </div>\r\n `).join('')}\r\n </div>\r\n </div>\r\n `;\r\n }).join('')}\r\n </div>\r\n\r\n <footer>\r\n Generated by sitemap-qa v1.0.0\r\n </footer>\r\n\r\n <script>\r\n function downloadUrls(name, urls) {\r\n const blob = new Blob([urls.join('\\\\n')], { type: 'text/plain' });\r\n const url = window.URL.createObjectURL(blob);\r\n const a = document.createElement('a');\r\n a.href = url;\r\n a.download = \\`\\${name.replace(/[^a-z0-9]/gi, '_').toLowerCase()}_urls.txt\\`;\r\n document.body.appendChild(a);\r\n a.click();\r\n window.URL.revokeObjectURL(url);\r\n document.body.removeChild(a);\r\n }\r\n </script>\r\n</body>\r\n</html>\r\n`;\r\n }\r\n\r\n private escapeHtml(str: string): string {\r\n return str\r\n .replace(/&/g, '&amp;')\r\n .replace(/</g, '&lt;')\r\n .replace(/>/g, '&gt;')\r\n .replace(/\"/g, '&quot;')\r\n .replace(/'/g, '&#039;');\r\n }\r\n}\r\n","import { Command } from 'commander';\r\nimport fs from 'node:fs';\r\nimport path from 'node:path';\r\nimport chalk from 'chalk';\r\n\r\nconst DEFAULT_CONFIG = `# sitemap-qa configuration\r\n# This file defines the risk categories and patterns to monitor.\r\n\r\n# Tool Settings\r\noutDir: \"./sitemap-qa/report\"\r\noutputFormat: \"all\" # Options: json, html, all\r\nenforceDomainConsistency: true\r\n\r\n# Risk Categories\r\n# Each category contains a list of patterns to match against URLs found in sitemaps.\r\n# Patterns can be:\r\n# - literal: Exact string match\r\n# - glob: Glob pattern (e.g., **/admin/**)\r\n# - regex: Regular expression (e.g., /\\\\/v[0-9]+\\\\//)\r\n\r\n# Acceptable Patterns\r\n# URLs matching these patterns will be ignored and not flagged as risks.\r\nacceptable_patterns:\r\n - type: \"literal\"\r\n value: 
\"/acceptable-path\"\r\n reason: \"Example of an acceptable path that should not be flagged.\"\r\n - type: \"glob\"\r\n value: \"**/public-docs/**\"\r\n reason: \"Public documentation is always acceptable.\"\r\n\r\npolicies:\r\n - category: \"Security & Admin\"\r\n patterns:\r\n - type: \"glob\"\r\n value: \"**/admin/**\"\r\n reason: \"Administrative interfaces should not be publicly indexed.\"\r\n - type: \"glob\"\r\n value: \"**/.env*\"\r\n reason: \"Environment files contain sensitive secrets.\"\r\n - type: \"literal\"\r\n value: \"/wp-admin\"\r\n reason: \"WordPress admin paths are common attack vectors.\"\r\n\r\n - category: \"Environment Leakage\"\r\n patterns:\r\n - type: \"glob\"\r\n value: \"**/staging.**\"\r\n reason: \"Staging environments should be restricted.\"\r\n - type: \"glob\"\r\n value: \"**/dev.**\"\r\n reason: \"Development subdomains detected in production sitemap.\"\r\n\r\n - category: \"Sensitive Files\"\r\n patterns:\r\n - type: \"glob\"\r\n value: \"**/*.{sql,bak,zip,tar.gz}\"\r\n reason: \"Archive or database backup files exposed.\"\r\n`;\r\n\r\nexport const initCommand = new Command('init')\r\n .description('Initialize a default sitemap-qa.yaml configuration file')\r\n .action(() => {\r\n const configPath = path.join(process.cwd(), 'sitemap-qa.yaml');\r\n\r\n if (fs.existsSync(configPath)) {\r\n console.error(chalk.red(`Error: ${configPath} already exists.`));\r\n process.exit(1);\r\n }\r\n\r\n try {\r\n fs.writeFileSync(configPath, DEFAULT_CONFIG, 'utf8');\r\n console.log(chalk.green(`Successfully created ${configPath}`));\r\n } catch (error) {\r\n console.error(chalk.red('Failed to create configuration file:'), error);\r\n process.exit(1);\r\n }\r\n 
});\r\n"],"mappings":";;;AACA,SAAS,WAAAA,gBAAe;;;ACDxB,SAAS,eAAe;AACxB,OAAOC,YAAW;AAClB,OAAOC,WAAU;AACjB,OAAOC,SAAQ;;;ACHf,OAAO,QAAQ;AACf,OAAO,UAAU;AACjB,OAAO,UAAU;;;ACFjB,SAAS,SAAS;AAEX,IAAM,oBAAoB,EAAE,KAAK,CAAC,WAAW,QAAQ,OAAO,CAAC;AAE7D,IAAM,gBAAgB,EAAE,OAAO;AAAA,EACpC,MAAM;AAAA,EACN,OAAO,EAAE,OAAO,EAAE,IAAI,GAAG,+BAA+B;AAAA,EACxD,QAAQ,EAAE,OAAO,EAAE,IAAI,GAAG,sCAAsC;AAClE,CAAC;AAEM,IAAM,eAAe,EAAE,OAAO;AAAA,EACnC,UAAU,EAAE,OAAO,EAAE,IAAI,GAAG,4BAA4B;AAAA,EACxD,UAAU,EAAE,MAAM,aAAa,EAAE,IAAI,GAAG,+CAA+C;AACzF,CAAC;AAEM,IAAM,eAAe,EAAE,OAAO;AAAA,EACnC,qBAAqB,EAAE,MAAM,aAAa,EAAE,QAAQ,CAAC,CAAC;AAAA,EACtD,UAAU,EAAE,MAAM,YAAY,EAAE,QAAQ,CAAC,CAAC;AAAA,EAC1C,QAAQ,EAAE,OAAO,EAAE,SAAS;AAAA,EAC5B,cAAc,EAAE,KAAK,CAAC,QAAQ,QAAQ,KAAK,CAAC,EAAE,QAAQ,KAAK;AAAA,EAC3D,0BAA0B,EAAE,QAAQ,EAAE,QAAQ,IAAI;AACpD,CAAC;;;ACnBM,IAAM,mBAA2B;AAAA,EACtC,qBAAqB,CAAC;AAAA,EACtB,cAAc;AAAA,EACd,0BAA0B;AAAA,EAC1B,UAAU;AAAA,IACR;AAAA,MACE,UAAU;AAAA,MACV,UAAU;AAAA,QACR;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,QACA;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,QACA;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,MACF;AAAA,IACF;AAAA,IACA;AAAA,MACE,UAAU;AAAA,MACV,UAAU;AAAA,QACR;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,QACA;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,MACF;AAAA,IACF;AAAA,IACA;AAAA,MACE,UAAU;AAAA,MACV,UAAU;AAAA,QACR;AAAA,UACE,MAAM;AAAA,UACN,OAAO;AAAA,UACP,QAAQ;AAAA,QACV;AAAA,MACF;AAAA,IACF;AAAA,EACF;AACF;;;AFhDA,OAAO,WAAW;AAEX,IAAM,eAAN,MAAmB;AAAA,EACxB,OAAwB,sBAAsB;AAAA,EAE9C,OAAO,KAAK,YAA6B;AACvC,UAAM,aAAa,cAAc,KAAK,KAAK,QAAQ,IAAI,GAAG,KAAK,mBAAmB;AAClF,QAAI,aAAqB,EAAE,UAAU,CAAC,EAAE;AAGxC,QAAI,GAAG,WAAW,UAAU,GAAG;AAC7B,UAAI;AACF,cAAM,cAAc,GAAG,aAAa,YAAY,MAAM;AACtD,cAAM,aAAa,KAAK,KAAK,WAAW;AAExC,cAAM,SAAS,aAAa,UAAU,UAAU;AAEhD,YAAI,CAAC,OAAO,SAAS;AACnB,kBAAQ,MAAM,MAAM,IAAI,iCAAiC,CAAC;AAC1D,iBAAO,MAAM,OAAO,QAAQ,CAAC,UAAU;AACrC,oBAAQ,MAAM,MAAM,OAAO,OAAO,MAAM,KAAK,KAAK,GAAG,CAAC,KAAK,MAAM,OAAO,EAAE,CAAC;AAAA,UAC7
E,CAAC;AACD,kBAAQ,KAAK,CAAC;AAAA,QAChB;AAEA,qBAAa,OAAO;AAAA,MACtB,SAAS,OAAO;AACd,gBAAQ,MAAM,MAAM,IAAI,+BAA+B,GAAG,KAAK;AAC/D,gBAAQ,KAAK,CAAC;AAAA,MAChB;AAAA,IACF,WAAW,YAAY;AACrB,cAAQ,MAAM,MAAM,IAAI,0CAA0C,UAAU,EAAE,CAAC;AAC/E,cAAQ,KAAK,CAAC;AAAA,IAChB;AAEA,WAAO,KAAK,aAAa,kBAAkB,UAAU;AAAA,EACvD;AAAA,EAEA,OAAe,aAAa,UAAkB,MAAsB;AAClE,UAAM,iBAAiB,CAAC,GAAG,SAAS,QAAQ;AAE5C,SAAK,SAAS,QAAQ,CAAC,eAAe;AACpC,YAAM,QAAQ,eAAe,UAAU,OAAK,EAAE,aAAa,WAAW,QAAQ;AAC9E,UAAI,UAAU,IAAI;AAEhB,uBAAe,KAAK,IAAI;AAAA,MAC1B,OAAO;AAEL,uBAAe,KAAK,UAAU;AAAA,MAChC;AAAA,IACF,CAAC;AAGD,UAAM,SAAiB;AAAA,MACrB,GAAG;AAAA,MACH,qBAAqB,CAAC,GAAI,SAAS,uBAAuB,CAAC,GAAI,GAAI,KAAK,uBAAuB,CAAC,CAAE;AAAA,MAClG,UAAU;AAAA,IACZ;AAEA,QAAI,KAAK,WAAW,QAAW;AAC7B,aAAO,SAAS,KAAK;AAAA,IACvB;AAEA,QAAI,KAAK,iBAAiB,QAAW;AACnC,aAAO,eAAe,KAAK;AAAA,IAC7B;AAEA,QAAI,KAAK,6BAA6B,QAAW;AAC/C,aAAO,2BAA2B,KAAK;AAAA,IACzC;AAEA,WAAO;AAAA,EACT;AACF;;;AG9EA,SAAS,aAAa;AACtB,SAAS,iBAAiB;AAOnB,IAAM,mBAAN,MAAuB;AAAA,EACX;AAAA,EACA,UAAU,oBAAI,IAAY;AAAA,EAC1B,iBAAiB;AAAA,IAChC;AAAA,IACA;AAAA,IACA;AAAA,IACA;AAAA,IACA;AAAA,EACF;AAAA,EAEA,cAAc;AACZ,SAAK,SAAS,IAAI,UAAU;AAAA,MAC1B,kBAAkB;AAAA,MAClB,qBAAqB;AAAA,IACvB,CAAC;AAAA,EACH;AAAA;AAAA;AAAA;AAAA,EAKA,MAAM,aAAa,SAAoC;AACrD,UAAM,WAAW,oBAAI,IAAY;AACjC,UAAM,MAAM,IAAI,IAAI,OAAO;AAC3B,UAAM,SAAS,IAAI;AAGnB,QAAI;AACF,YAAM,YAAY,GAAG,MAAM;AAC3B,YAAM,WAAW,MAAM,MAAM,SAAS;AACtC,UAAI,SAAS,WAAW,KAAK;AAC3B,cAAM,OAAO,MAAM,SAAS,KAAK;AACjC,cAAM,UAAU,KAAK,SAAS,sBAAsB;AACpD,mBAAW,SAAS,SAAS;AAC3B,cAAI,MAAM,CAAC,EAAG,UAAS,IAAI,MAAM,CAAC,EAAE,KAAK,CAAC;AAAA,QAC5C;AAAA,MACF;AAAA,IACF,SAAS,GAAG;AAAA,IAEZ;AAGA,QAAI,SAAS,SAAS,GAAG;AACvB,iBAAWC,SAAQ,KAAK,gBAAgB;AACtC,YAAI;AACF,gBAAM,aAAa,GAAG,MAAM,GAAGA,KAAI;AACnC,gBAAM,WAAW,MAAM,MAAM,YAAY,EAAE,QAAQ,OAAO,CAAC;AAC3D,cAAI,SAAS,WAAW,KAAK;AAC3B,qBAAS,IAAI,UAAU;AAAA,UACzB;AAAA,QACF,SAAS,GAAG;AAAA,QAEZ;AAAA,MACF;AAAA,IACF;AAEA,WAAO,MAAM,KAAK,QAAQ;AAAA,EAC5B;AAAA;AAAA;AAAA;AAAA;AAAA,EAMA,OAAO,SAAS,SAAoD;AAClE,UAAM,QAAkB,CAAC,OAAO;AAEhC,WAAO,MAAM,SAAS,GAAG;AACvB,YAAM,aAAa,MAAM,MAAM
;AAC/B,UAAI,KAAK,QAAQ,IAAI,UAAU,EAAG;AAClC,WAAK,QAAQ,IAAI,UAAU;AAE3B,UAAI;AACF,cAAM,WAAW,MAAM,MAAM,UAAU;AACvC,YAAI,SAAS,WAAW,IAAK;AAE7B,cAAM,UAAU,MAAM,SAAS,KAAK;AACpC,cAAM,UAAU,KAAK,OAAO,MAAM,OAAO;AAEzC,YAAI,QAAQ,cAAc;AACxB,gBAAM,WAAW,MAAM,QAAQ,QAAQ,aAAa,OAAO,IACvD,QAAQ,aAAa,UACrB,CAAC,QAAQ,aAAa,OAAO;AAEjC,qBAAW,WAAW,UAAU;AAC9B,gBAAI,SAAS,KAAK;AAChB,oBAAM,KAAK,QAAQ,GAAG;AAAA,YACxB;AAAA,UACF;AAAA,QACF,WAAW,QAAQ,QAAQ;AAEzB,gBAAM,EAAE,KAAK,YAAY,QAAQ;AAAA,QACnC;AAAA,MACF,SAAS,OAAO;AACd,gBAAQ,MAAM,uCAAuC,UAAU,KAAK,KAAK;AAAA,MAC3E;AAAA,IACF;AAAA,EACF;AACF;;;ACzGA,SAAS,aAAAC,kBAAiB;AAC1B,SAAS,SAAAC,cAAa;AAGf,IAAM,gBAAN,MAAoB;AAAA,EACR;AAAA,EAEjB,cAAc;AACZ,SAAK,SAAS,IAAID,WAAU;AAAA,MAC1B,kBAAkB;AAAA,MAClB,qBAAqB;AAAA,IACvB,CAAC;AAAA,EACH;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EAUA,OAAO,MAAM,kBAAyF;AACpG,QAAI,aAAqB,OAAO,qBAAqB,WAAW,mBAAmB,iBAAiB;AACpG,QAAI;AACF,UAAI;AAEJ,UAAI,OAAO,qBAAqB,UAAU;AAExC,cAAM,WAAW,MAAMC,OAAM,UAAU;AACvC,kBAAU,MAAM,SAAS,KAAK;AAAA,MAChC,OAAO;AAEL,kBAAU,iBAAiB;AAAA,MAC7B;AAEA,YAAM,UAAU,KAAK,OAAO,MAAM,OAAO;AAEzC,UAAI,QAAQ,UAAU,QAAQ,OAAO,KAAK;AACxC,cAAM,OAAO,MAAM,QAAQ,QAAQ,OAAO,GAAG,IACzC,QAAQ,OAAO,MACf,CAAC,QAAQ,OAAO,GAAG;AAEvB,mBAAW,OAAO,MAAM;AACtB,cAAI,IAAI,KAAK;AACX,kBAAM;AAAA,cACJ,KAAK,IAAI;AAAA,cACT,QAAQ;AAAA,cACR,SAAS,IAAI;AAAA,cACb,YAAY,IAAI;AAAA,cAChB,UAAU,IAAI;AAAA,cACd,OAAO,CAAC;AAAA,YACV;AAAA,UACF;AAAA,QACF;AAAA,MACF;AAAA,IACF,SAAS,OAAO;AACd,cAAQ,MAAM,8BAA8B,UAAU,KAAK,KAAK;AAAA,IAClE;AAAA,EACF;AACF;;;ACxDO,IAAM,mBAAN,MAAuB;AAAA,EACX;AAAA,EACA;AAAA,EACA,WAAW,oBAAI,IAAY;AAAA,EAC3B,qBAAqB,oBAAI,IAAY;AAAA,EAEtD,cAAc;AACZ,SAAK,YAAY,IAAI,iBAAiB;AACtC,SAAK,SAAS,IAAI,cAAc;AAAA,EAClC;AAAA;AAAA;AAAA;AAAA,EAKA,wBAAkC;AAChC,WAAO,MAAM,KAAK,KAAK,kBAAkB;AAAA,EAC3C;AAAA;AAAA;AAAA;AAAA,EAKQ,aAAa,KAAqB;AACxC,QAAI;AACF,YAAM,SAAS,IAAI,IAAI,GAAG;AAC1B,UAAI,aAAa,OAAO,SAAS,OAAO,SAAS,QAAQ,OAAO,EAAE;AAClE,UAAI,OAAO,OAAQ,eAAc,OAAO;AACxC,aAAO,WAAW,YAAY;AAAA,IAChC,QAAQ;AACN,aAAO,IAAI,YAAY,EAAE,QAAQ,OAAO,EAAE;AAAA,IAC5C;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,OAAO,QAAQ
,UAA8C;AAC3D,QAAI,YAAY,CAAC,QAAQ;AAGzB,QAAI,CAAC,SAAS,SAAS,MAAM,KAAK,CAAC,SAAS,SAAS,KAAK,GAAG;AAC3D,YAAM,aAAa,MAAM,KAAK,UAAU,aAAa,QAAQ;AAC7D,UAAI,WAAW,SAAS,GAAG;AACzB,gBAAQ,IAAI,qBAAgB,WAAW,MAAM,gBAAgB,WAAW,KAAK,IAAI,CAAC,EAAE;AACpF,oBAAY;AAAA,MACd,OAAO;AACL,gBAAQ,IAAI,kGAAwF;AAAA,MACtG;AAAA,IACF;AAEA,eAAW,YAAY,WAAW;AAChC,uBAAiB,cAAc,KAAK,UAAU,SAAS,QAAQ,GAAG;AAChE,aAAK,mBAAmB,IAAI,WAAW,GAAG;AAC1C,yBAAiB,UAAU,KAAK,OAAO,MAAM,UAAU,GAAG;AACxD,gBAAM,aAAa,KAAK,aAAa,OAAO,GAAG;AAC/C,cAAI,CAAC,KAAK,SAAS,IAAI,UAAU,GAAG;AAClC,iBAAK,SAAS,IAAI,UAAU;AAC5B,kBAAM;AAAA,UACR;AAAA,QACF;AAAA,MACF;AAAA,IACF;AAAA,EACF;AACF;;;AClEA,OAAO,gBAAgB;AAIhB,IAAM,iBAAN,MAAqB;AAAA,EACT;AAAA,EACA;AAAA,EAEjB,YAAY,QAAgB,SAAkB;AAC5C,SAAK,SAAS;AACd,QAAI,SAAS;AACX,UAAI;AACF,aAAK,aAAa,IAAI,IAAI,OAAO,EAAE,SAAS,QAAQ,UAAU,EAAE;AAAA,MAClE,QAAQ;AAAA,MAER;AAAA,IACF;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,MAAM,QAA4B;AAChC,UAAM,QAAgB,CAAC;AAIvB,QAAI,KAAK,OAAO,4BAA4B,KAAK,YAAY;AAC3D,UAAI;AACD,cAAM,gBAAgB,IAAI,IAAI,OAAO,GAAG,EAAE,SAAS,QAAQ,UAAU,EAAE;AACxE,YAAI,kBAAkB,KAAK,YAAY;AACrC,gBAAM,KAAK;AAAA,YACT,UAAU;AAAA,YACV,SAAS,KAAK;AAAA,YACd,MAAM;AAAA,YACN,QAAQ,iCAAiC,KAAK,UAAU,YAAY,KAAK,UAAU,gBAAgB,aAAa;AAAA,UAClH,CAAC;AAAA,QACH;AAAA,MACF,QAAQ;AAAA,MAER;AAAA,IACF;AAKA,eAAW,WAAW,KAAK,OAAO,qBAAqB;AACrD,UAAI,KAAK,QAAQ,OAAO,KAAK,OAAO,GAAG;AACrC,eAAO,UAAU;AACjB,eAAO,YAAY,QAAQ;AAC3B,eAAO;AAAA,MACT;AAAA,IACF;AAGA,eAAW,UAAU,KAAK,OAAO,UAAU;AACzC,iBAAW,WAAW,OAAO,UAAU;AACrC,YAAI,KAAK,QAAQ,OAAO,KAAK,OAAO,GAAG;AACrC,gBAAM,KAAK;AAAA,YACT,UAAU,OAAO;AAAA,YACjB,SAAS,QAAQ;AAAA,YACjB,MAAM,QAAQ;AAAA,YACd,QAAQ,QAAQ;AAAA,UAClB,CAAC;AAAA,QACH;AAAA,MACF;AAAA,IACF;AAEA,WAAO;AAAA,EACT;AAAA,EAEQ,QAAQ,KAAa,SAA2B;AACtD,YAAQ,QAAQ,MAAM;AAAA,MACpB,KAAK;AACH,eAAO,IAAI,SAAS,QAAQ,KAAK;AAAA,MACnC,KAAK;AACH,eAAO,WAAW,QAAQ,KAAK,QAAQ,OAAO,EAAE,UAAU,KAAK,CAAC;AAAA,MAClE,KAAK;AACH,YAAI;AACF,gBAAM,QAAQ,IAAI,OAAO,QAAQ,OAAO,GAAG;AAC3C,iBAAO,MAAM,KAAK,GAAG;AAAA,QACvB,QAAQ;AACN,iBAAO;AAAA,QACT;AAAA,MACF;AACE,eAAO;AAAA,IACX;AAAA,EACF;AACF;;;ACxFA,OAAOC,YAAW;AAGX,IAAM,k
BAAN,MAA0C;AAAA,EAC/C,MAAM,SAAS,MAAiC;AAC9C,YAAQ,IAAI,OAAOA,OAAM,KAAK,KAAK,qCAAqC,CAAC;AACzE,YAAQ,IAAI,uBAAuB,KAAK,SAAS,EAAE;AACnD,YAAQ,IAAI,uBAAuB,KAAK,aAAa,IAAIA,OAAM,IAAI,KAAK,UAAU,IAAIA,OAAM,MAAM,CAAC,CAAC,EAAE;AACtG,YAAQ,IAAI,uBAAuB,KAAK,cAAc,MAAM,EAAE;AAC9D,YAAQ,IAAI,uBAAuB,KAAK,YAAY,SAAS,IAAIA,OAAM,OAAO,KAAK,YAAY,MAAM,IAAI,CAAC,EAAE;AAC5G,YAAQ,IAAI,yBAAyB,KAAK,QAAQ,QAAQ,IAAI,KAAK,UAAU,QAAQ,KAAK,KAAM,QAAQ,CAAC,CAAC,GAAG;AAE7G,QAAI,KAAK,cAAc,SAAS,GAAG;AACjC,cAAQ,IAAI,OAAOA,OAAM,KAAK,OAAO,eAAe,CAAC;AACrD,WAAK,cAAc,MAAM,GAAG,EAAE,EAAE,QAAQ,CAAC,QAAQ;AAC/C,gBAAQ,IAAI;AAAA,EAAKA,OAAM,KAAK,IAAI,GAAG,CAAC,EAAE;AACtC,YAAI,MAAM,QAAQ,CAAC,SAAS;AAC1B,kBAAQ,IAAI,QAAQA,OAAM,IAAI,KAAK,QAAQ,CAAC,KAAK,KAAK,MAAM,KAAKA,OAAM,KAAK,KAAK,OAAO,CAAC,GAAG;AAAA,QAC9F,CAAC;AAAA,MACH,CAAC;AAED,UAAI,KAAK,cAAc,SAAS,IAAI;AAClC,gBAAQ,IAAI;AAAA,UAAa,KAAK,cAAc,SAAS,EAAE,+CAA+C;AAAA,MACxG;AAAA,IACF;AAEA,YAAQ,IAAI,OAAOA,OAAM,KAAK,KAAK,qCAAqC,CAAC;AAAA,EAC3E;AACF;;;AC5BA,OAAOC,SAAQ;AAGR,IAAM,eAAN,MAAuC;AAAA,EAC3B;AAAA,EAEjB,YAAY,aAAqB,0BAA0B;AACzD,SAAK,aAAa;AAAA,EACpB;AAAA,EAEA,MAAM,SAAS,MAAiC;AAC9C,UAAM,SAAS;AAAA,MACb,UAAU;AAAA,QACR,cAAa,oBAAI,KAAK,GAAE,YAAY;AAAA,QACpC,YAAY,KAAK,QAAQ,QAAQ,IAAI,KAAK,UAAU,QAAQ;AAAA,MAC9D;AAAA,MACA,SAAS;AAAA,QACP,WAAW,KAAK;AAAA,QAChB,YAAY,KAAK;AAAA,QACjB,oBAAoB,KAAK,cAAc;AAAA,QACvC,kBAAkB,KAAK,YAAY;AAAA,MACrC;AAAA,MACA,UAAU,KAAK;AAAA,MACf,SAAS,KAAK;AAAA,IAChB;AAEA,UAAMA,IAAG,UAAU,KAAK,YAAY,KAAK,UAAU,QAAQ,MAAM,CAAC,GAAG,MAAM;AAC3E,YAAQ,IAAI,4BAA4B,KAAK,UAAU,EAAE;AAAA,EAC3D;AACF;;;AC7BA,OAAOC,SAAQ;AAGR,IAAM,eAAN,MAAuC;AAAA,EAC3B;AAAA,EAEjB,YAAY,aAAqB,0BAA0B;AACzD,SAAK,aAAa;AAAA,EACpB;AAAA,EAEA,MAAM,SAAS,MAAiC;AAC9C,UAAM,aAAa,KAAK,WAAW,IAAI;AACvC,UAAM,OAAO,KAAK,aAAa,MAAM,UAAU;AAE/C,UAAMA,IAAG,UAAU,KAAK,YAAY,MAAM,MAAM;AAChD,YAAQ,IAAI,4BAA4B,KAAK,UAAU,EAAE;AAAA,EAC3D;AAAA,EAEQ,WAAW,MAAkB;AACnC,UAAM,aAAiF,CAAC;AAExF,eAAW,UAAU,KAAK,eAAe;AACvC,iBAAW,QAAQ,OAAO,OAAO;AAC/B,YAAI,CAAC,WAAW,KAAK,QAAQ,GAAG;AAC9B,qBAAW,KAAK,QAAQ,IAAI,CAAC;AAAA,QAC/B;AACA,YAAI,CAAC,WAAW,KAAK,QAAQ,E
AAE,KAAK,OAAO,GAAG;AAC5C,qBAAW,KAAK,QAAQ,EAAE,KAAK,OAAO,IAAI;AAAA,YACxC,QAAQ,KAAK;AAAA,YACb,MAAM,CAAC;AAAA,UACT;AAAA,QACF;AACA,mBAAW,KAAK,QAAQ,EAAE,KAAK,OAAO,EAAE,KAAK,KAAK,OAAO,GAAG;AAAA,MAC9D;AAAA,IACF;AAEA,WAAO;AAAA,EACT;AAAA,EAEQ,aAAa,MAAkB,YAAyB;AAC9D,UAAM,aAAa,KAAK,QAAQ,QAAQ,IAAI,KAAK,UAAU,QAAQ,KAAK,KAAM,QAAQ,CAAC;AACvF,UAAM,YAAY,KAAK,QAAQ,eAAe;AAC9C,UAAM,MAAM,KAAK,WAAW,KAAK,IAAI;AAErC,WAAO;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,gCAMquBA4L1B,IAAI,KAAK,OAAO,CAAC;AAAA,uBACjB,IAAI,SAAS,CAAC;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,iBAQpB,KAAK,mBAAmB,MAAM;AAAA;AAAA;AAAA;AAAA,iBAI9B,KAAK,UAAU,eAAe,CAAC;AAAA;AAAA;AAAA;AAAA,iBAI/B,KAAK,UAAU;AAAA;AAAA;AAAA;AAAA,iBAIf,KAAK,YAAY,MAAM;AAAA;AAAA;AAAA;AAAA,iBAIvB,QAAQ;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,4CAMmB,KAAK,mBAAmB,MAAM;AAAA;AAAA,kBAExD,KAAK,mBAAmB,IAAI,OAAK,yBAAyB,IAAI,CAAC,CAAC,QAAQ,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA;AAAA;AAAA,UAI1F,KAAK,YAAY,SAAS,IAAI;AAAA;AAAA,qCAEH,KAAK,YAAY,MAAM;AAAA;AAAA,kBAE1C,KAAK,YAAY,IAAI,OAAK;AACxB,YAAM,kBAAkB,EAAE,MAAM,SAAS,IACnC,+FAA+F,CAAC,GAAG,IAAI,IAAI,EAAE,MAAM,IAAI,OAAK,EAAE,QAAQ,CAAC,CAAC,EAAE,IAAI,GAAG,EAAE,KAAK,IAAI,CAAC,aAC7J;AAEN,YAAM,YAAY,EAAE,aAAa;AACjC,aAAO,4CAA4C,IAAI,SAAS,CAAC,KAAK,IAAI,EAAE,GAAG,CAAC,iEAAiE,IAAI,SAAS,CAAC,WAAW,eAAe;AAAA,IAC7L,CAAC,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA;AAAA,YAGf,EAAE;AAAA;AAAA,UAEJ,OAAO,QAAQ,UAAU,EAAE,IAAI,CAAC,CAAC,UAAU,QAAQ,MAAqB;AACtE,YAAM,oBAAoB,OAAO,OAAO,QAAQ,EAAE,OAAO,CAAC,KAAa,MAAW,MAAM,EAAE,KAAK,QAAQ,CAAC;AACxG,aAAO;AAAA;AAAA;AAAA,4BAGS,IAAI,QAAQ,CAAC,KAAK,iBAAiB;AAAA;AAAA;AAAA;AAAA,sBAIzC,OAAO,QAAQ,QAAQ,EAAE,IAAI,CAAC,CAAC,SAAS,OAAO,MAAqB;AAAA;AAAA;AAAA,sCAGpD,IAAI,OAAO,CAAC;AAAA,sDACI,QAAQ,KAAK,MAAM;AAAA;AAAA;AAAA,kCAGvC,IAAI,QAAQ,MAAM,CAAC;AAAA;AAAA;AAAA,kCAGnB,QAAQ,KAAK,MAAM,GAAG,CAAC,EAAE,IAAI,CAAC,QAAgB;AAAA,4DACpB,IAAI,GAAG,CAAC;AAAA,iCACnC,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA,8BAEb,QAAQ,KAAK,SAAS,IAAI;AAAA,kEACU,QAAQ,KAAK,SAAS,CAAC;AAAA,gCACzD,EAAE;AAAA,4EAC0C,KAAK,UAAU,OAAO,EAAE,QAAQ,MAAM,QAAQ,CAAC,KAAK,KAAK,UAAU,QAAQ,IAAI,EAAE,QAAQ,MAAM,QAAQ,CAAC;AAAA,yDA
ClI,QAAQ,KAAK,MAAM;AAAA;AAAA;AAAA,qBAGhD,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA;AAAA;AAAA,IAIvB,CAAC,EAAE,KAAK,EAAE,CAAC;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EAuBjB;AAAA,EAEQ,WAAW,KAAqB;AACtC,WAAO,IACJ,QAAQ,MAAM,OAAO,EACrB,QAAQ,MAAM,MAAM,EACpB,QAAQ,MAAM,MAAM,EACpB,QAAQ,MAAM,QAAQ,EACtB,QAAQ,MAAM,QAAQ;AAAA,EAC3B;AACF;;;AVzVO,IAAM,iBAAiB,IAAI,QAAQ,SAAS,EAChD,YAAY,uCAAuC,EACnD,SAAS,SAAS,kBAAkB,EACpC,OAAO,uBAAuB,yBAAyB,EACvD,OAAO,yBAAyB,iCAAiC,EACjE,OAAO,wBAAwB,2BAA2B,EAC1D,OAAO,OAAO,KAAa,YAAmE;AAC7F,QAAM,YAAY,oBAAI,KAAK;AAG3B,QAAM,SAAS,aAAa,KAAK,QAAQ,MAAM;AAC/C,QAAM,SAAS,QAAQ,UAAU,OAAO,UAAU;AAClD,QAAM,eAAe,QAAQ,UAAU,OAAO,gBAAgB;AAG9D,QAAM,YAAY,IAAI,iBAAiB;AACvC,QAAM,UAAU,IAAI,eAAe,QAAQ,GAAG;AAE9C,QAAM,gBAA8B,CAAC;AACrC,QAAM,cAA4B,CAAC;AACnC,MAAI,YAAY;AAChB,MAAI,aAAa;AAEjB,UAAQ,IAAIC,OAAM,KAAK;AAAA,iCAA6B,GAAG,KAAK,CAAC;AAE7D,MAAI;AAEF,qBAAiB,UAAU,UAAU,QAAQ,GAAG,GAAG;AACjD;AACA,YAAM,QAAQ,QAAQ,MAAM,MAAM;AAElC,UAAI,MAAM,SAAS,GAAG;AACpB,eAAO,QAAQ;AACf,sBAAc,KAAK,MAAM;AACzB,sBAAc,MAAM;AAAA,MACtB,WAAW,OAAO,SAAS;AACzB,oBAAY,KAAK,MAAM;AAAA,MACzB;AAEA,UAAI,YAAY,QAAQ,GAAG;AACzB,gBAAQ,OAAO,MAAMA,OAAM,KAAK,eAAe,SAAS,UAAU,CAAC;AAAA,MACrE;AAAA,IACF;AACA,YAAQ,OAAO,MAAM,IAAI;AAEzB,UAAM,UAAU,oBAAI,KAAK;AACzB,UAAM,aAAyB;AAAA,MAC7B,SAAS;AAAA,MACT,oBAAoB,UAAU,sBAAsB;AAAA,MACpD;AAAA,MACA;AAAA,MACA;AAAA,MACA;AAAA,MACA;AAAA,MACA;AAAA,IACF;AAGA,UAAM,YAAwB,CAAC,IAAI,gBAAgB,CAAC;AAEpD,UAAMC,IAAG,MAAM,QAAQ,EAAE,WAAW,KAAK,CAAC;AAE1C,QAAI,iBAAiB,UAAU,iBAAiB,OAAO;AACrD,YAAM,WAAWC,MAAK,KAAK,QAAQ,wBAAwB;AAC3D,gBAAU,KAAK,IAAI,aAAa,QAAQ,CAAC;AAAA,IAC3C;AACA,QAAI,iBAAiB,UAAU,iBAAiB,OAAO;AACrD,YAAM,WAAWA,MAAK,KAAK,QAAQ,wBAAwB;AAC3D,gBAAU,KAAK,IAAI,aAAa,QAAQ,CAAC;AAAA,IAC3C;AAEA,eAAW,YAAY,WAAW;AAChC,YAAM,SAAS,SAAS,UAAU;AAAA,IACpC;AAGA,QAAI,aAAa,GAAG;AAClB,cAAQ,KAAK,CAAC;AAAA,IAChB,OAAO;AACL,cAAQ,KAAK,CAAC;AAAA,IAChB;AAAA,EAEF,SAAS,OAAO;AACd,YAAQ,MAAMF,OAAM,IAAI,oBAAoB,GAAG,KAAK;AACpD,YAAQ,KAAK,CAAC;AAAA,EAChB;AACF,CAAC;;;AWnGH,SAAS,WAAAG,
gBAAe;AACxB,OAAOC,SAAQ;AACf,OAAOC,WAAU;AACjB,OAAOC,YAAW;AAElB,IAAM,iBAAiB;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAsDhB,IAAM,cAAc,IAAIH,SAAQ,MAAM,EAC1C,YAAY,yDAAyD,EACrE,OAAO,MAAM;AACZ,QAAM,aAAaE,MAAK,KAAK,QAAQ,IAAI,GAAG,iBAAiB;AAE7D,MAAID,IAAG,WAAW,UAAU,GAAG;AAC7B,YAAQ,MAAME,OAAM,IAAI,UAAU,UAAU,kBAAkB,CAAC;AAC/D,YAAQ,KAAK,CAAC;AAAA,EAChB;AAEA,MAAI;AACF,IAAAF,IAAG,cAAc,YAAY,gBAAgB,MAAM;AACnD,YAAQ,IAAIE,OAAM,MAAM,wBAAwB,UAAU,EAAE,CAAC;AAAA,EAC/D,SAAS,OAAO;AACd,YAAQ,MAAMA,OAAM,IAAI,sCAAsC,GAAG,KAAK;AACtE,YAAQ,KAAK,CAAC;AAAA,EAChB;AACF,CAAC;;;AZvEH,IAAM,UAAU,IAAIC,SAAQ;AAE5B,QACG,KAAK,YAAY,EACjB,QAAQ,OAAO,EACf,YAAY,+BAA+B;AAE9C,QAAQ,WAAW,cAAc;AACjC,QAAQ,WAAW,WAAW;AAG9B,QAAQ,GAAG,sBAAsB,CAAC,QAAQ,YAAY;AACpD,UAAQ,MAAM,2BAA2B,SAAS,WAAW,MAAM;AACnE,UAAQ,KAAK,CAAC;AAChB,CAAC;AAGD,QAAQ,GAAG,UAAU,MAAM;AACzB,UAAQ,IAAI,+BAA+B;AAC3C,UAAQ,KAAK,CAAC;AAChB,CAAC;AAED,QAAQ,GAAG,WAAW,MAAM;AAC1B,UAAQ,IAAI,+BAA+B;AAC3C,UAAQ,KAAK,CAAC;AAChB,CAAC;AAED,QAAQ,MAAM;","names":["Command","chalk","path","fs","path","XMLParser","fetch","chalk","fs","fs","chalk","fs","path","Command","fs","path","chalk","Command"]}
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@akotliar/sitemap-qa",
-  "version": "1.0.0-alpha.4",
+  "version": "1.0.0-alpha.5",
   "description": "Detect test/qa/dev/staging URLs, admin paths, sensitive parameters, and URLs that shouldn't be publicly indexed.",
   "type": "module",
   "main": "./dist/index.js",