@govtechsg/oobee 0.10.93 → 0.10.94
- package/AGENTS.md +20 -0
- package/dist/cli.js +3 -2
- package/dist/combine.js +3 -3
- package/dist/constants/common.js +119 -52
- package/dist/crawlers/commonCrawlerFunc.js +11 -2
- package/dist/crawlers/crawlDomain.js +4 -6
- package/dist/crawlers/crawlSitemap.js +14 -2
- package/dist/crawlers/custom/utils.js +22 -9
- package/dist/crawlers/guards/urlGuard.js +19 -1
- package/dist/static/ejs/partials/components/allIssues/CategoryBadges.ejs +3 -0
- package/dist/static/ejs/partials/components/allIssues/IssuesTable.ejs +3 -3
- package/dist/static/ejs/partials/components/header/aboutScanModal/AboutScanModal.ejs +1 -1
- package/dist/static/ejs/partials/components/header/aboutScanModal/ScanConfiguration.ejs +3 -3
- package/dist/static/ejs/partials/components/header/aboutScanModal/ScanDetails.ejs +34 -27
- package/dist/static/ejs/partials/components/ruleModal/ruleOffcanvas.ejs +1 -0
- package/dist/static/ejs/partials/components/scannedPagesSegmentedTabs.ejs +7 -0
- package/dist/static/ejs/partials/components/wcagCoverageDetails.ejs +5 -5
- package/dist/static/ejs/partials/scripts/header/aboutScanModal/AboutScanModal.ejs +3 -3
- package/dist/static/ejs/partials/scripts/prioritiseIssues/PrioritiseIssues.ejs +21 -19
- package/dist/static/ejs/partials/scripts/ruleModal/pageAccordionBuilder.ejs +39 -8
- package/dist/static/ejs/partials/scripts/scannedPagesSegmentedTabs.ejs +11 -5
- package/dist/static/ejs/partials/scripts/screenshotLightbox.ejs +49 -31
- package/dist/static/ejs/partials/styles/header/SiteInfo.ejs +1 -1
- package/dist/static/ejs/partials/styles/header/aboutScanModal/ScanDetails.ejs +36 -16
- package/dist/static/ejs/partials/styles/prioritiseIssues/PrioritiseIssues.ejs +22 -1
- package/dist/static/ejs/partials/styles/styles.ejs +1 -1
- package/dist/static/ejs/partials/styles/wcagCompliance/WcagGaugeBar.ejs +6 -0
- package/dist/static/ejs/partials/styles/wcagCompliance.ejs +5 -4
- package/dist/static/ejs/partials/styles/wcagCoverageDetails.ejs +6 -1
- package/oobee-client-scanner.js +2 -2
- package/package.json +1 -1
- package/src/cli.ts +3 -2
- package/src/combine.ts +3 -2
- package/src/constants/common.ts +112 -36
- package/src/crawlers/commonCrawlerFunc.ts +11 -2
- package/src/crawlers/crawlDomain.ts +4 -5
- package/src/crawlers/crawlSitemap.ts +19 -2
- package/src/crawlers/custom/utils.ts +26 -13
- package/src/crawlers/guards/urlGuard.ts +18 -1
- package/src/static/ejs/partials/components/allIssues/CategoryBadges.ejs +3 -0
- package/src/static/ejs/partials/components/allIssues/IssuesTable.ejs +3 -3
- package/src/static/ejs/partials/components/header/aboutScanModal/AboutScanModal.ejs +1 -1
- package/src/static/ejs/partials/components/header/aboutScanModal/ScanConfiguration.ejs +3 -3
- package/src/static/ejs/partials/components/header/aboutScanModal/ScanDetails.ejs +34 -27
- package/src/static/ejs/partials/components/ruleModal/ruleOffcanvas.ejs +1 -0
- package/src/static/ejs/partials/components/scannedPagesSegmentedTabs.ejs +7 -0
- package/src/static/ejs/partials/components/wcagCoverageDetails.ejs +5 -5
- package/src/static/ejs/partials/scripts/header/aboutScanModal/AboutScanModal.ejs +3 -3
- package/src/static/ejs/partials/scripts/prioritiseIssues/PrioritiseIssues.ejs +21 -19
- package/src/static/ejs/partials/scripts/ruleModal/pageAccordionBuilder.ejs +39 -8
- package/src/static/ejs/partials/scripts/scannedPagesSegmentedTabs.ejs +11 -5
- package/src/static/ejs/partials/scripts/screenshotLightbox.ejs +49 -31
- package/src/static/ejs/partials/styles/header/SiteInfo.ejs +1 -1
- package/src/static/ejs/partials/styles/header/aboutScanModal/ScanDetails.ejs +36 -16
- package/src/static/ejs/partials/styles/prioritiseIssues/PrioritiseIssues.ejs +22 -1
- package/src/static/ejs/partials/styles/styles.ejs +1 -1
- package/src/static/ejs/partials/styles/wcagCompliance/WcagGaugeBar.ejs +6 -0
- package/src/static/ejs/partials/styles/wcagCompliance.ejs +5 -4
- package/src/static/ejs/partials/styles/wcagCoverageDetails.ejs +6 -1
- package/testStaticJSScanner.html +1 -1
- /package/{7339fae5-e8ed-4b50-af13-317847620dbf.txt → 67e8137b-1939-4253-8f11-a82bc833cfcb.txt} +0 -0
package/AGENTS.md
CHANGED
|
@@ -112,6 +112,10 @@ Important behaviors:
|
|
|
112
112
|
- The crawler itself enforces `maxRequestsPerCrawl` by counting only successfully scanned pages
|
|
113
113
|
- `constants.sitemapFetchedLinks` stores the total discovered count for `scanData.json` reporting
|
|
114
114
|
- For sitemap indexes, child sitemaps are processed recursively
|
|
115
|
+
- Some sitemap XMLs include `<?xml-stylesheet ...?>` (XSL). In `getDataUsingPlaywright()`:
|
|
116
|
+
- Use `waitUntil: 'domcontentloaded'` (not `networkidle`) to avoid 60s timeouts caused by stylesheet/resource loading
|
|
117
|
+
- Prefer `response.text()` to capture raw XML before browser XSL transformation (preserves `<sitemapindex>` / `<urlset>` structure)
|
|
118
|
+
- Only fall back to DOM extraction when raw response text is unavailable
|
|
115
119
|
|
|
116
120
|
## Shared Mutable State
|
|
117
121
|
|
|
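The sitemap note in this hunk prefers raw response text because the root tag identifies the sitemap type. A minimal sketch of that classification follows. The helper name `classifySitemapXml` and the sample XML are hypothetical and not part of the package; the two tag patterns are the ones added to `common.js` in this release.

```javascript
// Hypothetical helper for illustration; not part of @govtechsg/oobee.
// Classifies raw sitemap XML by looking for a (possibly namespaced) root tag.
const SITEMAP_INDEX_TAG = /<\s*(?:[a-z0-9_-]+:)?sitemapindex\b/i;
const URLSET_TAG = /<\s*(?:[a-z0-9_-]+:)?urlset\b/i;

function classifySitemapXml(rawXml) {
  if (SITEMAP_INDEX_TAG.test(rawXml)) return 'xmlIndex';
  if (URLSET_TAG.test(rawXml)) return 'xml';
  return 'unknown';
}

// Raw body of a sitemap index that references an XSL stylesheet (made-up sample).
const styled = `<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="/sitemap.xsl"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/pages.xml</loc></sitemap>
</sitemapindex>`;

console.log(classifySitemapXml(styled)); // xmlIndex
// After an XSL transformation the DOM may contain only rendered HTML:
console.log(classifySitemapXml('<html><body>Rendered table</body></html>')); // unknown
```

The second call shows why DOM extraction is only a fallback: once the browser has rendered the stylesheet output, the root tag may no longer be present.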
@@ -229,6 +233,12 @@ docker run oobee node dist/cli.js ...
|
|
|
229
233
|
|
|
230
234
|
10. **Intermediate JSONL write safety + corruption tolerance** — `ItemsStore.appendPageItems()` requires strict serialization of writes per rule file to prevent interleaved corruption. It also enforces a strict text sanitization regex to filter out literal `\n` and `\r` control characters from website HTML inputs immediately after `JSON.stringify()`. This ensures no single JSON issue accidentally injects illegal implicit newline boundaries when writing to JSONL format. Maintain backward-compatible `fs.appendFile` queues over buffered WriteStreams to guarantee pipeline sync visibility. `ItemsStore.readRuleItems()` tolerates historical malformed lines via fallback skip logic.
|
|
231
235
|
|
|
236
|
+
11. **`preNavigationHooks` and the Playwright header-rewrite warning** — `preNavigationHooks()` in `commonCrawlerFunc.ts` is always included in the crawler `preNavigationHooks` array (for both `crawlDomain` and `crawlSitemap`). The hook does two things:
|
|
237
|
+
- **Header rewriting**: only sets `crawlingContext.request.headers = extraHTTPHeaders` when `extraHTTPHeaders` is non-empty. Setting request headers causes Crawlee/Playwright to intercept every network request to rewrite them, which triggers `WARN Playwright Utils: Using other request methods than GET, rewriting headers and adding payloads has a high impact on performance`. This warning is expected for authenticated scans; it is suppressed for unauthenticated scans because `extraHTTPHeaders` stays empty (see pitfall 12 below).
|
|
238
|
+
- **Navigation wait**: always sets `gotoOptions.waitUntil = 'domcontentloaded'` and `gotoOptions.timeout = 30000` via **in-place object mutation**. Do NOT reassign the `gotoOptions` parameter (`gotoOptions = {...}`) — that only rebinds the local variable and does not propagate to Crawlee. `domcontentloaded` is used (not `networkidle`) to avoid indefinite hangs on sites with WebSockets, analytics polling, lazy-load beacons, or health-check pings that never quiet their network activity. Further page stability is handled by `waitForPageLoaded()` in each requestHandler and the DOM mutation observer in `postNavigationHooks`.
|
|
239
|
+
|
|
240
|
+
12. **`extraHTTPHeaders` must not be mutated before being passed to crawlers** — `checkUrlConnectivityWithBrowser()` in `common.ts` needs an `Accept` header for its own connectivity check but must NOT add it to the shared `extraHTTPHeaders` object. Mutating the shared object causes crawlers to see a non-empty `extraHTTPHeaders` (at minimum `{ Accept: '...' }`), which silently triggers header rewriting and the Playwright performance warning for every unauthenticated scan. Always use a local copy: `const localHeaders = { ...extraHTTPHeaders }; localHeaders.Accept ||= '...';`.
|
|
241
|
+
|
|
232
242
|
## Testing Considerations
|
|
233
243
|
|
|
234
244
|
When making changes, validate these areas which have well-established edge cases:
|
|
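Pitfall 11 in this hunk depends on the difference between mutating an object and rebinding a parameter. The sketch below simulates it in plain JavaScript. `runHook` is a hypothetical stand-in for the crawler invoking a pre-navigation hook, and the starting option values are made up.

```javascript
// Illustrative only: simulates how a caller observes a hook's changes.
// `runHook` is a hypothetical stand-in for the crawler calling a hook.
function runHook(hook) {
  const gotoOptions = { waitUntil: 'load', timeout: 60000 };
  hook({}, gotoOptions);
  return gotoOptions; // the caller keeps its own reference to this object
}

// Mutates the object the caller passed in, so the caller sees the change.
const mutatingHook = (_crawlingContext, gotoOptions) => {
  gotoOptions.waitUntil = 'domcontentloaded';
  gotoOptions.timeout = 30000;
};

// Rebinds the local parameter only; the caller's object is untouched.
const reassigningHook = (_crawlingContext, gotoOptions) => {
  gotoOptions = { waitUntil: 'domcontentloaded', timeout: 30000 };
  return gotoOptions;
};

const afterMutation = runHook(mutatingHook);
const afterReassignment = runHook(reassigningHook);
console.log(afterMutation.waitUntil);     // domcontentloaded
console.log(afterReassignment.waitUntil); // load
```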
@@ -260,6 +270,16 @@ When making changes, validate these areas which have well-established edge cases
|
|
|
260
270
|
- `document.title` must be captured at the START of `runAxeScript()`, before axe scanning or screenshot capture. Pages can close during these operations (timeout, navigation, crash). Never create a new page just to re-navigate for the title — this leaks pages.
|
|
261
271
|
- The URL guard script in custom flow must be defensive against pages that close unexpectedly. All page event handlers should handle closed contexts gracefully.
|
|
262
272
|
|
|
273
|
+
### URL Guard & Overlay Management in Custom Flow
|
|
274
|
+
|
|
275
|
+
`src/crawlers/guards/urlGuard.ts` — attached via `addUrlGuardScript()` in `runCustom.ts`:
|
|
276
|
+
|
|
277
|
+
- **`restoreToSafeUrl` must validate the safe URL before calling `page.goto()`**. If the entry URL is `file://` (e.g. `-u '/path/to/report.html'`), `fallbackUrl` is also `file://`. Redirecting to it fires another `framenavigated` for `file://`, which re-triggers `restoreToSafeUrl` → infinite reload loop. Always check `ALLOWED_PROTOCOLS.has(safeObj.protocol)` before navigating; if the fallback is not http/https, return without redirecting.
|
|
278
|
+
|
|
279
|
+
- **`about:` protocol must be skipped in `framenavigated`**. Chromium fires `framenavigated` for `about:blank` as a transient intermediate state during every `page.goto()` call. Intercepting it and calling `restoreToSafeUrl` → `page.goto(safeUrl)` → `about:blank` → `restoreToSafeUrl` → … creates a second infinite loop. Always `return` early when `urlObj.protocol === 'about:'`.
|
|
280
|
+
|
|
281
|
+
- **`reconcileOverlayMenu` must not remove the overlay on macOS/Windows**. On `darwin`/`win32` the custom flow runs headful. When `isOverlayAllowed` returns `false` (e.g. transient `file://` or `about:blank` URL), do **not** call `removeOverlayMenu` — the URL guard will redirect back to the safe URL momentarily. Instead, fall through to the `hasOverlay` / `addOverlayMenu` block so the overlay is (re-)injected regardless of the current URL protocol. On Linux/Docker (headless) the removal behaviour is unchanged.
|
|
282
|
+
|
|
263
283
|
### Proxy & Network
|
|
264
284
|
- Proxy detection must handle `ALL_PROXY` on Windows. The proxy resolution logic should be tested on all platforms.
|
|
265
285
|
|
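The two URL guard rules in the AGENTS.md hunk above reduce to protocol checks. The sketch below is hypothetical: the function names are invented, the contents of `ALLOWED_PROTOCOLS` are assumed from the "not http/https" wording, and the handling of unparseable URLs is a design choice made here, not something taken from `urlGuard.ts`.

```javascript
// Hypothetical sketch; names echo AGENTS.md but the bodies are assumptions.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']); // assumed contents

// True when a framenavigated event should be ignored outright.
function isTransientNavigation(url) {
  try {
    return new URL(url).protocol === 'about:';
  } catch {
    return true; // assumption: an unparseable URL gives nothing safe to act on
  }
}

// True only when redirecting to safeUrl cannot re-trigger the guard.
function canRestoreTo(safeUrl) {
  try {
    return ALLOWED_PROTOCOLS.has(new URL(safeUrl).protocol);
  } catch {
    return false;
  }
}

console.log(isTransientNavigation('about:blank'));         // true
console.log(canRestoreTo('file:///tmp/report.html'));      // false
console.log(canRestoreTo('https://example.com/start'));    // true
```

Returning early in both cases is what breaks the `framenavigated` loops described in the notes.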
package/dist/cli.js
CHANGED
|
@@ -199,9 +199,10 @@ const scanInit = async (argvs) => {
|
|
|
199
199
|
if (res.httpStatus)
|
|
200
200
|
consoleLogger.info(`Connectivity Check HTTP Response Code: ${res.httpStatus}`);
|
|
201
201
|
if (res.status === statuses.success.code) {
|
|
202
|
-
//
|
|
203
|
-
//
|
|
202
|
+
// Keep browser-resolved URL as entryUrl for downstream scan metadata/events
|
|
203
|
+
// on non-custom scans.
|
|
204
204
|
if (data.type !== ScannerTypes.CUSTOM) {
|
|
205
|
+
data.entryUrl = res.url;
|
|
205
206
|
data.url = res.url;
|
|
206
207
|
}
|
|
207
208
|
if (process.env.OOBEE_VALIDATE_URL) {
|
package/dist/combine.js
CHANGED
|
@@ -23,7 +23,7 @@ export class ViewportSettingsClass {
|
|
|
23
23
|
}
|
|
24
24
|
const combineRun = async (details, deviceToScan) => {
|
|
25
25
|
const envDetails = { ...details };
|
|
26
|
-
const { type, url, nameEmail, randomToken, deviceChosen, customDevice, viewportWidth, playwrightDeviceDetailsObject, maxRequestsPerCrawl, browser, userDataDirectory, strategy, // Allow subdomains: if checked, = 'same-domain'
|
|
26
|
+
const { type, url, entryUrl, nameEmail, randomToken, deviceChosen, customDevice, viewportWidth, playwrightDeviceDetailsObject, maxRequestsPerCrawl, browser, userDataDirectory, strategy, // Allow subdomains: if checked, = 'same-domain'
|
|
27
27
|
specifiedMaxConcurrency, // Slow scan mode: if checked, = '1'
|
|
28
28
|
fileTypes, blacklistedPatternsFilename, includeScreenshots, // Include screenshots: if checked, = 'true'
|
|
29
29
|
followRobots, // Adhere to robots.txt: if checked, = 'true'
|
|
@@ -59,8 +59,8 @@ const combineRun = async (details, deviceToScan) => {
|
|
|
59
59
|
}
|
|
60
60
|
// remove basic-auth credentials from URL
|
|
61
61
|
const finalUrl = !(type === ScannerTypes.SITEMAP || type === ScannerTypes.LOCALFILE)
|
|
62
|
-
? new URL(
|
|
63
|
-
: new URL(pathToFileURL(
|
|
62
|
+
? new URL(entryUrl)
|
|
63
|
+
: new URL(pathToFileURL(entryUrl));
|
|
64
64
|
// Use the string version of finalUrl to reduce logic at submitForm
|
|
65
65
|
const finalUrlString = finalUrl.toString();
|
|
66
66
|
const scanDetails = {
|
package/dist/constants/common.js
CHANGED
|
@@ -292,15 +292,18 @@ const checkUrlConnectivityWithBrowser = async (url, browserToRun, clonedDataDir,
|
|
|
292
292
|
return res;
|
|
293
293
|
}
|
|
294
294
|
}
|
|
295
|
-
// Ensure Accept header for non-html content fallback
|
|
296
|
-
extraHTTPHeaders
|
|
295
|
+
// Ensure Accept header for non-html content fallback — use a local copy to avoid
|
|
296
|
+
// mutating the caller's extraHTTPHeaders object (which is later checked by crawlers
|
|
297
|
+
// to decide whether to enable preNavigationHooks header rewriting).
|
|
298
|
+
const localHeaders = { ...extraHTTPHeaders };
|
|
299
|
+
localHeaders.Accept ||= 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';
|
|
297
300
|
await initModifiedUserAgent(browserToRun, playwrightDeviceDetailsObject, clonedDataDir);
|
|
298
301
|
let browserContext;
|
|
299
302
|
let browserInstance;
|
|
300
303
|
const rawDevice = (playwrightDeviceDetailsObject || {});
|
|
301
304
|
const { viewport, isMobile, hasTouch, userAgent: deviceUserAgent, ...restDevice } = rawDevice;
|
|
302
305
|
const launchOptions = getPlaywrightLaunchOptions(browserToRun);
|
|
303
|
-
const { Authorization, ...nonAuthHeaders } =
|
|
306
|
+
const { Authorization, ...nonAuthHeaders } = localHeaders || {};
|
|
304
307
|
let httpCredentials = undefined;
|
|
305
308
|
if (Authorization?.startsWith('Basic ')) {
|
|
306
309
|
const decoded = Buffer.from(Authorization.slice(6), 'base64').toString();
|
|
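The local-copy pattern introduced in this hunk can be checked in isolation. This is a small sketch, assuming a hypothetical wrapper named `withDefaultAccept`; the `Accept` value is the one shown in the diff.

```javascript
// Illustrative sketch; `withDefaultAccept` is a hypothetical helper name.
const DEFAULT_ACCEPT =
  'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8';

function withDefaultAccept(extraHTTPHeaders) {
  // Shallow copy; spreading undefined or null yields an empty object.
  const localHeaders = { ...extraHTTPHeaders };
  localHeaders.Accept ||= DEFAULT_ACCEPT;
  return localHeaders;
}

const shared = {}; // what an unauthenticated scan would pass around
const local = withDefaultAccept(shared);
console.log(Object.keys(shared).length); // 0
console.log(Object.keys(local).length);  // 1
```

Because `shared` stays empty, a later check such as `Object.keys(extraHTTPHeaders).length > 0` still evaluates to false for unauthenticated scans.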
@@ -355,21 +358,23 @@ const checkUrlConnectivityWithBrowser = async (url, browserToRun, clonedDataDir,
|
|
|
355
358
|
// Only enable generic Authorization header routing interception broadly if
|
|
356
359
|
// a non-Basic Bearer auth string is heavily relied upon, thereby bypassing
|
|
357
360
|
// performance warnings inside the check checkUrl phase for typical public scans
|
|
358
|
-
if (
|
|
359
|
-
|
|
360
|
-
|
|
361
|
-
|
|
362
|
-
|
|
363
|
-
|
|
361
|
+
if (Object.keys(localHeaders).length > 0) {
|
|
362
|
+
if (Authorization && !httpCredentials) {
|
|
363
|
+
const entryOrigin = new URL(url).origin;
|
|
364
|
+
await browserContext.route('**/*', async (route, request) => {
|
|
365
|
+
try {
|
|
366
|
+
if (new URL(request.url()).origin === entryOrigin) {
|
|
367
|
+
await route.continue({ headers: { ...request.headers(), Authorization } });
|
|
368
|
+
}
|
|
369
|
+
else {
|
|
370
|
+
await route.continue();
|
|
371
|
+
}
|
|
364
372
|
}
|
|
365
|
-
|
|
373
|
+
catch {
|
|
366
374
|
await route.continue();
|
|
367
375
|
}
|
|
368
|
-
}
|
|
369
|
-
|
|
370
|
-
await route.continue();
|
|
371
|
-
}
|
|
372
|
-
});
|
|
376
|
+
});
|
|
377
|
+
}
|
|
373
378
|
}
|
|
374
379
|
const page = await browserContext.newPage();
|
|
375
380
|
// Block native Chrome download UI
|
|
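The route handler added in this hunk attaches `Authorization` only to requests that share the entry URL's origin. The decision can be sketched as a pure function without Playwright. `headersFor` is a hypothetical name and the URLs and token are placeholders.

```javascript
// Hypothetical pure helper that mirrors the routing decision in this hunk.
function headersFor(requestUrl, entryOrigin, requestHeaders, authorization) {
  try {
    if (new URL(requestUrl).origin === entryOrigin) {
      return { ...requestHeaders, Authorization: authorization };
    }
  } catch {
    // Unparseable URL: fall through and leave the headers unchanged.
  }
  return requestHeaders;
}

const entryOrigin = new URL('https://example.com/app').origin;
const base = { accept: '*/*' };

const sameOrigin = headersFor('https://example.com/api/me', entryOrigin, base, 'Bearer placeholder');
const thirdParty = headersFor('https://cdn.example.net/lib.js', entryOrigin, base, 'Bearer placeholder');

console.log('Authorization' in sameOrigin); // true
console.log('Authorization' in thirdParty); // false
```

Scoping by origin keeps the credential from being sent to third-party hosts that the page happens to load.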
@@ -491,7 +496,7 @@ export const isSitemapContent = (content) => {
|
|
|
491
496
|
return true;
|
|
492
497
|
}
|
|
493
498
|
const regexForHtml = new RegExp('<(?:!doctype html|html|head|body)+?>', 'gmi');
|
|
494
|
-
const regexForXmlSitemap = new RegExp('<(?:urlset|feed|rss)+?.*>', 'gmi');
|
|
499
|
+
const regexForXmlSitemap = new RegExp('<(?:urlset|sitemapindex|feed|rss)+?.*>', 'gmi');
|
|
495
500
|
if (content.match(regexForHtml) && content.match(regexForXmlSitemap)) {
|
|
496
501
|
// is an XML sitemap wrapped in a HTML document
|
|
497
502
|
return true;
|
|
@@ -505,7 +510,18 @@ export const isSitemapContent = (content) => {
|
|
|
505
510
|
return false;
|
|
506
511
|
};
|
|
507
512
|
export const checkUrl = async (scanner, url, browser, clonedDataDir, playwrightDeviceDetailsObject, extraHTTPHeaders, fileTypes) => {
|
|
508
|
-
|
|
513
|
+
let urlToCheck = url;
|
|
514
|
+
if (scanner === ScannerTypes.LOCALFILE) {
|
|
515
|
+
if (!isFilePath(url)) {
|
|
516
|
+
const res = new RES();
|
|
517
|
+
res.status = constants.urlCheckStatuses.notALocalFile.code;
|
|
518
|
+
return res;
|
|
519
|
+
}
|
|
520
|
+
if (!url.toLowerCase().startsWith('file://')) {
|
|
521
|
+
urlToCheck = pathToFileURL(path.resolve(url)).toString();
|
|
522
|
+
}
|
|
523
|
+
}
|
|
524
|
+
const res = await checkUrlConnectivityWithBrowser(urlToCheck, browser, clonedDataDir, playwrightDeviceDetailsObject, extraHTTPHeaders);
|
|
509
525
|
// If response is 200 (meaning no other code was set earlier)
|
|
510
526
|
if (res.status === constants.urlCheckStatuses.success.code) {
|
|
511
527
|
// Check if document is pdf type
|
|
@@ -552,7 +568,7 @@ export const prepareData = async (argv) => {
|
|
|
552
568
|
if (isEmptyObject(argv)) {
|
|
553
569
|
throw Error('No inputs should be provided');
|
|
554
570
|
}
|
|
555
|
-
let { scanner, headless, url, deviceChosen, customDevice, viewportWidth, maxpages, strategy, isLocalFileScan = argv.scanner === ScannerTypes.LOCALFILE, browserToRun, nameEmail, customFlowLabel, specifiedMaxConcurrency, fileTypes, blacklistedPatternsFilename, additional, metadata, followRobots, header, safeMode, exportDirectory, zip, ruleset, generateJsonFiles, scanDuration, } = argv;
|
|
571
|
+
let { scanner, headless, url, deviceChosen, customDevice, viewportWidth, maxpages, strategy, isLocalFileScan = argv.scanner === ScannerTypes.LOCALFILE, browserToRun, nameEmail, customFlowLabel, specifiedMaxConcurrency, fileTypes, blacklistedPatternsFilename, additional, metadata, followRobots, header, safeMode, exportDirectory, zip, ruleset, generateJsonFiles, scanDuration, finalUrl, } = argv;
|
|
556
572
|
const extraHTTPHeaders = parseHeaders(header);
|
|
557
573
|
// Set default username and password for basic auth
|
|
558
574
|
let username = '';
|
|
@@ -578,6 +594,9 @@ export const prepareData = async (argv) => {
|
|
|
578
594
|
temp.password = '';
|
|
579
595
|
url = temp.toString();
|
|
580
596
|
}
|
|
597
|
+
// Keep browser-resolved URL (if provided by pre-check flow) as canonical entry URL.
|
|
598
|
+
// For local file paths, keep using the normalized `url` value below.
|
|
599
|
+
const resolvedEntryUrl = finalUrl && !isFilePath(finalUrl) ? finalUrl : url;
|
|
581
600
|
// construct filename for scan results
|
|
582
601
|
const [date, time] = new Date().toLocaleString('sv').replaceAll(/-|:/g, '').split(' ');
|
|
583
602
|
const domain = isLocalFileScan ? path.basename(url) : new URL(url).hostname;
|
|
@@ -605,7 +624,7 @@ export const prepareData = async (argv) => {
|
|
|
605
624
|
return {
|
|
606
625
|
type: scanner,
|
|
607
626
|
url,
|
|
608
|
-
entryUrl:
|
|
627
|
+
entryUrl: resolvedEntryUrl,
|
|
609
628
|
isHeadless: headless,
|
|
610
629
|
deviceChosen,
|
|
611
630
|
customDevice,
|
|
@@ -810,6 +829,7 @@ export const getLinksFromSitemap = async (sitemapUrl, _maxLinksCount, browser, u
|
|
|
810
829
|
const scannedSitemaps = new Set();
|
|
811
830
|
const sitemapLinkCounts = {};
|
|
812
831
|
const allUrls = new Set(); // all discovered URLs (lightweight strings)
|
|
832
|
+
const isImageSitemapUrl = (candidateUrl) => /(^|\/)image-sitemap(?:-index)?(?:-\d+)?\.xml(?:$|[?#])/i.test(candidateUrl);
|
|
813
833
|
const addToUrlList = (url) => {
|
|
814
834
|
if (!url)
|
|
815
835
|
return;
|
|
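To see which child sitemaps the new `isImageSitemapUrl` pattern would skip, the regular expression from this hunk can be run against a few made-up URLs. The sample URLs are assumptions for illustration.

```javascript
// Regular expression copied from the hunk above; sample URLs are made up.
const isImageSitemapUrl = (candidateUrl) =>
  /(^|\/)image-sitemap(?:-index)?(?:-\d+)?\.xml(?:$|[?#])/i.test(candidateUrl);

const samples = [
  'https://example.com/image-sitemap.xml',        // true
  'https://example.com/image-sitemap-index.xml',  // true
  'https://example.com/Image-Sitemap-3.xml#top',  // true (case-insensitive)
  'https://example.com/sitemap.xml',              // false
  'https://example.com/my-image-sitemap.xml',     // false (name must start the path segment)
  'https://example.com/image-sitemap.xml.gz',     // false (must end at .xml, ? or #)
];

for (const sample of samples) {
  console.log(isImageSitemapUrl(sample), sample);
}
```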
@@ -880,6 +900,10 @@ export const getLinksFromSitemap = async (sitemapUrl, _maxLinksCount, browser, u
|
|
|
880
900
|
const fetchUrls = async (url, extraHTTPHeaders) => {
|
|
881
901
|
let data;
|
|
882
902
|
let sitemapType;
|
|
903
|
+
if (isImageSitemapUrl(url)) {
|
|
904
|
+
consoleLogger.info(`Skipping image sitemap: ${url}`);
|
|
905
|
+
return;
|
|
906
|
+
}
|
|
883
907
|
if (scannedSitemaps.has(url)) {
|
|
884
908
|
// Skip processing if the sitemap has already been scanned
|
|
885
909
|
return;
|
|
@@ -926,27 +950,45 @@ export const getLinksFromSitemap = async (sitemapUrl, _maxLinksCount, browser, u
|
|
|
926
950
|
});
|
|
927
951
|
}
|
|
928
952
|
const page = await browserContext.newPage();
|
|
929
|
-
|
|
930
|
-
|
|
931
|
-
|
|
932
|
-
|
|
933
|
-
|
|
934
|
-
|
|
935
|
-
|
|
936
|
-
|
|
937
|
-
|
|
938
|
-
|
|
939
|
-
|
|
940
|
-
data = await urlSet.evaluate(elem => elem.outerHTML);
|
|
953
|
+
// Use 'domcontentloaded' instead of 'networkidle' — sitemap XMLs with
|
|
954
|
+
// XSL stylesheet references (e.g. <?xml-stylesheet ...?>) cause the browser
|
|
955
|
+
// to fetch and apply the stylesheet, which may load additional resources
|
|
956
|
+
// (fonts, CSS, images) that prevent 'networkidle' from ever being reached.
|
|
957
|
+
const response = await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 60000 });
|
|
958
|
+
// Prefer the raw response body — this gives us the original XML before
|
|
959
|
+
// the browser applies any XSL transformation (which would turn the XML
|
|
960
|
+
// into rendered HTML, losing the sitemap structure).
|
|
961
|
+
if (response) {
|
|
962
|
+
try {
|
|
963
|
+
data = await response.text();
|
|
941
964
|
}
|
|
942
|
-
|
|
943
|
-
|
|
965
|
+
catch {
|
|
966
|
+
// response.text() can fail if the body was already consumed or
|
|
967
|
+
// if a redirect occurred; fall through to DOM extraction below.
|
|
944
968
|
}
|
|
945
|
-
|
|
946
|
-
|
|
969
|
+
}
|
|
970
|
+
if (!data) {
|
|
971
|
+
if ((await page.locator('body').count()) > 0) {
|
|
972
|
+
data = await page.locator('body').innerText();
|
|
947
973
|
}
|
|
948
|
-
else
|
|
949
|
-
|
|
974
|
+
else {
|
|
975
|
+
const urlSet = page.locator('urlset');
|
|
976
|
+
const sitemapIndex = page.locator('sitemapindex');
|
|
977
|
+
const rss = page.locator('rss');
|
|
978
|
+
const feed = page.locator('feed');
|
|
979
|
+
const isRoot = async (locator) => (await locator.count()) > 0;
|
|
980
|
+
if (await isRoot(urlSet)) {
|
|
981
|
+
data = await urlSet.evaluate(elem => elem.outerHTML);
|
|
982
|
+
}
|
|
983
|
+
else if (await isRoot(sitemapIndex)) {
|
|
984
|
+
data = await sitemapIndex.evaluate(elem => elem.outerHTML);
|
|
985
|
+
}
|
|
986
|
+
else if (await isRoot(rss)) {
|
|
987
|
+
data = await rss.evaluate(elem => elem.outerHTML);
|
|
988
|
+
}
|
|
989
|
+
else if (await isRoot(feed)) {
|
|
990
|
+
data = await feed.evaluate(elem => elem.outerHTML);
|
|
991
|
+
}
|
|
950
992
|
}
|
|
951
993
|
}
|
|
952
994
|
}
|
|
@@ -969,37 +1011,61 @@ export const getLinksFromSitemap = async (sitemapUrl, _maxLinksCount, browser, u
|
|
|
969
1011
|
data = fs.readFileSync(url, 'utf8');
|
|
970
1012
|
}
|
|
971
1013
|
const $ = cheerio.load(data, { xml: true });
|
|
1014
|
+
const countBefore = allUrls.size;
|
|
972
1015
|
// This case is when the document is not an XML format document
|
|
973
1016
|
if ($(':root').length === 0) {
|
|
974
1017
|
processNonStandardSitemap(data);
|
|
1018
|
+
const linksFromThisSitemap = allUrls.size - countBefore;
|
|
1019
|
+
if (linksFromThisSitemap > 0) {
|
|
1020
|
+
sitemapLinkCounts[url] = (sitemapLinkCounts[url] || 0) + linksFromThisSitemap;
|
|
1021
|
+
}
|
|
975
1022
|
return;
|
|
976
1023
|
}
|
|
977
1024
|
// Root element
|
|
978
1025
|
const root = $(':root')[0];
|
|
979
|
-
const
|
|
980
|
-
|
|
981
|
-
|
|
1026
|
+
const hasImageNamespace = Object.values(root?.attribs ?? {}).some(attribVal => typeof attribVal === 'string' && attribVal.toLowerCase().includes('sitemap-image'));
|
|
1027
|
+
if (hasImageNamespace) {
|
|
1028
|
+
consoleLogger.info(`Skipping image sitemap: ${url}`);
|
|
1029
|
+
return;
|
|
1030
|
+
}
|
|
1031
|
+
const rootName = root?.name?.toLowerCase().split(':').pop() ?? '';
|
|
1032
|
+
const hasXmlSitemapIndexTag = /<\s*(?:[a-z0-9_-]+:)?sitemapindex\b/i.test(data);
|
|
1033
|
+
const hasXmlUrlsetTag = /<\s*(?:[a-z0-9_-]+:)?urlset\b/i.test(data);
|
|
1034
|
+
if (rootName === 'urlset') {
|
|
982
1035
|
sitemapType = constants.xmlSitemapTypes.xml;
|
|
983
1036
|
}
|
|
984
|
-
else if (
|
|
1037
|
+
else if (rootName === 'sitemapindex') {
|
|
985
1038
|
sitemapType = constants.xmlSitemapTypes.xmlIndex;
|
|
986
1039
|
}
|
|
987
|
-
else if (
|
|
1040
|
+
else if (rootName === 'rss') {
|
|
988
1041
|
sitemapType = constants.xmlSitemapTypes.rss;
|
|
989
1042
|
}
|
|
990
|
-
else if (
|
|
1043
|
+
else if (rootName === 'feed') {
|
|
991
1044
|
sitemapType = constants.xmlSitemapTypes.atom;
|
|
992
1045
|
}
|
|
1046
|
+
else if (hasXmlSitemapIndexTag) {
|
|
1047
|
+
sitemapType = constants.xmlSitemapTypes.xmlIndex;
|
|
1048
|
+
}
|
|
1049
|
+
else if (hasXmlUrlsetTag) {
|
|
1050
|
+
sitemapType = constants.xmlSitemapTypes.xml;
|
|
1051
|
+
}
|
|
993
1052
|
else {
|
|
994
1053
|
sitemapType = constants.xmlSitemapTypes.unknown;
|
|
995
1054
|
}
|
|
996
|
-
const countBefore = allUrls.size;
|
|
997
1055
|
switch (sitemapType) {
|
|
998
1056
|
case constants.xmlSitemapTypes.xmlIndex:
|
|
999
|
-
consoleLogger.info(`This is a XML format sitemap index
|
|
1057
|
+
consoleLogger.info(`This is a XML format sitemap index: ${url}`);
|
|
1000
1058
|
for (const childSitemapUrl of $('loc')) {
|
|
1001
|
-
const childSitemapUrlText = $(childSitemapUrl).text();
|
|
1002
|
-
if (childSitemapUrlText
|
|
1059
|
+
const childSitemapUrlText = $(childSitemapUrl).text().trim();
|
|
1060
|
+
if (!childSitemapUrlText) {
|
|
1061
|
+
continue;
|
|
1062
|
+
}
|
|
1063
|
+
const childSitemapPath = childSitemapUrlText.split(/[?#]/)[0].toLowerCase();
|
|
1064
|
+
if (childSitemapPath.endsWith('.xml') || childSitemapPath.endsWith('.txt')) {
|
|
1065
|
+
if (isImageSitemapUrl(childSitemapUrlText)) {
|
|
1066
|
+
consoleLogger.info(`Skipping image sitemap: ${childSitemapUrlText}`);
|
|
1067
|
+
continue;
|
|
1068
|
+
}
|
|
1003
1069
|
await fetchUrls(childSitemapUrlText, extraHTTPHeaders); // Recursive call for nested sitemaps
|
|
1004
1070
|
}
|
|
1005
1071
|
else {
|
|
@@ -1008,19 +1074,19 @@ export const getLinksFromSitemap = async (sitemapUrl, _maxLinksCount, browser, u
|
|
|
1008
1074
|
}
|
|
1009
1075
|
break;
|
|
1010
1076
|
case constants.xmlSitemapTypes.xml:
|
|
1011
|
-
consoleLogger.info(`This is a XML format sitemap
|
|
1077
|
+
consoleLogger.info(`This is a XML format sitemap: ${url}`);
|
|
1012
1078
|
await processXmlSitemap($, sitemapType, 'loc', 'lastmod', 'url');
|
|
1013
1079
|
break;
|
|
1014
1080
|
case constants.xmlSitemapTypes.rss:
|
|
1015
|
-
consoleLogger.info(`This is a RSS format sitemap
|
|
1081
|
+
consoleLogger.info(`This is a RSS format sitemap: ${url}`);
|
|
1016
1082
|
await processXmlSitemap($, sitemapType, 'link', 'pubDate', 'item');
|
|
1017
1083
|
break;
|
|
1018
1084
|
case constants.xmlSitemapTypes.atom:
|
|
1019
|
-
consoleLogger.info(`This is a Atom format sitemap
|
|
1085
|
+
consoleLogger.info(`This is a Atom format sitemap: ${url}`);
|
|
1020
1086
|
await processXmlSitemap($, sitemapType, 'link', 'published', 'entry');
|
|
1021
1087
|
break;
|
|
1022
1088
|
default:
|
|
1023
|
-
consoleLogger.info(`This is an unrecognised XML sitemap format
|
|
1089
|
+
consoleLogger.info(`This is an unrecognised XML sitemap format: ${url}`);
|
|
1024
1090
|
processNonStandardSitemap(data);
|
|
1025
1091
|
}
|
|
1026
1092
|
const linksFromThisSitemap = allUrls.size - countBefore;
|
|
@@ -1836,7 +1902,8 @@ function isValidHttpUrl(urlString) {
|
|
|
1836
1902
|
export const isFilePath = (url) => {
|
|
1837
1903
|
const driveLetterPattern = /^[A-Z]:/i;
|
|
1838
1904
|
const backslashPattern = /\\/;
|
|
1839
|
-
return (url.startsWith('
|
|
1905
|
+
return (url.toLowerCase().startsWith('file://') ||
|
|
1906
|
+
url.startsWith('/') ||
|
|
1840
1907
|
driveLetterPattern.test(url) ||
|
|
1841
1908
|
backslashPattern.test(url) ||
|
|
1842
1909
|
url.startsWith('./') ||
|
package/dist/crawlers/commonCrawlerFunc.js
CHANGED
|
@@ -898,10 +898,19 @@ export const createCrawleeSubFolders = async (randomToken) => {
|
|
|
898
898
|
export const preNavigationHooks = (extraHTTPHeaders) => {
|
|
899
899
|
return [
|
|
900
900
|
async (crawlingContext, gotoOptions) => {
|
|
901
|
-
if (extraHTTPHeaders) {
|
|
901
|
+
if (extraHTTPHeaders && Object.keys(extraHTTPHeaders).length > 0) {
|
|
902
902
|
crawlingContext.request.headers = extraHTTPHeaders;
|
|
903
903
|
}
|
|
904
|
-
|
|
904
|
+
// Use domcontentloaded — fires as soon as the DOM is parsed, before
|
|
905
|
+
// images/stylesheets/network requests settle. This avoids indefinite
|
|
906
|
+
// hangs on sites with WebSockets, analytics polling, or infinite-scroll
|
|
907
|
+
// beacons that never reach networkidle. Further page stability is
|
|
908
|
+
// handled by waitForPageLoaded() in each crawler's requestHandler and
|
|
909
|
+
// by the DOM mutation observer in postNavigationHooks.
|
|
910
|
+
if (gotoOptions) {
|
|
911
|
+
gotoOptions.waitUntil = 'domcontentloaded';
|
|
912
|
+
gotoOptions.timeout = 30000;
|
|
913
|
+
}
|
|
905
914
|
},
|
|
906
915
|
];
|
|
907
916
|
};
|
package/dist/crawlers/crawlDomain.js
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
import crawlee from 'crawlee';
|
|
2
2
|
import { CrawlRateController } from './crawlRateController.js';
|
|
3
|
-
import { createCrawleeSubFolders, getPreLaunchHook, runAxeScript, isUrlPdf, shouldSkipClickDueToDisallowedHref, shouldSkipDueToUnsupportedContent, splitAuthHeaders, } from './commonCrawlerFunc.js';
|
|
3
|
+
import { createCrawleeSubFolders, getPreLaunchHook, preNavigationHooks, runAxeScript, isUrlPdf, shouldSkipClickDueToDisallowedHref, shouldSkipDueToUnsupportedContent, splitAuthHeaders, } from './commonCrawlerFunc.js';
|
|
4
4
|
import constants, { blackListedFileExtensions, guiInfoStatusTypes, cssQuerySelectors, STATUS_CODE_METADATA, disallowedListOfPatterns, disallowedSelectorPatterns, FileTypes, } from '../constants/constants.js';
|
|
5
5
|
import { getPlaywrightLaunchOptions, isBlacklistedFileExtensions, isSkippedUrl, isDisallowedInRobotsTxt, getUrlsFromRobotsTxt, waitForPageLoaded, } from '../constants/common.js';
|
|
6
6
|
import { areLinksEqual, isFollowStrategy, isSameHostname, normUrl, register } from '../utils.js';
|
|
@@ -301,12 +301,10 @@ const crawlDomain = async ({ url, randomToken, host: _host, viewportSettings, ma
|
|
|
301
301
|
],
|
|
302
302
|
},
|
|
303
303
|
requestQueue,
|
|
304
|
+
maxRequestRetries: 3,
|
|
305
|
+
maxSessionRotations: 1,
|
|
304
306
|
preNavigationHooks: [
|
|
305
|
-
|
|
306
|
-
if (extraHTTPHeaders) {
|
|
307
|
-
crawlingContext.request.headers = extraHTTPHeaders;
|
|
308
|
-
}
|
|
309
|
-
},
|
|
307
|
+
...preNavigationHooks(extraHTTPHeaders),
|
|
310
308
|
],
|
|
311
309
|
postNavigationHooks: [
|
|
312
310
|
async (crawlingContext) => {
|
package/dist/crawlers/crawlSitemap.js
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
import crawlee, { EnqueueStrategy, RequestList } from 'crawlee';
|
|
2
2
|
import { CrawlRateController } from './crawlRateController.js';
|
|
3
|
-
import { createCrawleeSubFolders, getPreLaunchHook, preNavigationHooks, runAxeScript, } from './commonCrawlerFunc.js';
|
|
3
|
+
import { createCrawleeSubFolders, getPreLaunchHook, preNavigationHooks, runAxeScript, splitAuthHeaders, } from './commonCrawlerFunc.js';
|
|
4
4
|
import constants, { STATUS_CODE_METADATA, guiInfoStatusTypes, disallowedListOfPatterns, FileTypes, } from '../constants/constants.js';
|
|
5
5
|
import { getLinksFromSitemap, getPlaywrightLaunchOptions, isSkippedUrl, waitForPageLoaded, isFilePath, } from '../constants/common.js';
|
|
6
6
|
import { areLinksEqual, isFollowStrategy, isWhitelistedContentType, normUrl, register } from '../utils.js';
|
|
@@ -13,6 +13,7 @@ const crawlSitemap = async ({ sitemapUrl, randomToken, host, viewportSettings, m
|
|
|
13
13
|
let durationExceeded = false;
|
|
14
14
|
let isAbortingScan = false;
|
|
15
15
|
const rateController = new CrawlRateController(maxRequestsPerCrawl, specifiedMaxConcurrency || constants.maxConcurrency);
|
|
16
|
+
const initialNoSuccessFailureAbortThreshold = Math.max(5, Math.min(maxRequestsPerCrawl, 25));
|
|
16
17
|
if (fromCrawlIntelligentSitemap) {
|
|
17
18
|
dataset = datasetFromIntelligent;
|
|
18
19
|
urlsCrawled = urlsCrawledFromIntelligent;
|
|
@@ -33,6 +34,7 @@ const crawlSitemap = async ({ sitemapUrl, randomToken, host, viewportSettings, m
|
|
|
33
34
|
const isScanPdfs = [FileTypes.All, FileTypes.PdfOnly].includes(fileTypes);
|
|
34
35
|
const { playwrightDeviceDetailsObject } = viewportSettings;
|
|
35
36
|
const { maxConcurrency } = constants;
|
|
37
|
+
const { nonAuthHeaders, httpCredentials } = splitAuthHeaders(extraHTTPHeaders);
|
|
36
38
|
const requestList = await RequestList.open({
|
|
37
39
|
sources: linksFromSitemap,
|
|
38
40
|
});
|
|
@@ -53,11 +55,15 @@ const crawlSitemap = async ({ sitemapUrl, randomToken, host, viewportSettings, m
|
|
|
53
55
|
...playwrightDeviceDetailsObject,
|
|
54
56
|
...(process.env.OOBEE_USER_AGENT && { userAgent: process.env.OOBEE_USER_AGENT }),
|
|
55
57
|
...(process.env.OOBEE_DISABLE_BROWSER_DOWNLOAD && { acceptDownloads: false }),
|
|
58
|
+
...(nonAuthHeaders && { extraHTTPHeaders: nonAuthHeaders }),
|
|
59
|
+
...(httpCredentials && { httpCredentials }),
|
|
56
60
|
};
|
|
57
61
|
},
|
|
58
62
|
],
|
|
59
63
|
},
|
|
60
64
|
requestList,
|
|
65
|
+
maxRequestRetries: 3,
|
|
66
|
+
maxSessionRotations: 1,
|
|
61
67
|
postNavigationHooks: [
|
|
62
68
|
async ({ page }) => {
|
|
63
69
|
try {
|
|
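Note on the `splitAuthHeaders` hunk above: the helper itself lives in `commonCrawlerFunc.js` and its body is not part of this hunk, so the following is only a guess at the kind of split it could perform. The sketch assumes that an `Authorization: Basic …` header is turned into Playwright-style `httpCredentials` (`username` / `password`) and that every other header is passed through as `extraHTTPHeaders`. The name `splitAuthHeadersSketch` and the Basic-only rule are hypothetical.

```javascript
// Hypothetical sketch; the real splitAuthHeaders is not shown in this diff.
const splitAuthHeadersSketch = (headers) => {
  if (!headers) return { nonAuthHeaders: null, httpCredentials: null };

  const nonAuthHeaders = {};
  let httpCredentials = null;

  for (const [name, value] of Object.entries(headers)) {
    // Assumption: only "Authorization: Basic <base64>" is treated as credentials.
    const match =
      name.toLowerCase() === 'authorization' && /^Basic\s+(.+)$/i.exec(String(value));
    if (match) {
      const decoded = Buffer.from(match[1], 'base64').toString('utf8');
      const sep = decoded.indexOf(':');
      if (sep !== -1) {
        httpCredentials = {
          username: decoded.slice(0, sep),
          password: decoded.slice(sep + 1),
        };
        continue; // do not forward the header once it became credentials
      }
    }
    nonAuthHeaders[name] = value;
  }

  return {
    nonAuthHeaders: Object.keys(nonAuthHeaders).length > 0 ? nonAuthHeaders : null,
    httpCredentials,
  };
};

// Same conditional-spread shape as the context options in the hunk above.
const buildContextOptions = (extraHTTPHeaders) => {
  const { nonAuthHeaders, httpCredentials } = splitAuthHeadersSketch(extraHTTPHeaders);
  return {
    ...(nonAuthHeaders && { extraHTTPHeaders: nonAuthHeaders }),
    ...(httpCredentials && { httpCredentials }),
  };
};
```

Spreading `null` into an object literal adds no keys, which is why the `&&` guards can be used directly inside the options object.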
@@ -104,6 +110,7 @@ const crawlSitemap = async ({ sitemapUrl, randomToken, host, viewportSettings, m
|
|
|
104
110
|
},
|
|
105
111
|
],
|
|
106
112
|
preNavigationHooks: [
|
|
113
|
+
...preNavigationHooks(extraHTTPHeaders),
|
|
107
114
|
async ({ request, page }, gotoOptions) => {
|
|
108
115
|
const url = request.url.toLowerCase();
|
|
109
116
|
const isNotSupportedDocument = disallowedListOfPatterns.some(pattern => url.startsWith(pattern));
|
|
@@ -114,7 +121,6 @@ const crawlSitemap = async ({ sitemapUrl, randomToken, host, viewportSettings, m
|
|
|
114
121
|
// console.log(`[SKIP] Not supported: ${request.url}`);
|
|
115
122
|
return;
|
|
116
123
|
}
|
|
117
|
-
preNavigationHooks(extraHTTPHeaders);
|
|
118
124
|
},
|
|
119
125
|
],
|
|
120
126
|
requestHandlerTimeoutSecs: 90,
|
|
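Note on the two `preNavigationHooks` hunks above: the removed line called `preNavigationHooks(extraHTTPHeaders)` inside an existing hook and ignored the result, while the added line spreads the returned hooks into the array. The sketch below illustrates why that matters. `makePreNavigationHooks` is a hypothetical stand-in modelled on the inline hook removed from `crawlDomain.js`; the real helper in `commonCrawlerFunc.js` is not shown in this diff and may do more.

```javascript
// Hypothetical stand-in for a hook factory such as preNavigationHooks(extraHTTPHeaders).
const makePreNavigationHooks = (extraHTTPHeaders) => [
  async (crawlingContext) => {
    if (extraHTTPHeaders) {
      crawlingContext.request.headers = extraHTTPHeaders;
    }
  },
];

// Runs every hook in order against the same context, roughly as a crawler would.
const runHooks = async (hooks, crawlingContext) => {
  for (const hook of hooks) {
    await hook(crawlingContext);
  }
  return crawlingContext;
};

const headers = { 'X-Example': '1' };

// Before: the factory is called inside a hook and its return value is dropped,
// so the hooks it builds never run.
const hooksBefore = [
  async () => {
    makePreNavigationHooks(headers);
  },
];

// After: the returned hooks are spread into the array and run like any other hook.
const hooksAfter = [...makePreNavigationHooks(headers)];
```

With `hooksBefore` the request context is left untouched; with `hooksAfter` the headers are applied before navigation.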
@@ -310,6 +316,12 @@ const crawlSitemap = async ({ sitemapUrl, randomToken, host, viewportSettings, m
|
|
|
310
316
|
httpStatusCode: typeof status === 'number' ? status : 0,
|
|
311
317
|
});
|
|
312
318
|
crawlee.log.error(`Failed Request - ${request.url}: ${request.errorMessages}`);
|
|
319
|
+
if (urlsCrawled.scanned.length === 0 &&
|
|
320
|
+
urlsCrawled.error.length >= initialNoSuccessFailureAbortThreshold) {
|
|
321
|
+
consoleLogger.info(`Aborting sitemap crawl: ${urlsCrawled.error.length} failed pages with 0 successful scans.`);
|
|
322
|
+
isAbortingScan = true;
|
|
323
|
+
crawler.autoscaledPool?.abort();
|
|
324
|
+
}
|
|
313
325
|
},
|
|
314
326
|
maxRequestsPerCrawl: Infinity,
|
|
315
327
|
maxConcurrency: specifiedMaxConcurrency || maxConcurrency,
|
|
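Note on the early-abort hunk above: the threshold is `maxRequestsPerCrawl` clamped to the range 5 to 25, and the crawl is aborted only while no page has been scanned successfully. A minimal sketch of that arithmetic follows; the function names are illustrative, and only the two expressions are taken from the diff.

```javascript
// Clamp the page budget into [5, 25], as in the added constant.
const initialFailureAbortThreshold = (maxRequestsPerCrawl) =>
  Math.max(5, Math.min(maxRequestsPerCrawl, 25));

// Abort only when nothing has succeeded and failures have reached the threshold.
const shouldAbortSitemapCrawl = (urlsCrawled, threshold) =>
  urlsCrawled.scanned.length === 0 && urlsCrawled.error.length >= threshold;

// Example: a 100-page budget is capped at 25 failures, a 1-page budget is raised to 5.
const thresholdForLargeScan = initialFailureAbortThreshold(100);
const thresholdForTinyScan = initialFailureAbortThreshold(1);
```

A single successful scan disables the check for the rest of the crawl, because the condition requires `scanned` to be empty.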
@@ -1064,15 +1064,28 @@ export const initNewPage = async (page, pageClosePromises, processPageParams, pa
|
|
|
1064
1064
|
return;
|
|
1065
1065
|
const allowed = isOverlayAllowed(page.url(), processPageParams.entryUrl);
|
|
1066
1066
|
if (!allowed) {
|
|
1067
|
-
|
|
1068
|
-
|
|
1069
|
-
|
|
1070
|
-
|
|
1071
|
-
|
|
1072
|
-
|
|
1073
|
-
|
|
1074
|
-
|
|
1075
|
-
|
|
1067
|
+
// On macOS and Windows the custom flow always runs headful.
|
|
1068
|
+
// The URL guard (urlGuard.ts) intercepts non-http/https navigations
|
|
1069
|
+
// and calls page.goto(safeUrl). Do NOT remove the overlay here —
|
|
1070
|
+
// removing it causes it to stay permanently disabled if the redirect
|
|
1071
|
+
// races ahead of the next reconcile cycle.
|
|
1072
|
+
// Instead, fall through to the hasOverlay / addOverlayMenu block so
|
|
1073
|
+
// the overlay is (re-)injected even on transient non-http/https URLs
|
|
1074
|
+
// (e.g. file://, about:blank) and again after the guard's redirect.
|
|
1075
|
+
const isDesktopHost = process.platform === 'darwin' || process.platform === 'win32';
|
|
1076
|
+
if (!isDesktopHost) {
|
|
1077
|
+
// On Linux / Docker: remove overlay for non-http/https URLs and stop.
|
|
1078
|
+
await Promise.race([
|
|
1079
|
+
removeOverlayMenu(page),
|
|
1080
|
+
new Promise((_, reject) => {
|
|
1081
|
+
setTimeout(() => {
|
|
1082
|
+
reject(new Error(`removeOverlayMenu timed out after ${OVERLAY_OPERATION_TIMEOUT_MS}ms`));
|
|
1083
|
+
}, OVERLAY_OPERATION_TIMEOUT_MS);
|
|
1084
|
+
}),
|
|
1085
|
+
]);
|
|
1086
|
+
return;
|
|
1087
|
+
}
|
|
1088
|
+
// Desktop hosts: skip removal and fall through to re-add overlay.
|
|
1076
1089
|
}
|
|
1077
1090
|
const hasOverlay = await page.evaluate(() => Boolean(document.querySelector('#oobeeShadowHost')));
|
|
1078
1091
|
consoleLogger.info(`Overlay state (${trigger}): ${hasOverlay}`);
|
|
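Note on the `Promise.race` block in the hunk above: it bounds `removeOverlayMenu(page)` with a timer that rejects after `OVERLAY_OPERATION_TIMEOUT_MS`. The same pattern can be sketched as a small reusable helper. `withTimeout` is a hypothetical name, and clearing the timer in `finally` is an addition of this sketch that the diff's inline version does not include.

```javascript
// Race an operation against a rejecting timer; whichever settles first wins.
const withTimeout = (operation, timeoutMs, label) => {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${timeoutMs}ms`));
    }, timeoutMs);
  });
  // Clear the timer either way so a finished operation leaves no pending timeout.
  return Promise.race([operation, timeout]).finally(() => clearTimeout(timer));
};

// Usage with stand-in operations (not the real removeOverlayMenu):
const fastOperation = () => new Promise((resolve) => setTimeout(() => resolve('done'), 5));
const slowOperation = () => new Promise((resolve) => setTimeout(() => resolve('late'), 200));
```

Note that losing the race does not cancel the underlying operation; it only stops the caller from waiting on it.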
@@ -30,8 +30,20 @@ export function addUrlGuardScript(context, opts = {}) {
|
|
|
30
30
|
// page may have closed before addInitScript completed; safe to ignore
|
|
31
31
|
});
|
|
32
32
|
const restoreToSafeUrl = async (page, attemptedUrl) => {
|
|
33
|
+
const safeUrl = lastAllowedUrlByPage.get(page) || fallbackUrl || 'about:blank';
|
|
34
|
+
// Only redirect if the safe URL is itself an allowed (http/https) URL.
|
|
35
|
+
// If the entry URL is file:// (e.g. scanning a local HTML file), the
|
|
36
|
+
// fallback is also file://, and redirecting would create an infinite loop:
|
|
37
|
+
// file:// → restoreToSafeUrl → file:// → framenavigated → restoreToSafeUrl → …
|
|
38
|
+
try {
|
|
39
|
+
const safeObj = new URL(safeUrl);
|
|
40
|
+
if (!ALLOWED_PROTOCOLS.has(safeObj.protocol))
|
|
41
|
+
return;
|
|
42
|
+
}
|
|
43
|
+
catch {
|
|
44
|
+
return;
|
|
45
|
+
}
|
|
33
46
|
try {
|
|
34
|
-
const safeUrl = lastAllowedUrlByPage.get(page) || fallbackUrl || 'about:blank';
|
|
35
47
|
await page.goto(safeUrl, { waitUntil: 'domcontentloaded' });
|
|
36
48
|
}
|
|
37
49
|
catch {
|
|
@@ -53,6 +65,12 @@ export function addUrlGuardScript(context, opts = {}) {
|
|
|
53
65
|
lastAllowedUrlByPage.set(page, urlObj.toString());
|
|
54
66
|
return;
|
|
55
67
|
}
|
|
68
|
+
// Skip browser-internal transitional states (about:blank, about:srcdoc, etc.).
|
|
69
|
+
// page.goto() navigates through about:blank before loading the target URL.
|
|
70
|
+
// Redirecting from about: creates an infinite loop:
|
|
71
|
+
// restoreToSafeUrl → page.goto(safeUrl) → about:blank → restoreToSafeUrl → …
|
|
72
|
+
if (urlObj.protocol === 'about:')
|
|
73
|
+
return;
|
|
56
74
|
await restoreToSafeUrl(page, urlStr);
|
|
57
75
|
});
|
|
58
76
|
};
|
|
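Note on the two `urlGuard.js` hunks above: together they define three outcomes for a navigation, namely record it as allowed, ignore it, or redirect to the last safe URL. The sketch below condenses that decision into one pure function. It assumes `ALLOWED_PROTOCOLS` contains `http:` and `https:` (the diff's comments say so, but the definition is outside these hunks); `decideNavigation` and the handling of unparseable current URLs are hypothetical.

```javascript
// Assumption: mirrors the "(http/https)" wording in the diff's comments.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

// A redirect target is usable only if it parses and uses an allowed protocol.
const isSafeRedirectTarget = (safeUrl) => {
  try {
    return ALLOWED_PROTOCOLS.has(new URL(safeUrl).protocol);
  } catch {
    return false;
  }
};

// Returns 'allow', 'ignore' or 'redirect' for a navigated URL.
const decideNavigation = (currentUrl, safeUrl) => {
  let current;
  try {
    current = new URL(currentUrl);
  } catch {
    return 'ignore'; // assumption: unparseable URLs are left alone
  }
  if (ALLOWED_PROTOCOLS.has(current.protocol)) return 'allow';
  // about:blank and friends are transitional states of page.goto().
  if (current.protocol === 'about:') return 'ignore';
  // Redirecting to a file:// or about: fallback would re-trigger the guard.
  return isSafeRedirectTarget(safeUrl) ? 'redirect' : 'ignore';
};
```

Both early returns exist to break the loops described in the diff's comments: one for `about:` transitions during `page.goto()`, and one for scans whose entry URL is itself `file://`.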
@@ -7,6 +7,7 @@
|
|
|
7
7
|
<button
|
|
8
8
|
type="button"
|
|
9
9
|
class="category-tooltip-icon"
|
|
10
|
+
aria-label="About Must Fix category"
|
|
10
11
|
aria-describedby="mustFixTooltip"
|
|
11
12
|
>
|
|
12
13
|
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14"
|
|
@@ -34,6 +35,7 @@
|
|
|
34
35
|
<button
|
|
35
36
|
type="button"
|
|
36
37
|
class="category-tooltip-icon"
|
|
38
|
+
aria-label="About Good to Fix category"
|
|
37
39
|
aria-describedby="goodToFixTooltip"
|
|
38
40
|
>
|
|
39
41
|
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14"
|
|
@@ -61,6 +63,7 @@
|
|
|
61
63
|
<button
|
|
62
64
|
type="button"
|
|
63
65
|
class="category-tooltip-icon"
|
|
66
|
+
aria-label="About Manual Test category"
|
|
64
67
|
aria-describedby="manualTestTooltip"
|
|
65
68
|
>
|
|
66
69
|
<svg xmlns="http://www.w3.org/2000/svg" width="14" height="14"
|
|
@@ -2,21 +2,21 @@
|
|
|
2
2
|
<table class="issues-table" id="issuesTable">
|
|
3
3
|
<thead>
|
|
4
4
|
<tr>
|
|
5
|
-
<th class="sortable"
|
|
5
|
+
<th class="sortable" tabindex="0" aria-sort="none" style="width: 15%;">
|
|
6
6
|
<span>Severity</span>
|
|
7
7
|
<svg class="sort-icon" width="24" height="24" viewBox="0 0 24 24" fill="none" aria-hidden="true">
|
|
8
8
|
<path d="M7 9L12 4L17 9H7Z" fill="currentColor" opacity="1" />
|
|
9
9
|
<path d="M7 15L12 20L17 15H7Z" fill="currentColor" opacity="0.3" />
|
|
10
10
|
</svg>
|
|
11
11
|
</th>
|
|
12
|
-
<th class="sortable"
|
|
12
|
+
<th class="sortable" tabindex="0" aria-sort="none">
|
|
13
13
|
<span>Issue Name</span>
|
|
14
14
|
<svg class="sort-icon" width="24" height="24" viewBox="0 0 24 24" fill="none" aria-hidden="true">
|
|
15
15
|
<path d="M7 9L12 4L17 9H7Z" fill="currentColor" opacity="0.3" />
|
|
16
16
|
<path d="M7 15L12 20L17 15H7Z" fill="currentColor" opacity="1" />
|
|
17
17
|
</svg>
|
|
18
18
|
</th>
|
|
19
|
-
<th class="sortable"
|
|
19
|
+
<th class="sortable" tabindex="0" aria-sort="descending" style="width: 15%;">
|
|
20
20
|
<span>Occurrence</span>
|
|
21
21
|
<svg class="sort-icon" width="24" height="24" viewBox="0 0 24 24" fill="none" aria-hidden="true">
|
|
22
22
|
<path d="M7 9L12 4L17 9H7Z" fill="currentColor" opacity="0.3" />
|
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
<div id="aboutScanModal" class="modal fade" tabindex="-1" aria-
|
|
1
|
+
<div id="aboutScanModal" class="modal fade" tabindex="-1" aria-label="About this scan" aria-hidden="true">
|
|
2
2
|
<div class="modal-dialog modal-dialog-centered">
|
|
3
3
|
<div class="modal-content">
|
|
4
4
|
<div class="modal-header">
|