bluera-knowledge 0.18.1 → 0.19.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +20 -0
- package/README.md +19 -28
- package/dist/{brands-3EYIYV6T.js → brands-XDTIHFNU.js} +2 -1
- package/dist/{chunk-EZXJ3W5X.js → chunk-27Y4ENUD.js} +2 -2
- package/dist/chunk-DGUM43GV.js +11 -0
- package/dist/{chunk-VUGQ7HAR.js → chunk-EQYSYRQJ.js} +129 -50
- package/dist/chunk-EQYSYRQJ.js.map +1 -0
- package/dist/{chunk-RDDGZIDL.js → chunk-KQLTWB4T.js} +44 -120
- package/dist/chunk-KQLTWB4T.js.map +1 -0
- package/dist/index.js +5 -4
- package/dist/index.js.map +1 -1
- package/dist/mcp/server.d.ts +0 -29
- package/dist/mcp/server.js +3 -2
- package/dist/{watch.service-VDSUQ72Z.js → watch.service-NXRWLJG6.js} +2 -1
- package/dist/watch.service-NXRWLJG6.js.map +1 -0
- package/dist/workers/background-worker-cli.js +3 -2
- package/dist/workers/background-worker-cli.js.map +1 -1
- package/package.json +7 -5
- package/python/ast_worker.py +209 -0
- package/dist/chunk-RDDGZIDL.js.map +0 -1
- package/dist/chunk-VUGQ7HAR.js.map +0 -1
- package/python/crawl_worker.py +0 -280
- /package/dist/{brands-3EYIYV6T.js.map → brands-XDTIHFNU.js.map} +0 -0
- /package/dist/{chunk-EZXJ3W5X.js.map → chunk-27Y4ENUD.js.map} +0 -0
- /package/dist/{watch.service-VDSUQ72Z.js.map → chunk-DGUM43GV.js.map} +0 -0
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,26 @@
 
 All notable changes to this project will be documented in this file. See [commit-and-tag-version](https://github.com/absolute-version/commit-and-tag-version) for commit guidelines.
 
+## [0.19.0](https://github.com/blueraai/bluera-knowledge/compare/v0.18.2...v0.19.0) (2026-01-29)
+
+
+### Features
+
+* **ci:** add local npm package validation with npm link ([bff2aaf](https://github.com/blueraai/bluera-knowledge/commit/bff2aaf3ae8dbba4956669dd305cf9fb5affed59))
+
+
+### Bug Fixes
+
+* **validate:** handle empty crawl results in simple mode test ([2c266a0](https://github.com/blueraai/bluera-knowledge/commit/2c266a0d0cb3b1c7ed804f42f0c0204ab5136acb))
+
+## [0.18.2](https://github.com/blueraai/bluera-knowledge/compare/v0.18.1...v0.18.2) (2026-01-29)
+
+
+### Bug Fixes
+
+* add tests for tree-sitter unavailable paths and improve coverage ([febde0d](https://github.com/blueraai/bluera-knowledge/commit/febde0d7d6ea89ae928e99b20294a4d3e42f5e29))
+* make tree-sitter optional and use bun for integration tests ([6c50cca](https://github.com/blueraai/bluera-knowledge/commit/6c50cca142020899b7e4758b4026da18466e87b0))
+
 ## [0.18.1](https://github.com/blueraai/bluera-knowledge/compare/v0.18.0...v0.18.1) (2026-01-28)
 
 ## [0.18.0](https://github.com/blueraai/bluera-knowledge/compare/v0.17.2...v0.18.0) (2026-01-28)
package/README.md
CHANGED
@@ -5,7 +5,7 @@
 
 
 
-
+
 
 > 🚀 **Build a local knowledge base for your AI coding agent—dependency source code, crawled docs, and your own files, all instantly searchable.**
 
@@ -82,7 +82,7 @@ Works with any AI coding tool, editor, CI/CD pipeline, or automation.
 Adds slash commands, MCP tools, and Skills for optimal Claude Code integration.
 
 > [!NOTE]
-> **First launch may
+> **First launch may take a moment** while the plugin installs Playwright browser binaries for web crawling. This is a one-time setup—subsequent launches are instant.
 
 ---
 
@@ -193,7 +193,7 @@ bluera-knowledge store list
 - **🔍 Automatic Repository Discovery** - Queries package registries (NPM, PyPI, crates.io, Go modules) to automatically find GitHub repository URLs
 - **📦 Git Repository Indexing** - Clones and indexes dependency source code for both semantic search and direct file access
 - **📁 Local Folder Indexing** - Indexes any local content - documentation, standards, reference materials, or custom content
-- **🌐 Web Crawling** - Crawl and index web pages using
+- **🌐 Web Crawling** - Crawl and index web pages using Node.js Playwright - convert documentation sites to searchable markdown
 
 ### 🔍 Search Modes
 
@@ -478,14 +478,13 @@ If the issue persists, check that Claude Code is v2.0.65 or later (earlier versi
 <details>
 <summary><b>🌐 Web crawling fails</b></summary>
 
-Check
+Check Playwright browser installation:
 
 ```bash
-
-pip install crawl4ai
+npx playwright install chromium
 ```
 
-The plugin attempts to auto-install
+The plugin attempts to auto-install Playwright browsers on first use, but manual installation may be needed in some environments.
 </details>
 
 <details>
@@ -569,34 +568,26 @@ Logs are JSON formatted (NDJSON) and can be processed with `jq` for pretty-print
 
 ## 🔧 Dependencies
 
-The plugin automatically checks for and attempts to install
+The plugin automatically checks for and attempts to install browser binaries on first use:
 
-**Required:**
--
-
-
+**Required for Web Crawling:**
+- **🎭 Playwright Chromium** - Required for headless browser crawling (auto-installed via SessionStart hook)
+
+**Optional:**
+- **🐍 Python 3.8+** - Only needed for Python AST parsing in code-graph features (not required for web crawling)
 
 **What the SessionStart hook installs:**
-- ✅
-- ✅ Playwright Chromium browser binaries
+- ✅ Node.js dependencies (via bun/npm)
+- ✅ Playwright Chromium browser binaries
 
 If auto-installation fails, install manually:
 
 ```bash
-
-playwright install chromium
-```
-
-**Disable auto-install (security-conscious environments):**
-
-Set the `BK_SKIP_AUTO_INSTALL` environment variable to disable automatic pip package installation:
-
-```bash
-export BK_SKIP_AUTO_INSTALL=1
+npx playwright install chromium
 ```
 
 > [!NOTE]
->
+> Web crawling uses Node.js Playwright directly—no Python dependencies required. The default mode uses headless browser for maximum compatibility with JavaScript-rendered sites. Use `--fast` for static sites when speed is critical.
 
 **Update Plugin:**
 ```bash
@@ -737,7 +728,7 @@ This ensures:
 - **🧠 Semantic Search** - AI-powered vector embeddings with [LanceDB](https://github.com/lancedb/lancedb)
 - **📦 Git Operations** - Native git clone
 - **💻 CLI** - [Commander.js](https://github.com/tj/commander.js)
-- **🕷️ Web Crawling** - [
+- **🕷️ Web Crawling** - [Playwright](https://github.com/microsoft/playwright) (Node.js headless browser) with [Cheerio](https://github.com/cheeriojs/cheerio) (HTML parsing)
 
 ---
 
@@ -769,10 +760,10 @@ MIT - See [LICENSE](./LICENSE) for details.
 This project includes software developed by third parties. See [NOTICE](./NOTICE) for full attribution.
 
 Key dependencies:
-- **[
+- **[Playwright](https://github.com/microsoft/playwright)** - Headless browser for web crawling (Apache-2.0)
 - **[LanceDB](https://github.com/lancedb/lancedb)** - Vector database (Apache-2.0)
 - **[Hugging Face Transformers](https://github.com/huggingface/transformers.js)** - Embeddings (Apache-2.0)
-- **[
+- **[Cheerio](https://github.com/cheeriojs/cheerio)** - HTML parsing (MIT)
 
 ---
 
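For context on the troubleshooting change above: the package now probes for the Chromium binaries itself before crawling. The check below is reproduced from `src/crawl/playwright-crawler.ts` (shipped in the new `chunk-EQYSYRQJ.js.map` sources further down), with the original logging elided:

```ts
import { chromium } from "playwright";

// Fast installation probe that avoids launching a browser.
export function isPlaywrightInstalled(): boolean {
  try {
    chromium.executablePath(); // throws while Chromium binaries are missing
    return true;
  } catch {
    return false;
  }
}
```

If this returns false, `npx playwright install chromium` from the README snippet above is the manual fix.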
package/dist/{brands-3EYIYV6T.js → brands-XDTIHFNU.js}
CHANGED
@@ -4,10 +4,11 @@ import {
   isDocumentId,
   isStoreId
 } from "./chunk-CLIMKLTW.js";
+import "./chunk-DGUM43GV.js";
 export {
   createDocumentId,
   createStoreId,
   isDocumentId,
   isStoreId
 };
-//# sourceMappingURL=brands-
+//# sourceMappingURL=brands-XDTIHFNU.js.map
package/dist/{chunk-EZXJ3W5X.js → chunk-27Y4ENUD.js}
CHANGED
@@ -9,7 +9,7 @@ import {
   isRepoStoreDefinition,
   isWebStoreDefinition,
   summarizePayload
-} from "./chunk-
+} from "./chunk-KQLTWB4T.js";
 
 // src/mcp/server.ts
 import { Server } from "@modelcontextprotocol/sdk/server/index.js";
@@ -2202,4 +2202,4 @@ export {
   createMCPServer,
   runMCPServer
 };
-//# sourceMappingURL=chunk-
+//# sourceMappingURL=chunk-27Y4ENUD.js.map
package/dist/chunk-DGUM43GV.js
ADDED
@@ -0,0 +1,11 @@
+var __require = /* @__PURE__ */ ((x) => typeof require !== "undefined" ? require : typeof Proxy !== "undefined" ? new Proxy(x, {
+  get: (a, b) => (typeof require !== "undefined" ? require : a)[b]
+}) : x)(function(x) {
+  if (typeof require !== "undefined") return require.apply(this, arguments);
+  throw Error('Dynamic require of "' + x + '" is not supported');
+});
+
+export {
+  __require
+};
+//# sourceMappingURL=chunk-DGUM43GV.js.map
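`chunk-DGUM43GV.js` is esbuild's standard dynamic-require interop shim for ESM output: it exports `__require`, which resolves through the real `require` under Node/CJS and otherwise throws a clear error. A minimal sketch of how a consumer would use it (the module name being required is illustrative, not from the package):

```ts
// Illustrative only: how bundled ESM code calls the shim.
import { __require } from "./chunk-DGUM43GV.js";

const path = __require("node:path"); // real require when available, clear error otherwise
console.log(path.join("dist", "index.js"));
```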
package/dist/{chunk-VUGQ7HAR.js → chunk-EQYSYRQJ.js}
CHANGED
@@ -1,9 +1,8 @@
 import {
-  PythonBridge,
   createLogger,
   summarizePayload,
   truncateForLog
-} from "./chunk-
+} from "./chunk-KQLTWB4T.js";
 
 // src/crawl/intelligent-crawler.ts
 import { EventEmitter } from "events";
@@ -458,45 +457,127 @@ ${this.truncateMarkdown(markdown, 1e5)}`;
   }
 };
 
+// src/crawl/link-extractor.ts
+import * as cheerio2 from "cheerio";
+function extractLinks(html, baseUrl) {
+  const $ = cheerio2.load(html);
+  const links = /* @__PURE__ */ new Set();
+  $("a[href]").each((_, el) => {
+    const href = $(el).attr("href");
+    if (href !== void 0 && href !== "") {
+      try {
+        const absoluteUrl = new URL(href, baseUrl).href;
+        if (absoluteUrl.startsWith("http://") || absoluteUrl.startsWith("https://")) {
+          links.add(absoluteUrl);
+        }
+      } catch {
+      }
+    }
+  });
+  return [...links];
+}
+
+// src/crawl/playwright-crawler.ts
+import { chromium } from "playwright";
+var logger2 = createLogger("playwright-crawler");
+var DEFAULT_TIMEOUT = 3e4;
+var USER_AGENT = "Mozilla/5.0 (compatible; BlueraKnowledge/1.0; +https://github.com/blueraai/bluera-knowledge)";
+var browserInstance = null;
+async function getBrowser() {
+  if (browserInstance?.isConnected() !== true) {
+    logger2.debug("Launching new Chromium browser instance");
+    browserInstance = await chromium.launch({
+      headless: true
+    });
+  }
+  return browserInstance;
+}
+async function closeBrowser() {
+  if (browserInstance?.isConnected() === true) {
+    await browserInstance.close();
+    browserInstance = null;
+  }
+}
+async function fetchWithPlaywright(url, timeoutMs = DEFAULT_TIMEOUT) {
+  const startTime = Date.now();
+  logger2.debug({ url, timeoutMs }, "Fetching page with Playwright");
+  const browser = await getBrowser();
+  const context = await browser.newContext({
+    userAgent: USER_AGENT
+  });
+  let page = null;
+  try {
+    page = await context.newPage();
+    await page.goto(url, {
+      waitUntil: "networkidle",
+      timeout: timeoutMs
+    });
+    const html = await page.content();
+    const links = await page.evaluate(() => {
+      const anchors = document.querySelectorAll("a[href]");
+      const hrefs = [];
+      anchors.forEach((anchor) => {
+        const href = anchor.getAttribute("href");
+        if (href !== null && href !== "") {
+          try {
+            const absoluteUrl = new URL(href, document.baseURI).href;
+            if (absoluteUrl.startsWith("http://") || absoluteUrl.startsWith("https://")) {
+              hrefs.push(absoluteUrl);
+            }
+          } catch {
+          }
+        }
+      });
+      return [...new Set(hrefs)];
+    });
+    const durationMs = Date.now() - startTime;
+    logger2.info(
+      {
+        url,
+        durationMs,
+        linkCount: links.length,
+        ...summarizePayload(html, "raw-html", url)
+      },
+      "Page fetched with Playwright"
+    );
+    return { html, links };
+  } finally {
+    await context.close();
+  }
+}
+
 // src/crawl/intelligent-crawler.ts
-var
+var logger3 = createLogger("crawler");
 async function getCrawlStrategy(seedUrl, crawlInstruction, useHeadless = false) {
   if (!ClaudeClient.isAvailable()) {
     throw new Error("Claude CLI not available: install Claude Code for intelligent crawling");
   }
   const client = new ClaudeClient();
-
-
-
-
-
-
-
-
-
-
-
-
-    });
-    seedHtml = typeof response.data === "string" ? response.data : JSON.stringify(response.data);
-  }
-  return await client.determineCrawlUrls(seedUrl, seedHtml, crawlInstruction);
-  } finally {
-    await bridge.stop();
+  let seedHtml;
+  if (useHeadless) {
+    const headlessResult = await fetchWithPlaywright(seedUrl);
+    seedHtml = headlessResult.html;
+  } else {
+    const response = await axios.get(seedUrl, {
+      timeout: 3e4,
+      headers: {
+        "User-Agent": "Mozilla/5.0 (compatible; BlueraKnowledge/1.0; +https://github.com/blueraai/bluera-knowledge)"
+      }
+    });
+    seedHtml = typeof response.data === "string" ? response.data : JSON.stringify(response.data);
   }
+  return client.determineCrawlUrls(seedUrl, seedHtml, crawlInstruction);
 }
 var DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; BlueraKnowledge/1.0; +https://github.com/blueraai/bluera-knowledge)";
-var
+var DEFAULT_TIMEOUT2 = 3e4;
 var IntelligentCrawler = class extends EventEmitter {
   claudeClient;
-  pythonBridge;
   visited;
   config;
   stopped;
   constructor(config) {
     super();
     this.claudeClient = new ClaudeClient();
-    this.pythonBridge = new PythonBridge();
     this.visited = /* @__PURE__ */ new Set();
     this.config = config ?? {};
     this.stopped = false;
@@ -508,7 +589,7 @@ var IntelligentCrawler = class extends EventEmitter {
     const { crawlInstruction, extractInstruction, maxPages = 50, simple = false } = options;
     this.visited.clear();
     this.stopped = false;
-
+    logger3.info(
       {
         seedUrl,
         maxPages,
@@ -536,7 +617,7 @@ var IntelligentCrawler = class extends EventEmitter {
     } else {
       yield* this.crawlSimple(seedUrl, extractInstruction, maxPages, options.useHeadless ?? false);
     }
-
+    logger3.info(
       {
         seedUrl,
         pagesVisited: this.visited.size
@@ -651,9 +732,9 @@ var IntelligentCrawler = class extends EventEmitter {
       try {
         const links = await this.extractLinks(current.url, useHeadless);
         if (links.length === 0) {
-
+          logger3.debug({ url: current.url }, "No links found - page may be a leaf node");
         } else {
-
+          logger3.debug(
             { url: current.url, linkCount: links.length },
             "Links extracted from page"
           );
@@ -705,7 +786,7 @@ var IntelligentCrawler = class extends EventEmitter {
     this.visited.add(url);
     const html = await this.fetchHtml(url, useHeadless);
     const conversion = await convertHtmlToMarkdown(html, url);
-
+    logger3.debug(
       {
         url,
         title: conversion.title,
@@ -739,12 +820,12 @@ var IntelligentCrawler = class extends EventEmitter {
    */
   async fetchHtml(url, useHeadless = false) {
     const startTime = Date.now();
-
+    logger3.debug({ url, useHeadless }, "Fetching HTML");
     if (useHeadless) {
       try {
-        const result = await
+        const result = await fetchWithPlaywright(url);
         const durationMs = Date.now() - startTime;
-
+        logger3.info(
           {
             url,
             useHeadless: true,
@@ -762,13 +843,13 @@ var IntelligentCrawler = class extends EventEmitter {
     }
     try {
       const response = await axios.get(url, {
-        timeout: this.config.timeout ??
+        timeout: this.config.timeout ?? DEFAULT_TIMEOUT2,
        headers: {
           "User-Agent": this.config.userAgent ?? DEFAULT_USER_AGENT
         }
       });
       const durationMs = Date.now() - startTime;
-
+      logger3.info(
         {
           url,
           useHeadless: false,
@@ -779,7 +860,7 @@ var IntelligentCrawler = class extends EventEmitter {
       );
       return response.data;
     } catch (error) {
-
+      logger3.error(
         { url, error: error instanceof Error ? error.message : String(error) },
         "Failed to fetch HTML"
       );
@@ -789,26 +870,24 @@ var IntelligentCrawler = class extends EventEmitter {
     }
   }
   /**
-   * Extract links from a page using
+   * Extract links from a page using Playwright or cheerio
    */
   async extractLinks(url, useHeadless = false) {
     try {
       if (useHeadless) {
-        const
-        return
-          if (typeof link === "string") return link;
-          return link.href;
-        });
+        const result = await fetchWithPlaywright(url);
+        return result.links;
       }
-      const
-
-
-
-
-
+      const response = await axios.get(url, {
+        timeout: this.config.timeout ?? DEFAULT_TIMEOUT2,
+        headers: {
+          "User-Agent": this.config.userAgent ?? DEFAULT_USER_AGENT
+        }
+      });
+      return extractLinks(response.data, url);
     } catch (error) {
       const errorMessage = error instanceof Error ? error.message : String(error);
-
+      logger3.error({ url, error: errorMessage }, "Failed to extract links");
       throw new Error(`Link extraction failed for ${url}: ${errorMessage}`);
     }
   }
@@ -829,7 +908,7 @@ var IntelligentCrawler = class extends EventEmitter {
    */
   async stop() {
     this.stopped = true;
-    await
+    await closeBrowser();
   }
 };
 
@@ -837,4 +916,4 @@ export {
   getCrawlStrategy,
   IntelligentCrawler
 };
-//# sourceMappingURL=chunk-
+//# sourceMappingURL=chunk-EQYSYRQJ.js.map
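The net effect of the chunk rewrite above is that headless crawling now runs in-process through Playwright rather than a Python bridge, and `stop()` tears the shared browser down. A minimal consumption sketch against the exports shown in this chunk; the import path and URL are placeholders, and the option fields come from the code above:

```ts
import { IntelligentCrawler } from "./chunk-EQYSYRQJ.js";

const crawler = new IntelligentCrawler({ timeout: 30000 });
try {
  // crawl() is an async generator; useHeadless routes fetches through fetchWithPlaywright.
  for await (const page of crawler.crawl("https://example.com/docs", {
    maxPages: 10,
    useHeadless: true
  })) {
    console.log(page.url, page.title, page.markdown.length);
  }
} finally {
  await crawler.stop(); // sets stopped and closes the shared Chromium via closeBrowser()
}
```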
package/dist/chunk-EQYSYRQJ.js.map
ADDED
@@ -0,0 +1 @@
+{"version":3,"sources":["../src/crawl/intelligent-crawler.ts","../src/crawl/article-converter.ts","../src/crawl/markdown-utils.ts","../src/crawl/claude-client.ts","../src/crawl/link-extractor.ts","../src/crawl/playwright-crawler.ts"],"sourcesContent":[…],"mappings":"…"}
AAGA,UAAM,eAAe,4BAA4B,WAAW;AAG5D,UAAM,kBAAkB,IAAI,gBAAgB;AAAA,MAC1C,cAAc;AAAA;AAAA,MACd,gBAAgB;AAAA;AAAA,MAChB,OAAO;AAAA,MACP,aAAa;AAAA,MACb,iBAAiB;AAAA,MACjB,WAAW;AAAA,IACb,CAAC;AAGD,oBAAgB,IAAI,GAAG;AAGvB,oBAAgB,QAAQ,uBAAuB;AAAA,MAC7C,QAAQ,CAAC,MAAM,MAAM,MAAM,MAAM,MAAM,IAAI;AAAA,MAC3C,YAAY,SAAiB,MAA2B;AACtD,cAAM,QAAQ,OAAO,KAAK,SAAS,OAAO,CAAC,CAAC;AAC5C,cAAM,SAAS,IAAI,OAAO,KAAK;AAC/B,cAAM,eAAe,QAClB,QAAQ,kBAAkB,EAAE,EAC5B,QAAQ,QAAQ,GAAG,EACnB,KAAK;AACR,eAAO,iBAAiB,KAAK;AAAA;AAAA,EAAO,MAAM,IAAI,YAAY;AAAA;AAAA,IAAS;AAAA,MACrE;AAAA,IACF,CAAC;AAGD,UAAM,cAAc,gBAAgB,SAAS,YAAY;AAGzD,UAAM,WAAW,gBAAgB,WAAW;AAE5C,WAAO;AAAA,MACL;AAAA,QACE;AAAA,QACA;AAAA,QACA,mBAAmB,YAAY;AAAA,QAC/B,qBAAqB,SAAS;AAAA,MAChC;AAAA,MACA;AAAA,IACF;AAGA,WAAO;AAAA,MACL;AAAA,QACE;AAAA,QACA,iBAAiB,eAAe,UAAU,GAAI;AAAA,MAChD;AAAA,MACA;AAAA,IACF;AAEA,WAAO;AAAA,MACL;AAAA,MACA,GAAI,UAAU,UAAa,EAAE,MAAM;AAAA,IACrC;AAAA,EACF,SAAS,OAAO;AACd,WAAO;AAAA,MACL;AAAA,QACE;AAAA,QACA,OAAO,iBAAiB,QAAQ,MAAM,UAAU,OAAO,KAAK;AAAA,MAC9D;AAAA,MACA;AAAA,IACF;AAGA,UAAM,iBAAiB,QAAQ,QAAQ,IAAI,MAAM,OAAO,KAAK,CAAC;AAAA,EAChE;AACF;;;AExIA,SAAS,OAAO,gBAAgB;AAChC,SAAS,kBAAkB;AAC3B,SAAS,eAAe;AACxB,SAAS,YAAY;AAUrB,IAAM,wBAAwB;AAAA,EAC5B,MAAM;AAAA,EACN,YAAY;AAAA,IACV,MAAM;AAAA,MACJ,MAAM;AAAA,MACN,OAAO,EAAE,MAAM,SAAS;AAAA,MACxB,aAAa;AAAA,IACf;AAAA,IACA,WAAW;AAAA,MACT,MAAM;AAAA,MACN,aAAa;AAAA,IACf;AAAA,EACF;AAAA,EACA,UAAU,CAAC,QAAQ,WAAW;AAChC;AAKO,IAAM,eAAN,MAAM,cAAa;AAAA,EACP;AAAA,EACjB,OAAe,sBAAsB;AAAA,EACrC,OAAe,YAAY;AAAA,EAC3B,OAAe,aAA4B;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EAU3C,OAAO,gBAA+B;AAEpC,UAAM,UAAU,QAAQ,IAAI,YAAY;AACxC,QAAI,YAAY,UAAa,YAAY,MAAM,WAAW,OAAO,GAAG;AAClE,aAAO;AAAA,IACT;AAGA,UAAM,kBAAkB,KAAK,QAAQ,GAAG,WAAW,SAAS,QAAQ;AACpE,QAAI,WAAW,eAAe,GAAG;AAC/B,aAAO;AAAA,IACT;AAGA,UAAM,eAAe,KAAK,QAAQ,GAAG,UAAU,OAAO,QAAQ;AAC9D,QAAI,WAAW,YAAY,GAAG;AAC5B,aAAO;AAAA,IACT;AAGA,QAAI;AACF,YAAM,SAAS,SAAS,qBAAqB,EAAE,OAAO,CAAC,QAAQ,QAAQ,QAAQ,EAAE,CAAC;AAClF,YAAM,OAAO,OAAO,SAAS,EAAE,KAAK;AACpC,UAAI,MAAM;AACR,eAAO;AAAA,MACT;AAAA,IACF,QAAQ;AAAA,IAER;AAEA,WAAO;AAAA,EACT;AAAA;AAAA;AAAA;AAAA;AAAA,EAMA,OAAO,cAAuB;AAC5B,QAAI,CAAC,cAAa,qBAAqB;AACrC,oBAAa,aAAa,cAAa,cAAc;AACrD,oBAAa,YAAY,cAAa,eAAe;AACrD,oBAAa,sBAAsB;AAAA,IACrC;AACA,WAAO,cAAa;AAAA,EACtB;AAAA;AAAA;AAAA;AAAA,EAKA,OAAO,gBAA+B;AACpC,WAAO,cAAa;AAAA,EACtB;AAAA;AAAA;AAAA;AAAA,EAKA,OAAO,yBAA+B;AACpC,kBAAa,sBAAsB;AACnC,kBAAa,YAAY;AAAA,EAC3B;AAAA,EAEA,YAAY,UAAgC,CAAC,GAAG;AAC9C,SAAK,UAAU,QAAQ,WAAW;AAAA,EACpC;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EAUA,MAAM,mBACJ,SACA,UACA,aACwB;AACxB,UAAM,SAAS;AAAA;AAAA,YAEP,OAAO;AAAA;AAAA,eAEJ,WAAW;AAAA;AAAA;AAAA,EAGxB,KAAK,aAAa,UAAU,GAAK,CAAC;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAQhC,QAAI;AACF,YAAM,SAAS,MAAM,KAAK,WAAW,QAAQ,qBAAqB;AAClE,YAAM,YAAqB,KAAK,MAAM,MAAM;AAI5C,YAAM,SAAS,KAAK,wBAAwB,SAAS;AAGrD,UACE,OAAO,WAAW,YAClB,WAAW,QACX,EAAE,UAAU,WACZ,EAAE,eAAe,WACjB,CAAC,MAAM,QAAQ,OAAO,IAAI,KAC1B,OAAO,KAAK,WAAW,KACvB,OAAO,OAAO,cAAc,YAC5B,CAAC,OAAO,KAAK,MAAM,CAAC,QAAQ,OAAO,QAAQ,QAAQ,GACnD;AACA,cAAM,IAAI,MAAM,wCAAwC;AAAA,MAC1D;AAGA,aAAO,EAAE,MAAM,OAAO,MAAM,WAAW,OAAO,UAAU;AAAA,IAC1D,SAAS,OAAO;AACd,YAAM,IAAI;AAAA,QACR,uCAAuC,iBAAiB,QAAQ,MAAM,UAAU,OAAO,KAAK,CAAC;AAAA,MAC/F;AAAA,IACF;AAAA,EACF;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EASA,MAAM,eAAe,UAAkB,aAAsC;AAC3E,UAAM,SAAS,GAAG,WAAW;AAAA;AAAA;AAAA,EAG/B,KAAK,iBAAiB,UAAU,GAAM,CAAC;AAErC,QAAI;AACF,YAAM,SAAS,MAAM,KAAK,WAAW,MAAM;AAC3C,aAAO,OAAO,KAAK;AAAA,IACrB,SAAS,OAAO;AACd,YAAM,IAAI;AAAA,QACR,8BAA8B,iBAAiB,QAAQ,MAAM,UAAU,OAAO,KAAK,CAAC;AAAA,MACtF;AAAA,IACF;AAAA,EACF;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EASA,MAAc,WAAW,QAAgB,YAAuD;AAC9F,WAAO
,IAAI,QAAgB,CAAC,SAAS,WAAW;AAE9C,YAAM,aAAa,cAAa,cAAc;AAC9C,UAAI,eAAe,MAAM;AACvB,eAAO,IAAI,MAAM,0BAA0B,CAAC;AAC5C;AAAA,MACF;AAEA,YAAM,OAAO,CAAC,IAAI;AAGlB,UAAI,YAAY;AACd,aAAK,KAAK,iBAAiB,KAAK,UAAU,UAAU,CAAC;AACrD,aAAK,KAAK,mBAAmB,MAAM;AAAA,MACrC;AAEA,YAAM,OAAO,MAAM,YAAY,MAAM;AAAA,QACnC,OAAO,CAAC,QAAQ,QAAQ,MAAM;AAAA,QAC9B,KAAK,QAAQ,IAAI;AAAA,QACjB,KAAK,EAAE,GAAG,QAAQ,IAAI;AAAA,MACxB,CAAC;AAED,UAAI,SAAS;AACb,UAAI,SAAS;AACb,UAAI;AAGJ,UAAI,KAAK,UAAU,GAAG;AACpB,oBAAY,WAAW,MAAM;AAC3B,eAAK,KAAK,SAAS;AACnB,iBAAO,IAAI,MAAM,8BAA8B,OAAO,KAAK,OAAO,CAAC,IAAI,CAAC;AAAA,QAC1E,GAAG,KAAK,OAAO;AAAA,MACjB;AAEA,WAAK,OAAO,GAAG,QAAQ,CAAC,UAAkB;AACxC,kBAAU,MAAM,SAAS;AAAA,MAC3B,CAAC;AAED,WAAK,OAAO,GAAG,QAAQ,CAAC,UAAkB;AACxC,kBAAU,MAAM,SAAS;AAAA,MAC3B,CAAC;AAED,WAAK,GAAG,SAAS,CAAC,SAAwB;AACxC,YAAI,cAAc,QAAW;AAC3B,uBAAa,SAAS;AAAA,QACxB;AAEA,YAAI,SAAS,GAAG;AACd,kBAAQ,OAAO,KAAK,CAAC;AAAA,QACvB,OAAO;AACL;AAAA,YACE,IAAI,MAAM,+BAA+B,OAAO,IAAI,CAAC,GAAG,SAAS,KAAK,MAAM,KAAK,EAAE,EAAE;AAAA,UACvF;AAAA,QACF;AAAA,MACF,CAAC;AAED,WAAK,GAAG,SAAS,CAAC,QAAQ;AACxB,YAAI,cAAc,QAAW;AAC3B,uBAAa,SAAS;AAAA,QACxB;AACA,eAAO,IAAI,MAAM,+BAA+B,IAAI,OAAO,EAAE,CAAC;AAAA,MAChE,CAAC;AAGD,WAAK,MAAM,MAAM,MAAM;AACvB,WAAK,MAAM,IAAI;AAAA,IACjB,CAAC;AAAA,EACH;AAAA;AAAA;AAAA;AAAA,EAKQ,aAAa,MAAc,WAA2B;AAC5D,QAAI,KAAK,UAAU,UAAW,QAAO;AAGrC,WAAO,GAAG,KAAK,UAAU,GAAG,SAAS,CAAC;AAAA;AAAA;AAAA,EACxC;AAAA;AAAA;AAAA;AAAA,EAKQ,iBAAiB,UAAkB,WAA2B;AACpE,QAAI,SAAS,UAAU,UAAW,QAAO;AAEzC,WAAO,GAAG,SAAS,UAAU,GAAG,SAAS,CAAC;AAAA;AAAA;AAAA,EAC5C;AAAA;AAAA;AAAA;AAAA,EAKQ,SAAS,OAAkD;AACjE,WAAO,OAAO,UAAU,YAAY,UAAU,QAAQ,CAAC,MAAM,QAAQ,KAAK;AAAA,EAC5E;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EAOQ,wBAAwB,WAA6B;AAC3D,QAAI,KAAK,SAAS,SAAS,KAAK,uBAAuB,WAAW;AAChE,YAAM,mBAAmB,UAAU,mBAAmB;AACtD,UAAI,OAAO,qBAAqB,UAAU;AACxC,eAAO;AAAA,MACT;AAAA,IACF;AACA,WAAO;AAAA,EACT;AACF;;;ACtTA,YAAYA,cAAa;AASlB,SAAS,aAAa,MAAc,SAA2B;AACpE,QAAM,IAAY,cAAK,IAAI;AAC3B,QAAM,QAAqB,oBAAI,IAAI;AAEnC,IAAE,SAAS,EAAE,KAAK,CAAC,GAAG,OAAO;AAC3B,UAAM,OAAO,EAAE,EAAE,EAAE,KAAK,MAAM;AAC9B,QAAI,SAAS,UAAa,SAAS,IAAI;AACrC,UAAI;AACF,cAAM,cAAc,IAAI,IAAI,MAAM,OAAO,EAAE;AAE3C,YAAI,YAAY,WAAW,SAAS,KAAK,YAAY,WAAW,UAAU,GAAG;AAC3E,gBAAM,IAAI,WAAW;AAAA,QACvB;AAAA,MACF,QAAQ;AAAA,MAER;AAAA,IACF;AAAA,EACF,CAAC;AAED,SAAO,CAAC,GAAG,KAAK;AAClB;;;AC7BA,SAAS,gBAAyC;AAGlD,IAAMC,UAAS,aAAa,oBAAoB;AAGhD,IAAM,kBAAkB;AAGxB,IAAM,aACJ;AAUF,IAAI,kBAAkC;AAKtC,eAAe,aAA+B;AAC5C,MAAI,iBAAiB,YAAY,MAAM,MAAM;AAC3C,IAAAA,QAAO,MAAM,yCAAyC;AACtD,sBAAkB,MAAM,SAAS,OAAO;AAAA,MACtC,UAAU;AAAA,IACZ,CAAC;AAAA,EACH;AACA,SAAO;AACT;AAKA,eAAsB,eAA8B;AAClD,MAAI,iBAAiB,YAAY,MAAM,MAAM;AAC3C,UAAM,gBAAgB,MAAM;AAC5B,sBAAkB;AAAA,EACpB;AACF;AAMA,eAAsB,oBACpB,KACA,YAAoB,iBACY;AAChC,QAAM,YAAY,KAAK,IAAI;AAC3B,EAAAA,QAAO,MAAM,EAAE,KAAK,UAAU,GAAG,+BAA+B;AAEhE,QAAM,UAAU,MAAM,WAAW;AACjC,QAAM,UAAU,MAAM,QAAQ,WAAW;AAAA,IACvC,WAAW;AAAA,EACb,CAAC;AAED,MAAI,OAAoB;AAExB,MAAI;AACF,WAAO,MAAM,QAAQ,QAAQ;AAG7B,UAAM,KAAK,KAAK,KAAK;AAAA,MACnB,WAAW;AAAA,MACX,SAAS;AAAA,IACX,CAAC;AAGD,UAAM,OAAO,MAAM,KAAK,QAAQ;AAGhC,UAAM,QAAQ,MAAM,KAAK,SAAS,MAAM;AACtC,YAAM,UAAU,SAAS,iBAAiB,SAAS;AACnD,YAAM,QAAkB,CAAC;AAEzB,cAAQ,QAAQ,CAAC,WAAW;AAC1B,cAAM,OAAO,OAAO,aAAa,MAAM;AACvC,YAAI,SAAS,QAAQ,SAAS,IAAI;AAChC,cAAI;AAEF,kBAAM,cAAc,IAAI,IAAI,MAAM,SAAS,OAAO,EAAE;AAEpD,gBAAI,YAAY,WAAW,SAAS,KAAK,YAAY,WAAW,UAAU,GAAG;AAC3E,oBAAM,KAAK,WAAW;AAAA,YACxB;AAAA,UACF,QAAQ;AAAA,UAER;AAAA,QACF;AAAA,MACF,CAAC;AAGD,aAAO,CAAC,GAAG,IAAI,IAAI,KAAK,CAAC;AAAA,IAC3B,CAAC;AAED,UAAM,aAAa,KAAK,IAAI,IAAI;AAChC,IAAAA,QAAO;AAAA,MACL;AAAA,QACE;AAAA,QACA;AAAA,QACA,WAAW,MAAM;AAAA,QACjB,GAAG,iBAAiB,MAAM,YAAY,GAAG;AAAA,MAC3C;AAAA,MACA;AAAA,IACF;AAEA,WAAO,EAAE,MAAM,MA
AM;AAAA,EACvB,UAAE;AAEA,UAAM,QAAQ,MAAM;AAAA,EACtB;AACF;;;AL5GA,IAAMC,UAAS,aAAa,SAAS;AASrC,eAAsB,iBACpB,SACA,kBACA,cAAuB,OACC;AACxB,MAAI,CAAC,aAAa,YAAY,GAAG;AAC/B,UAAM,IAAI,MAAM,wEAAwE;AAAA,EAC1F;AAEA,QAAM,SAAS,IAAI,aAAa;AAGhC,MAAI;AACJ,MAAI,aAAa;AACf,UAAM,iBAAiB,MAAM,oBAAoB,OAAO;AACxD,eAAW,eAAe;AAAA,EAC5B,OAAO;AACL,UAAM,WAAW,MAAM,MAAM,IAAI,SAAS;AAAA,MACxC,SAAS;AAAA,MACT,SAAS;AAAA,QACP,cACE;AAAA,MACJ;AAAA,IACF,CAAC;AACD,eAAW,OAAO,SAAS,SAAS,WAAW,SAAS,OAAO,KAAK,UAAU,SAAS,IAAI;AAAA,EAC7F;AAGA,SAAO,OAAO,mBAAmB,SAAS,UAAU,gBAAgB;AACtE;AAoCA,IAAM,qBACJ;AACF,IAAMC,mBAAkB;AAKjB,IAAM,qBAAN,cAAiC,aAAa;AAAA,EAClC;AAAA,EACA;AAAA,EACA;AAAA,EACT;AAAA,EAER,YAAY,QAAsB;AAChC,UAAM;AACN,SAAK,eAAe,IAAI,aAAa;AACrC,SAAK,UAAU,oBAAI,IAAI;AACvB,SAAK,SAAS,UAAU,CAAC;AACzB,SAAK,UAAU;AAAA,EACjB;AAAA;AAAA;AAAA;AAAA,EAKA,OAAO,MAAM,SAAiB,UAAwB,CAAC,GAA+B;AACpF,UAAM,EAAE,kBAAkB,oBAAoB,WAAW,IAAI,SAAS,MAAM,IAAI;AAEhF,SAAK,QAAQ,MAAM;AACnB,SAAK,UAAU;AAEf,IAAAD,QAAO;AAAA,MACL;AAAA,QACE;AAAA,QACA;AAAA,QACA,MAAM,SACF,WACA,qBAAqB,UAAa,qBAAqB,KACrD,gBACA;AAAA,QACN,uBAAuB,uBAAuB;AAAA,MAChD;AAAA,MACA;AAAA,IACF;AAEA,UAAM,gBAA+B;AAAA,MACnC,MAAM;AAAA,MACN,cAAc;AAAA,MACd,YAAY;AAAA,IACd;AACA,SAAK,KAAK,YAAY,aAAa;AAGnC,UAAM,qBAAqB,CAAC,UAAU,qBAAqB,UAAa,qBAAqB;AAE7F,QAAI,oBAAoB;AAEtB,aAAO,KAAK;AAAA,QACV;AAAA,QACA;AAAA,QACA;AAAA,QACA;AAAA,QACA,QAAQ,eAAe;AAAA,QACvB,QAAQ;AAAA,MACV;AAAA,IACF,OAAO;AACL,aAAO,KAAK,YAAY,SAAS,oBAAoB,UAAU,QAAQ,eAAe,KAAK;AAAA,IAC7F;AAEA,IAAAA,QAAO;AAAA,MACL;AAAA,QACE;AAAA,QACA,cAAc,KAAK,QAAQ;AAAA,MAC7B;AAAA,MACA;AAAA,IACF;AAGA,QAAI,KAAK,QAAQ,SAAS,KAAK,WAAW,GAAG;AAC3C,YAAM,kBAAiC;AAAA,QACrC,MAAM;AAAA,QACN,cAAc,KAAK,QAAQ;AAAA,QAC3B,YAAY;AAAA,QACZ,SAAS,iDAAiD,OAAO,QAAQ,CAAC;AAAA,QAC1E,OAAO,IAAI,MAAM,oBAAoB;AAAA,MACvC;AACA,WAAK,KAAK,YAAY,eAAe;AAAA,IACvC;AAEA,UAAM,mBAAkC;AAAA,MACtC,MAAM;AAAA,MACN,cAAc,KAAK,QAAQ;AAAA,MAC3B,YAAY,KAAK,QAAQ;AAAA,IAC3B;AACA,SAAK,KAAK,YAAY,gBAAgB;AAAA,EACxC;AAAA;AAAA;AAAA;AAAA,EAKA,OAAe,iBACb,SACA,kBACA,oBACA,UACA,cAAuB,OACvB,qBAC4B;AAC5B,QAAI;AAGJ,QAAI,wBAAwB,QAAW;AACrC,iBAAW;AACX,YAAM,mBAAkC;AAAA,QACtC,MAAM;AAAA,QACN,cAAc;AAAA,QACd,YAAY;AAAA,QACZ,SAAS,gCAAgC,OAAO,SAAS,KAAK,MAAM,CAAC;AAAA,MACvE;AACA,WAAK,KAAK,YAAY,gBAAgB;AAAA,IACxC,OAAO;AAEL,UAAI,CAAC,aAAa,YAAY,GAAG;AAC/B,cAAM,IAAI,MAAM,wEAAwE;AAAA,MAC1F;AAEA,UAAI;AAEF,cAAM,wBAAuC;AAAA,UAC3C,MAAM;AAAA,UACN,cAAc;AAAA,UACd,YAAY;AAAA,UACZ,YAAY;AAAA,UACZ,SAAS;AAAA,QACX;AACA,aAAK,KAAK,YAAY,qBAAqB;AAE3C,cAAM,WAAW,MAAM,KAAK,UAAU,SAAS,WAAW;AAG1D,mBAAW,MAAM,KAAK,aAAa,mBAAmB,SAAS,UAAU,gBAAgB;AAEzF,cAAM,2BAA0C;AAAA,UAC9C,MAAM;AAAA,UACN,cAAc;AAAA,UACd,YAAY;AAAA,UACZ,SAAS,qBAAqB,OAAO,SAAS,KAAK,MAAM,CAAC,mBAAmB,SAAS,SAAS;AAAA,QACjG;AACA,aAAK,KAAK,YAAY,wBAAwB;AAAA,MAChD,SAAS,OAAO;AAEd,cAAM,iBAAiB,QAAQ,QAAQ,IAAI,MAAM,OAAO,KAAK,CAAC;AAAA,MAChE;AAAA,IACF;AAGA,QAAI,eAAe;AAEnB,eAAW,OAAO,SAAS,MAAM;AAC/B,UAAI,KAAK,WAAW,gBAAgB,SAAU;AAC9C,UAAI,KAAK,QAAQ,IAAI,GAAG,EAAG;AAE3B,UAAI;AACF,cAAM,SAAS,MAAM,KAAK;AAAA,UACxB;AAAA,UACA;AAAA,UACA;AAAA,UACA;AAAA,QACF;AACA;AACA,cAAM;AAAA,MACR,SAAS,OAAO;AACd,cAAM,oBAAmC;AAAA,UACvC,MAAM;AAAA,UACN;AAAA,UACA,YAAY;AAAA,UACZ,YAAY;AAAA,UACZ,OAAO,iBAAiB,QAAQ,QAAQ,IAAI,MAAM,OAAO,KAAK,CAAC;AAAA,QACjE;AACA,aAAK,KAAK,YAAY,iBAAiB;AAAA,MACzC;AAAA,IACF;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,OAAe,YACb,SACA,oBACA,UACA,cAAuB,OACK;AAC5B,UAAM,QAA+C,CAAC,EAAE,KAAK,SAAS,OAAO,EAAE,CAAC;AAChF,UAAM,WAAW;AACjB,QAAI,eAAe;AAEnB,WAAO,MAAM,SAAS,KAAK,eAAe,YAAY,CAAC,KAAK,SAAS;AACnE,YAAM,UAAU,MAAM,MAAM;AAE5B,UAAI,CAAC,WAAW,KAAK,QAAQ,IAAI,QAAQ,GAAG,KAAK,QAAQ,QAAQ,UAAU;AACzE;AAAA,MACF;AAEA,UAAI;AACF,cAAM,SAAS,MAAM,KAAK;AAAA,UACxB,QAAQ;AAAA,UACR;AAAA,UACA;AAAA,UACA;AAAA,QACF
;AACA,eAAO,QAAQ,QAAQ;AACvB;AAEA,cAAM;AAGN,YAAI,QAAQ,QAAQ,UAAU;AAC5B,cAAI;AACF,kBAAM,QAAQ,MAAM,KAAK,aAAa,QAAQ,KAAK,WAAW;AAE9D,gBAAI,MAAM,WAAW,GAAG;AACtB,cAAAA,QAAO,MAAM,EAAE,KAAK,QAAQ,IAAI,GAAG,0CAA0C;AAAA,YAC/E,OAAO;AACL,cAAAA,QAAO;AAAA,gBACL,EAAE,KAAK,QAAQ,KAAK,WAAW,MAAM,OAAO;AAAA,gBAC5C;AAAA,cACF;AAAA,YACF;AAEA,uBAAW,QAAQ,OAAO;AACxB,kBAAI,CAAC,KAAK,QAAQ,IAAI,IAAI,KAAK,KAAK,aAAa,SAAS,IAAI,GAAG;AAC/D,sBAAM,KAAK,EAAE,KAAK,MAAM,OAAO,QAAQ,QAAQ,EAAE,CAAC;AAAA,cACpD;AAAA,YACF;AAAA,UACF,SAAS,OAAO;AAEd,kBAAM,gBAA+B;AAAA,cACnC,MAAM;AAAA,cACN;AAAA,cACA,YAAY;AAAA,cACZ,YAAY,QAAQ;AAAA,cACpB,SAAS,gCAAgC,QAAQ,GAAG;AAAA,cACpD,OAAO,iBAAiB,QAAQ,QAAQ,IAAI,MAAM,OAAO,KAAK,CAAC;AAAA,YACjE;AACA,iBAAK,KAAK,YAAY,aAAa;AAAA,UACrC;AAAA,QACF;AAAA,MACF,SAAS,OAAO;AACd,cAAM,WAAW,iBAAiB,QAAQ,QAAQ,IAAI,MAAM,OAAO,KAAK,CAAC;AAIzE,YACE,SAAS,QAAQ,SAAS,mBAAmB,KAC7C,SAAS,QAAQ,SAAS,0BAA0B,KACpD,SAAS,QAAQ,SAAS,uBAAuB,GACjD;AACA,gBAAM;AAAA,QACR;AAGA,cAAM,sBAAqC;AAAA,UACzC,MAAM;AAAA,UACN;AAAA,UACA,YAAY;AAAA,UACZ,YAAY,QAAQ;AAAA,UACpB,OAAO;AAAA,QACT;AACA,aAAK,KAAK,YAAY,mBAAmB;AAAA,MAC3C;AAAA,IACF;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,MAAc,gBACZ,KACA,oBACA,cACA,cAAuB,OACD;AACtB,UAAM,eAA8B;AAAA,MAClC,MAAM;AAAA,MACN;AAAA,MACA,YAAY;AAAA,MACZ,YAAY;AAAA,IACd;AACA,SAAK,KAAK,YAAY,YAAY;AAGlC,SAAK,QAAQ,IAAI,GAAG;AAGpB,UAAM,OAAO,MAAM,KAAK,UAAU,KAAK,WAAW;AAIlD,UAAM,aAAa,MAAM,sBAAsB,MAAM,GAAG;AAExD,IAAAA,QAAO;AAAA,MACL;AAAA,QACE;AAAA,QACA,OAAO,WAAW;AAAA,QAClB,gBAAgB,WAAW,SAAS;AAAA,MACtC;AAAA,MACA;AAAA,IACF;AAEA,QAAI;AAGJ,QAAI,uBAAuB,UAAa,uBAAuB,IAAI;AAEjE,UAAI,CAAC,aAAa,YAAY,GAAG;AAC/B,cAAM,IAAI,MAAM,8DAA8D;AAAA,MAChF;AAEA,YAAM,qBAAoC;AAAA,QACxC,MAAM;AAAA,QACN;AAAA,QACA,YAAY;AAAA,QACZ,YAAY;AAAA,MACd;AACA,WAAK,KAAK,YAAY,kBAAkB;AAExC,kBAAY,MAAM,KAAK,aAAa,eAAe,WAAW,UAAU,kBAAkB;AAAA,IAC5F;AAEA,WAAO;AAAA,MACL;AAAA,MACA,GAAI,WAAW,UAAU,UAAa,EAAE,OAAO,WAAW,MAAM;AAAA,MAChE,UAAU,WAAW;AAAA,MACrB,GAAI,cAAc,UAAa,EAAE,UAAU;AAAA,IAC7C;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,MAAc,UAAU,KAAa,cAAuB,OAAwB;AAClF,UAAM,YAAY,KAAK,IAAI;AAC3B,IAAAA,QAAO,MAAM,EAAE,KAAK,YAAY,GAAG,eAAe;AAElD,QAAI,aAAa;AACf,UAAI;AACF,cAAM,SAAS,MAAM,oBAAoB,GAAG;AAC5C,cAAM,aAAa,KAAK,IAAI,IAAI;AAChC,QAAAA,QAAO;AAAA,UACL;AAAA,YACE;AAAA,YACA,aAAa;AAAA,YACb;AAAA,YACA,GAAG,iBAAiB,OAAO,MAAM,YAAY,GAAG;AAAA,UAClD;AAAA,UACA;AAAA,QACF;AACA,eAAO,OAAO;AAAA,MAChB,SAAS,OAAO;AAEd,cAAM,IAAI;AAAA,UACR,0BAA0B,iBAAiB,QAAQ,MAAM,UAAU,OAAO,KAAK,CAAC;AAAA,QAClF;AAAA,MACF;AAAA,IACF;AAGA,QAAI;AACF,YAAM,WAAW,MAAM,MAAM,IAAY,KAAK;AAAA,QAC5C,SAAS,KAAK,OAAO,WAAWC;AAAA,QAChC,SAAS;AAAA,UACP,cAAc,KAAK,OAAO,aAAa;AAAA,QACzC;AAAA,MACF,CAAC;AAED,YAAM,aAAa,KAAK,IAAI,IAAI;AAChC,MAAAD,QAAO;AAAA,QACL;AAAA,UACE;AAAA,UACA,aAAa;AAAA,UACb;AAAA,UACA,GAAG,iBAAiB,SAAS,MAAM,YAAY,GAAG;AAAA,QACpD;AAAA,QACA;AAAA,MACF;AAEA,aAAO,SAAS;AAAA,IAClB,SAAS,OAAO;AACd,MAAAA,QAAO;AAAA,QACL,EAAE,KAAK,OAAO,iBAAiB,QAAQ,MAAM,UAAU,OAAO,KAAK,EAAE;AAAA,QACrE;AAAA,MACF;AACA,YAAM,IAAI;AAAA,QACR,mBAAmB,GAAG,KAAK,iBAAiB,QAAQ,MAAM,UAAU,OAAO,KAAK,CAAC;AAAA,MACnF;AAAA,IACF;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,MAAc,aAAa,KAAa,cAAuB,OAA0B;AACvF,QAAI;AAEF,UAAI,aAAa;AACf,cAAM,SAAS,MAAM,oBAAoB,GAAG;AAC5C,eAAO,OAAO;AAAA,MAChB;AAGA,YAAM,WAAW,MAAM,MAAM,IAAY,KAAK;AAAA,QAC5C,SAAS,KAAK,OAAO,WAAWC;AAAA,QAChC,SAAS;AAAA,UACP,cAAc,KAAK,OAAO,aAAa;AAAA,QACzC;AAAA,MACF,CAAC;AAED,aAAO,aAAqB,SAAS,MAAM,GAAG;AAAA,IAChD,SAAS,OAAgB;AAEvB,YAAM,eAAe,iBAAiB,QAAQ,MAAM,UAAU,OAAO,KAAK;AAC1E,MAAAD,QAAO,MAAM,EAAE,KAAK,OAAO,aAAa,GAAG,yBAAyB;AAGpE,YAAM,IAAI,MAAM,8BAA8B,GAAG,KAAK,YAAY,EAAE;AAAA,IACtE;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKQ,aAAa,MAAc,MAAuB;AACxD,QAAI;AACF,YAAM,UAAU,IAAI,IAAI,IAAI,EAAE,SAAS,YAAY;AACnD,YAAM,
UAAU,IAAI,IAAI,IAAI,EAAE,SAAS,YAAY;AACnD,aACE,YAAY,WAAW,QAAQ,SAAS,IAAI,OAAO,EAAE,KAAK,QAAQ,SAAS,IAAI,OAAO,EAAE;AAAA,IAE5F,QAAQ;AACN,aAAO;AAAA,IACT;AAAA,EACF;AAAA;AAAA;AAAA;AAAA,EAKA,MAAM,OAAsB;AAC1B,SAAK,UAAU;AACf,UAAM,aAAa;AAAA,EACrB;AACF;","names":["cheerio","logger","logger","DEFAULT_TIMEOUT"]}