brainfood 1.0.0
- package/LICENSE +21 -0
- package/README.md +180 -0
- package/dist/crawl.d.ts +7 -0
- package/dist/crawl.js +380 -0
- package/dist/crawl.js.map +1 -0
- package/dist/extract.d.ts +2 -0
- package/dist/extract.js +250 -0
- package/dist/extract.js.map +1 -0
- package/dist/index.d.ts +2 -0
- package/dist/index.js +199 -0
- package/dist/index.js.map +1 -0
- package/dist/local.d.ts +2 -0
- package/dist/local.js +110 -0
- package/dist/local.js.map +1 -0
- package/dist/output.d.ts +2 -0
- package/dist/output.js +235 -0
- package/dist/output.js.map +1 -0
- package/dist/sitemap.d.ts +6 -0
- package/dist/sitemap.js +71 -0
- package/dist/sitemap.js.map +1 -0
- package/dist/structure.d.ts +3 -0
- package/dist/structure.js +423 -0
- package/dist/structure.js.map +1 -0
- package/dist/summarize.d.ts +2 -0
- package/dist/summarize.js +57 -0
- package/dist/summarize.js.map +1 -0
- package/dist/types.d.ts +131 -0
- package/dist/types.js +2 -0
- package/dist/types.js.map +1 -0
- package/dist/utils.d.ts +14 -0
- package/dist/utils.js +151 -0
- package/dist/utils.js.map +1 -0
- package/package.json +69 -0
package/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Capxel
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
package/README.md
ADDED
|
@@ -0,0 +1,180 @@
|
|
|
1
|
+
# 🧠brainfood
|
|
2
|
+
|
|
3
|
+
**Structured knowledge for hungry agents.**
|
|
4
|
+
|
|
5
|
+
Your AI agent is only as smart as what you feed it. Brainfood turns your messy knowledge — YouTube transcripts, PDFs, docs, notes, websites — into clean, structured data your agent actually understands.
|
|
6
|
+
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
## What does it do?
|
|
10
|
+
|
|
11
|
+
You have knowledge trapped in files your AI can't use. Brainfood fixes that.
|
|
12
|
+
|
|
13
|
+
| Source | What brainfood does |
|
|
14
|
+
|--------|-------------------|
|
|
15
|
+
| PDFs | Extracts text, structures it, outputs clean JSON/Markdown/Obsidian notes |
|
|
16
|
+
| Local docs (Markdown, HTML, text, DOCX) | Parses, organizes, builds a knowledge graph |
|
|
17
|
+
| Websites | Crawls pages, extracts content, maps structure |
|
|
18
|
+
| Sitemaps | Reads sitemap XML, fetches and processes all listed pages |
|
|
19
|
+
|
|
20
|
+
Every source becomes structured, linked, agent-ready output.
|
|
21
|
+
|
|
22
|
+
## Quick start
|
|
23
|
+
|
|
24
|
+
```bash
|
|
25
|
+
# Install
|
|
26
|
+
npm install -g brainfood
|
|
27
|
+
|
|
28
|
+
# Process local files (PDFs, docs, transcripts)
|
|
29
|
+
brainfood local ./my-knowledge --format both
|
|
30
|
+
|
|
31
|
+
# Crawl a website
|
|
32
|
+
brainfood crawl https://example.com --depth 2
|
|
33
|
+
|
|
34
|
+
# Generate Obsidian-ready notes
|
|
35
|
+
brainfood local ./research --format obsidian
|
|
36
|
+
|
|
37
|
+
# Read from a sitemap
|
|
38
|
+
brainfood sitemap https://example.com/sitemap.xml
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
Or run without installing:
|
|
42
|
+
|
|
43
|
+
```bash
|
|
44
|
+
npx brainfood local ./docs --format json
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
## Output formats
|
|
48
|
+
|
|
49
|
+
### `--format json`
|
|
50
|
+
Structured JSON nodes — perfect for AI agent ingestion, pipelines, and APIs.
|
|
51
|
+
|
|
52
|
+
### `--format markdown`
|
|
53
|
+
Clean Markdown files — readable by humans and machines.
|
|
54
|
+
|
|
55
|
+
### `--format obsidian`
|
|
56
|
+
Obsidian-ready Markdown with YAML frontmatter, tags, and `[[wiki-links]]` — drop directly into your vault.
|
|
57
|
+
|
|
58
|
+
### `--format both`
|
|
59
|
+
JSON + Markdown together.
|
|
60
|
+
|
|
61
|
+
Every format also writes a `brainfood.json` knowledge graph index.
|
|
62
|
+
|
|
63
|
+
## Obsidian integration
|
|
64
|
+
|
|
65
|
+
Brainfood speaks Obsidian natively:
|
|
66
|
+
|
|
67
|
+
```bash
|
|
68
|
+
brainfood local ./research-papers --format obsidian --output ~/Documents/Obsidian\ Vault/research/
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
Output includes:
|
|
72
|
+
- **YAML frontmatter** — title, date, source, tags, type
|
|
73
|
+
- **Wiki-links** — entity names automatically linked as `[[Entity Name]]`
|
|
74
|
+
- **Clean filenames** — slugified, no special characters
|
|
75
|
+
- **Tags** — extracted topics become Obsidian tags
|
|
76
|
+
|
|
77
|
+
Your vault becomes a living knowledge base — searchable, linked, and graph-ready.
|
|
78
|
+
|
|
79
|
+
## How it works
|
|
80
|
+
|
|
81
|
+
1. **Ingest** — point brainfood at files, a folder, a URL, or a sitemap
|
|
82
|
+
2. **Extract** — content is parsed, cleaned, and structured using Mozilla Readability + Cheerio
|
|
83
|
+
3. **Structure** — topics, entities, and relationships are identified and linked
|
|
84
|
+
4. **Output** — clean knowledge nodes in your chosen format
|
|
85
|
+
|
|
86
|
+
Each document becomes a knowledge node:
|
|
87
|
+
|
|
88
|
+
```json
|
|
89
|
+
{
|
|
90
|
+
"id": "a1b2c3d4e5f6",
|
|
91
|
+
"title": "Document Title",
|
|
92
|
+
"content": "# Clean structured content...",
|
|
93
|
+
"summary": "AI-generated or extractive summary",
|
|
94
|
+
"topics": ["topic1", "topic2"],
|
|
95
|
+
"entities": [
|
|
96
|
+
{ "name": "Key Concept", "type": "topic" }
|
|
97
|
+
],
|
|
98
|
+
"relationships": [],
|
|
99
|
+
"metadata": {
|
|
100
|
+
"sourceType": "local",
|
|
101
|
+
"wordCount": 1250,
|
|
102
|
+
"generatedAt": "2026-03-17T00:00:00.000Z"
|
|
103
|
+
}
|
|
104
|
+
}
|
|
105
|
+
```
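Nodes shaped like the example above can be consumed with a few lines of code. The following is a minimal sketch of one possible consumer, not part of the package: `indexByTopic` is a hypothetical helper, the second node and its id are made up for illustration, and only the `id`, `title`, and `topics` fields from the example are assumed.

```javascript
// Hypothetical consumer: group node ids by topic so an agent can look up
// every document that mentions a given topic.
function indexByTopic(nodes) {
  const index = new Map();
  for (const node of nodes) {
    for (const topic of node.topics || []) {
      if (!index.has(topic)) {
        index.set(topic, []);
      }
      index.get(topic).push(node.id);
    }
  }
  return index;
}

// Illustrative nodes; the second one is invented for this sketch.
const nodes = [
  { id: 'a1b2c3d4e5f6', title: 'Document Title', topics: ['topic1', 'topic2'] },
  { id: 'f6e5d4c3b2a1', title: 'Another Document', topics: ['topic2'] }
];

const byTopic = indexByTopic(nodes);
console.log(byTopic.get('topic2')); // [ 'a1b2c3d4e5f6', 'f6e5d4c3b2a1' ]
```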
|
|
106
|
+
|
|
107
|
+
## CLI reference
|
|
108
|
+
|
|
109
|
+
### `brainfood local <directory>`
|
|
110
|
+
Process local Markdown, HTML, text, PDF, or DOCX files.
|
|
111
|
+
|
|
112
|
+
### `brainfood crawl <url>`
|
|
113
|
+
Crawl a website with configurable depth and rate limiting.
|
|
114
|
+
|
|
115
|
+
### `brainfood sitemap <url>`
|
|
116
|
+
Parse a sitemap and fetch all listed pages.
|
|
117
|
+
|
|
118
|
+
### Common options
|
|
119
|
+
|
|
120
|
+
| Option | Default | Description |
|
|
121
|
+
|--------|---------|-------------|
|
|
122
|
+
| `-o, --output <dir>` | `./brainfood-output` | Output directory |
|
|
123
|
+
| `-f, --format <format>` | `json` | Output format: json, markdown, obsidian, or both |
|
|
124
|
+
| `--summarize` | off | Generate AI summaries (requires OPENAI_API_KEY) |
|
|
125
|
+
| `--model <model>` | `gpt-4.1-mini` | OpenAI model for summaries |
|
|
126
|
+
| `--depth <n>` | `2` | Max crawl depth (crawl mode) |
|
|
127
|
+
| `--max-pages <n>` | `50` | Max pages to process |
|
|
128
|
+
| `--concurrency <n>` | `3` | Concurrent requests (max 10) |
|
|
129
|
+
| `--rate-limit <ms>` | `1000` | Minimum ms between requests |
|
|
130
|
+
| `--exclude <patterns>` | — | Comma-separated URL patterns to skip |
|
|
131
|
+
|
|
132
|
+
## Using brainfood with OpenClaw
|
|
133
|
+
|
|
134
|
+
Already running an OpenClaw agent? Just tell it what to process:
|
|
135
|
+
|
|
136
|
+
> "Install brainfood and process my research folder into Obsidian notes"
|
|
137
|
+
|
|
138
|
+
Your agent will run:
|
|
139
|
+
```bash
|
|
140
|
+
npm install -g brainfood
|
|
141
|
+
brainfood local ./research --format obsidian --output ~/Documents/Obsidian\ Vault/research/
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
> "Crawl my company website and give me structured data"
|
|
145
|
+
|
|
146
|
+
```bash
|
|
147
|
+
brainfood crawl https://yoursite.com --depth 2 --format json
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
> "Convert these PDFs into something you can actually read"
|
|
151
|
+
|
|
152
|
+
```bash
|
|
153
|
+
brainfood local ./documents --format both
|
|
154
|
+
```
|
|
155
|
+
|
|
156
|
+
That's it. One install, one command, your agent gets structured knowledge it can actually use.
|
|
157
|
+
|
|
158
|
+
**Tip for non-technical users:** Copy any of the commands above and paste them to your OpenClaw agent in chat. It handles the rest.
|
|
159
|
+
|
|
160
|
+
## Use cases
|
|
161
|
+
|
|
162
|
+
**Feed your AI agent** — Convert your knowledge base into structured data any LLM agent can ingest.
|
|
163
|
+
|
|
164
|
+
**Build an Obsidian vault** — Turn PDFs, transcripts, and research into linked, searchable notes.
|
|
165
|
+
|
|
166
|
+
**Audit a website** — Extract and map all content from any site for analysis or migration.
|
|
167
|
+
|
|
168
|
+
**Power a knowledge pipeline** — Automate ingestion from docs folders, sitemaps, or web sources.
|
|
169
|
+
|
|
170
|
+
## Built by Capxel
|
|
171
|
+
|
|
172
|
+
[Capxel](https://capxel.com) builds AI-native intelligence infrastructure. Brainfood is open source under MIT.
|
|
173
|
+
|
|
174
|
+
## Contributing
|
|
175
|
+
|
|
176
|
+
PRs welcome. See [issues](https://github.com/Capxel/brainfood/issues) for open work.
|
|
177
|
+
|
|
178
|
+
## License
|
|
179
|
+
|
|
180
|
+
MIT
|
package/dist/crawl.d.ts
ADDED
|
@@ -0,0 +1,7 @@
|
|
|
1
|
+
import type { CrawlOptions, ProgressUpdate, RawDocument } from './types.js';
|
|
2
|
+
export interface CrawlResult {
|
|
3
|
+
documents: RawDocument[];
|
|
4
|
+
errors: string[];
|
|
5
|
+
}
|
|
6
|
+
export declare function crawlWebsite(rootUrl: string, options: CrawlOptions, onProgress?: (update: ProgressUpdate) => void): Promise<CrawlResult>;
|
|
7
|
+
export declare function fetchDocumentsFromUrls(urls: string[], options: Omit<CrawlOptions, 'depth'>, sourceType: 'sitemap', onProgress?: (update: ProgressUpdate) => void): Promise<CrawlResult>;
|
package/dist/crawl.js
ADDED
|
@@ -0,0 +1,380 @@
|
|
|
1
|
+
import * as cheerio from 'cheerio';
|
|
2
|
+
import fetch from 'node-fetch';
|
|
3
|
+
import { DEFAULT_USER_AGENT, normalizeUrl, safeIsoDate, sleep, urlToRelativeOutputPath } from './utils.js';
|
|
4
|
+
const HTML_CONTENT_TYPES = ['text/html', 'application/xhtml+xml'];
|
|
5
|
+
const TEXT_CONTENT_TYPES = ['text/plain', 'text/markdown', 'text/x-markdown'];
|
|
6
|
+
class RateLimitedFetcher {
|
|
7
|
+
options;
|
|
8
|
+
nextRequestAt = 0;
|
|
9
|
+
reservation = Promise.resolve();
|
|
10
|
+
constructor(options) {
|
|
11
|
+
this.options = options;
|
|
12
|
+
}
|
|
13
|
+
async reserveStartSlot() {
|
|
14
|
+
let waitMs = 0;
|
|
15
|
+
const reservation = this.reservation.then(() => {
|
|
16
|
+
const now = Date.now();
|
|
17
|
+
const startAt = Math.max(now, this.nextRequestAt);
|
|
18
|
+
this.nextRequestAt = startAt + this.options.rateLimitMs;
|
|
19
|
+
waitMs = Math.max(0, startAt - now);
|
|
20
|
+
});
|
|
21
|
+
this.reservation = reservation.catch(() => undefined);
|
|
22
|
+
await reservation;
|
|
23
|
+
if (waitMs > 0) {
|
|
24
|
+
await sleep(waitMs);
|
|
25
|
+
}
|
|
26
|
+
}
|
|
27
|
+
async fetch(url, init = {}) {
|
|
28
|
+
await this.reserveStartSlot();
|
|
29
|
+
const controller = new AbortController();
|
|
30
|
+
const timeout = setTimeout(() => controller.abort(), this.options.timeoutMs);
|
|
31
|
+
try {
|
|
32
|
+
return await fetch(url, {
|
|
33
|
+
...init,
|
|
34
|
+
headers: {
|
|
35
|
+
'user-agent': this.options.userAgent || DEFAULT_USER_AGENT,
|
|
36
|
+
accept: 'text/html,application/xhtml+xml,application/xml,text/plain;q=0.9,*/*;q=0.1',
|
|
37
|
+
...(init.headers || {})
|
|
38
|
+
},
|
|
39
|
+
redirect: 'follow',
|
|
40
|
+
signal: controller.signal
|
|
41
|
+
});
|
|
42
|
+
}
|
|
43
|
+
finally {
|
|
44
|
+
clearTimeout(timeout);
|
|
45
|
+
}
|
|
46
|
+
}
|
|
47
|
+
}
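The slot arithmetic in `reserveStartSlot` can be checked in isolation. The following is a minimal synchronous sketch, not the package's code: `createSlotReserver` and the injected `now` clock are hypothetical names introduced here so the numbers can be verified without real timers, and the promise chaining used by the class is left out.

```javascript
// Each caller is handed the earliest start time that is at least
// `rateLimitMs` after the slot given to the previous caller.
// `createSlotReserver` and `now` are hypothetical (not in the package).
function createSlotReserver(rateLimitMs, now = () => Date.now()) {
  let nextRequestAt = 0;
  return function reserve() {
    const current = now();
    const startAt = Math.max(current, nextRequestAt);
    nextRequestAt = startAt + rateLimitMs;
    return { startAt, waitMs: startAt - current };
  };
}

// Three callers arriving at the same instant (t = 5000) with a 1000 ms limit.
const reserve = createSlotReserver(1000, () => 5000);
const waits = [reserve(), reserve(), reserve()].map((slot) => slot.waitMs);
console.log(waits); // [ 0, 1000, 2000 ]
```

Callers that arrive together are spaced `rateLimitMs` apart, while a caller that arrives after the last reserved slot does not wait at all.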
|
|
48
|
+
function isHtmlContentType(contentType) {
|
|
49
|
+
return HTML_CONTENT_TYPES.some((type) => contentType.includes(type));
|
|
50
|
+
}
|
|
51
|
+
function isTextContentType(contentType) {
|
|
52
|
+
return TEXT_CONTENT_TYPES.some((type) => contentType.includes(type));
|
|
53
|
+
}
|
|
54
|
+
function extractLinks(html, baseUrl) {
|
|
55
|
+
const $ = cheerio.load(html);
|
|
56
|
+
const seen = new Set();
|
|
57
|
+
const links = [];
|
|
58
|
+
$('a[href]').each((_, anchor) => {
|
|
59
|
+
const href = ($(anchor).attr('href') || '').trim();
|
|
60
|
+
if (!href || href.startsWith('#') || href.startsWith('mailto:') || href.startsWith('tel:') || href.startsWith('javascript:')) {
|
|
61
|
+
return;
|
|
62
|
+
}
|
|
63
|
+
try {
|
|
64
|
+
const absoluteUrl = normalizeUrl(new URL(href, baseUrl).toString());
|
|
65
|
+
if (seen.has(absoluteUrl)) {
|
|
66
|
+
return;
|
|
67
|
+
}
|
|
68
|
+
seen.add(absoluteUrl);
|
|
69
|
+
links.push({ href, absoluteUrl });
|
|
70
|
+
}
|
|
71
|
+
catch {
|
|
72
|
+
links.push({ href });
|
|
73
|
+
}
|
|
74
|
+
});
|
|
75
|
+
return links;
|
|
76
|
+
}
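The link handling above relies on the built-in `URL` constructor to resolve each `href` against the page URL. The following is a minimal sketch of that step on its own, under stated assumptions: `resolveLinks` is a hypothetical name, it takes plain `href` strings instead of parsed HTML, it skips unparseable links instead of recording them, and it does not apply the package's `normalizeUrl` helper (defined in `utils.js`, not shown here).

```javascript
// Resolve hrefs against a base URL, skip non-navigational schemes and
// fragments, and drop duplicates. `resolveLinks` is hypothetical.
function resolveLinks(hrefs, baseUrl) {
  const seen = new Set();
  const resolved = [];
  for (const raw of hrefs) {
    const href = raw.trim();
    if (!href || /^(#|mailto:|tel:|javascript:)/.test(href)) {
      continue;
    }
    let absolute;
    try {
      absolute = new URL(href, baseUrl).toString();
    } catch {
      continue; // unparseable href: skipped in this sketch
    }
    if (!seen.has(absolute)) {
      seen.add(absolute);
      resolved.push(absolute);
    }
  }
  return resolved;
}

const found = resolveLinks(
  ['/about', 'about', '#top', 'mailto:a@b.c', '../x', '/about'],
  'https://example.com/docs/page'
);
console.log(found);
// [ 'https://example.com/about',
//   'https://example.com/docs/about',
//   'https://example.com/x' ]
```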
|
|
77
|
+
function parseRobotsTxt(content, userAgent) {
|
|
78
|
+
const desiredAgents = [userAgent.toLowerCase(), 'brainfood', '*'];
|
|
79
|
+
const lines = content.split(/\r?\n/);
|
|
80
|
+
const rules = [];
|
|
81
|
+
let applies = false;
|
|
82
|
+
for (const rawLine of lines) {
|
|
83
|
+
const line = rawLine.replace(/#.*$/, '').trim();
|
|
84
|
+
if (!line) {
|
|
85
|
+
continue;
|
|
86
|
+
}
|
|
87
|
+
const separator = line.indexOf(':');
|
|
88
|
+
if (separator === -1) {
|
|
89
|
+
continue;
|
|
90
|
+
}
|
|
91
|
+
const key = line.slice(0, separator).trim().toLowerCase();
|
|
92
|
+
const value = line.slice(separator + 1).trim();
|
|
93
|
+
if (key === 'user-agent') {
|
|
94
|
+
applies = desiredAgents.some((agent) => value.toLowerCase().includes(agent));
|
|
95
|
+
continue;
|
|
96
|
+
}
|
|
97
|
+
if (!applies) {
|
|
98
|
+
continue;
|
|
99
|
+
}
|
|
100
|
+
if ((key === 'allow' || key === 'disallow') && value) {
|
|
101
|
+
rules.push({ allow: key === 'allow', path: value });
|
|
102
|
+
}
|
|
103
|
+
}
|
|
104
|
+
return { rules };
|
|
105
|
+
}
|
|
106
|
+
function isAllowedByRobots(targetUrl, robots) {
|
|
107
|
+
if (!robots || robots.rules.length === 0) {
|
|
108
|
+
return true;
|
|
109
|
+
}
|
|
110
|
+
const pathname = new URL(targetUrl).pathname || '/';
|
|
111
|
+
let bestMatch = null;
|
|
112
|
+
for (const rule of robots.rules) {
|
|
113
|
+
if (!pathname.startsWith(rule.path)) {
|
|
114
|
+
continue;
|
|
115
|
+
}
|
|
116
|
+
if (!bestMatch || rule.path.length >= bestMatch.path.length) {
|
|
117
|
+
bestMatch = rule;
|
|
118
|
+
}
|
|
119
|
+
}
|
|
120
|
+
return bestMatch ? bestMatch.allow : true;
|
|
121
|
+
}
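The rule selection above is a longest-prefix match: among the rules whose path is a prefix of the URL path, the longest one decides, and the `>=` comparison lets a later rule win a tie. The following is a minimal sketch of that selection with illustrative rules; `decide` is a hypothetical name. As in the function above, paths are compared with plain `startsWith`, so no wildcard expansion is assumed.

```javascript
// Longest-prefix rule selection over { allow, path } rules.
// `decide` is a hypothetical stand-in for the function in the diff.
function decide(pathname, rules) {
  let best = null;
  for (const rule of rules) {
    if (!pathname.startsWith(rule.path)) {
      continue;
    }
    if (!best || rule.path.length >= best.path.length) {
      best = rule;
    }
  }
  // No matching rule means the path is allowed.
  return best ? best.allow : true;
}

const robotsRules = [
  { allow: false, path: '/private/' },
  { allow: true, path: '/private/press/' }
];

console.log(decide('/private/notes.html', robotsRules)); // false
console.log(decide('/private/press/2026.html', robotsRules)); // true
console.log(decide('/blog/', robotsRules)); // true
```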
|
|
122
|
+
function matchesExcludePattern(pathname, pattern) {
|
|
123
|
+
if (!pattern) {
|
|
124
|
+
return false;
|
|
125
|
+
}
|
|
126
|
+
if (pattern.endsWith('/*')) {
|
|
127
|
+
return pathname.startsWith(pattern.slice(0, -1));
|
|
128
|
+
}
|
|
129
|
+
if (pattern.startsWith('*')) {
|
|
130
|
+
return pathname.endsWith(pattern.slice(1));
|
|
131
|
+
}
|
|
132
|
+
if (pattern.endsWith('*')) {
|
|
133
|
+
return pathname.startsWith(pattern.slice(0, -1));
|
|
134
|
+
}
|
|
135
|
+
return pathname === pattern;
|
|
136
|
+
}
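A few worked cases make the wildcard forms above easier to read. The following sketch restates the same four branches in compact form (`matchesExclude` is a hypothetical name for this copy) and runs them against illustrative paths; the patterns are examples, not defaults shipped with the package.

```javascript
// Compact restatement of the matching branches shown in the diff:
// trailing "/*" and trailing "*" are prefix matches, leading "*" is a
// suffix match, and anything else must match exactly.
function matchesExclude(pathname, pattern) {
  if (!pattern) return false;
  if (pattern.endsWith('/*')) return pathname.startsWith(pattern.slice(0, -1));
  if (pattern.startsWith('*')) return pathname.endsWith(pattern.slice(1));
  if (pattern.endsWith('*')) return pathname.startsWith(pattern.slice(0, -1));
  return pathname === pattern;
}

console.log(matchesExclude('/blog/post-1', '/blog/*')); // true
console.log(matchesExclude('/blog', '/blog/*')); // false: the prefix is "/blog/"
console.log(matchesExclude('/files/report.pdf', '*.pdf')); // true
console.log(matchesExclude('/docs-old/index', '/docs*')); // true
console.log(matchesExclude('/a/x/b', '/a/*/b')); // false: a "*" in the middle is literal
```

Only a leading or trailing `*` acts as a wildcard here, so a pattern such as `/a/*/b` matches nothing except the literal path `/a/*/b`.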
|
|
137
|
+
function getExcludedPath(targetUrl, patterns) {
|
|
138
|
+
const url = new URL(targetUrl);
|
|
139
|
+
const pathname = `${url.pathname || '/'}${url.search || ''}`;
|
|
140
|
+
return patterns.some((pattern) => matchesExcludePattern(pathname, pattern) || matchesExcludePattern(url.pathname || '/', pattern))
|
|
141
|
+
? pathname
|
|
142
|
+
: null;
|
|
143
|
+
}
|
|
144
|
+
function reportExcluded(stage, targetUrl, options, saved, queued, active, onProgress) {
|
|
145
|
+
const excludedPath = getExcludedPath(targetUrl, options.excludePatterns);
|
|
146
|
+
if (!excludedPath) {
|
|
147
|
+
return false;
|
|
148
|
+
}
|
|
149
|
+
onProgress?.({
|
|
150
|
+
stage,
|
|
151
|
+
current: saved,
|
|
152
|
+
queued,
|
|
153
|
+
saved,
|
|
154
|
+
active,
|
|
155
|
+
message: `Skipping ${excludedPath} (excluded)`
|
|
156
|
+
});
|
|
157
|
+
return true;
|
|
158
|
+
}
|
|
159
|
+
async function fetchRobots(origin, fetcher) {
|
|
160
|
+
try {
|
|
161
|
+
const response = await fetcher.fetch(`${origin}/robots.txt`);
|
|
162
|
+
if (!response.ok) {
|
|
163
|
+
return null;
|
|
164
|
+
}
|
|
165
|
+
const content = await response.text();
|
|
166
|
+
return parseRobotsTxt(content, DEFAULT_USER_AGENT);
|
|
167
|
+
}
|
|
168
|
+
catch {
|
|
169
|
+
return null;
|
|
170
|
+
}
|
|
171
|
+
}
|
|
172
|
+
async function fetchSingleUrl(url, sourceType, fetcher) {
|
|
173
|
+
const response = await fetcher.fetch(url);
|
|
174
|
+
if (!response.ok) {
|
|
175
|
+
throw new Error(`Request failed for ${url} with status ${response.status}`);
|
|
176
|
+
}
|
|
177
|
+
const contentType = (response.headers.get('content-type') || 'text/plain').toLowerCase();
|
|
178
|
+
const canonicalSource = normalizeUrl(response.url || url);
|
|
179
|
+
const lastModified = safeIsoDate(response.headers.get('last-modified'));
|
|
180
|
+
if (isHtmlContentType(contentType)) {
|
|
181
|
+
const html = await response.text();
|
|
182
|
+
return {
|
|
183
|
+
sourceType,
|
|
184
|
+
canonicalSource,
|
|
185
|
+
url: canonicalSource,
|
|
186
|
+
relativeOutputPath: urlToRelativeOutputPath(canonicalSource),
|
|
187
|
+
contentType,
|
|
188
|
+
html,
|
|
189
|
+
languageHint: null,
|
|
190
|
+
discoveredLinks: extractLinks(html, canonicalSource),
|
|
191
|
+
lastModified,
|
|
192
|
+
statusCode: response.status
|
|
193
|
+
};
|
|
194
|
+
}
|
|
195
|
+
if (isTextContentType(contentType)) {
|
|
196
|
+
const text = await response.text();
|
|
197
|
+
return {
|
|
198
|
+
sourceType,
|
|
199
|
+
canonicalSource,
|
|
200
|
+
url: canonicalSource,
|
|
201
|
+
relativeOutputPath: urlToRelativeOutputPath(canonicalSource),
|
|
202
|
+
contentType,
|
|
203
|
+
text,
|
|
204
|
+
discoveredLinks: [],
|
|
205
|
+
lastModified,
|
|
206
|
+
statusCode: response.status
|
|
207
|
+
};
|
|
208
|
+
}
|
|
209
|
+
return null;
|
|
210
|
+
}
|
|
211
|
+
async function fetchBatch(entries, sourceType, fetcher, saved, queued, onProgress) {
|
|
212
|
+
const active = entries.length;
|
|
213
|
+
return Promise.all(entries.map(async (entry, index) => {
|
|
214
|
+
onProgress?.({
|
|
215
|
+
stage: sourceType,
|
|
216
|
+
current: saved + index + 1,
|
|
217
|
+
queued,
|
|
218
|
+
saved,
|
|
219
|
+
active,
|
|
220
|
+
message: `Fetching ${entry.url}`
|
|
221
|
+
});
|
|
222
|
+
try {
|
|
223
|
+
const document = await fetchSingleUrl(entry.url, sourceType, fetcher);
|
|
224
|
+
if (!document) {
|
|
225
|
+
return {
|
|
226
|
+
url: entry.url,
|
|
227
|
+
error: `Skipped unsupported content type at ${entry.url}`
|
|
228
|
+
};
|
|
229
|
+
}
|
|
230
|
+
return {
|
|
231
|
+
url: entry.url,
|
|
232
|
+
document
|
|
233
|
+
};
|
|
234
|
+
}
|
|
235
|
+
catch (error) {
|
|
236
|
+
return {
|
|
237
|
+
url: entry.url,
|
|
238
|
+
error: error instanceof Error ? error.message : `Unknown crawl error for ${entry.url}`
|
|
239
|
+
};
|
|
240
|
+
}
|
|
241
|
+
}));
|
|
242
|
+
}
|
|
243
|
+
export async function crawlWebsite(rootUrl, options, onProgress) {
|
|
244
|
+
const normalizedRoot = normalizeUrl(rootUrl);
|
|
245
|
+
const root = new URL(normalizedRoot);
|
|
246
|
+
const fetcher = new RateLimitedFetcher({
|
|
247
|
+
timeoutMs: options.timeoutMs,
|
|
248
|
+
rateLimitMs: options.rateLimitMs,
|
|
249
|
+
userAgent: options.userAgent
|
|
250
|
+
});
|
|
251
|
+
const robots = await fetchRobots(root.origin, fetcher);
|
|
252
|
+
const queue = [{ url: normalizedRoot, depth: 0 }];
|
|
253
|
+
const queued = new Set([normalizedRoot]);
|
|
254
|
+
const seen = new Set();
|
|
255
|
+
const documents = [];
|
|
256
|
+
const errors = [];
|
|
257
|
+
while (queue.length > 0 && documents.length < options.maxPages) {
|
|
258
|
+
const batch = [];
|
|
259
|
+
while (queue.length > 0 && batch.length < options.concurrency && documents.length + batch.length < options.maxPages) {
|
|
260
|
+
const current = queue.shift();
|
|
261
|
+
if (!current || seen.has(current.url)) {
|
|
262
|
+
continue;
|
|
263
|
+
}
|
|
264
|
+
seen.add(current.url);
|
|
265
|
+
if (reportExcluded('crawl', current.url, options, documents.length, queue.length, batch.length, onProgress)) {
|
|
266
|
+
continue;
|
|
267
|
+
}
|
|
268
|
+
if (!isAllowedByRobots(current.url, robots)) {
|
|
269
|
+
errors.push(`Skipped by robots.txt: ${current.url}`);
|
|
270
|
+
continue;
|
|
271
|
+
}
|
|
272
|
+
batch.push(current);
|
|
273
|
+
}
|
|
274
|
+
if (batch.length === 0) {
|
|
275
|
+
continue;
|
|
276
|
+
}
|
|
277
|
+
const results = await fetchBatch(batch, 'crawl', fetcher, documents.length, queue.length, onProgress);
|
|
278
|
+
for (const [index, result] of results.entries()) {
|
|
279
|
+
if (result.error) {
|
|
280
|
+
errors.push(result.error);
|
|
281
|
+
onProgress?.({
|
|
282
|
+
stage: 'crawl',
|
|
283
|
+
current: documents.length,
|
|
284
|
+
queued: queue.length,
|
|
285
|
+
saved: documents.length,
|
|
286
|
+
active: Math.max(0, results.length - index - 1),
|
|
287
|
+
message: result.error
|
|
288
|
+
});
|
|
289
|
+
continue;
|
|
290
|
+
}
|
|
291
|
+
const document = result.document;
|
|
292
|
+
if (!document) {
|
|
293
|
+
continue;
|
|
294
|
+
}
|
|
295
|
+
documents.push(document);
|
|
296
|
+
const current = batch[index];
|
|
297
|
+
if (!current || !document.html || current.depth >= options.depth) {
|
|
298
|
+
continue;
|
|
299
|
+
}
|
|
300
|
+
for (const link of document.discoveredLinks) {
|
|
301
|
+
if (!link.absoluteUrl) {
|
|
302
|
+
continue;
|
|
303
|
+
}
|
|
304
|
+
const normalizedLink = normalizeUrl(link.absoluteUrl);
|
|
305
|
+
const target = new URL(normalizedLink);
|
|
306
|
+
if (target.origin !== root.origin || seen.has(normalizedLink) || queued.has(normalizedLink)) {
|
|
307
|
+
continue;
|
|
308
|
+
}
|
|
309
|
+
if (reportExcluded('crawl', normalizedLink, options, documents.length, queue.length, 0, onProgress)) {
|
|
310
|
+
seen.add(normalizedLink);
|
|
311
|
+
continue;
|
|
312
|
+
}
|
|
313
|
+
queue.push({ url: normalizedLink, depth: current.depth + 1 });
|
|
314
|
+
queued.add(normalizedLink);
|
|
315
|
+
}
|
|
316
|
+
}
|
|
317
|
+
}
|
|
318
|
+
return { documents, errors };
|
|
319
|
+
}
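The traversal in `crawlWebsite` is a breadth-first walk bounded by a depth limit and a page cap. The following is a minimal sequential sketch of that idea over an in-memory link graph, not the package's code: `crawlGraph` and `graph` are hypothetical, and batching, robots.txt checks, exclude patterns, and HTTP are all left out.

```javascript
// Breadth-first walk with a depth limit and a page cap.
// `graph` maps a URL to the links found on that page (illustrative data).
function crawlGraph(graph, rootUrl, { depth, maxPages }) {
  const queue = [{ url: rootUrl, depth: 0 }];
  const queued = new Set([rootUrl]);
  const visited = [];
  while (queue.length > 0 && visited.length < maxPages) {
    const current = queue.shift();
    visited.push(current.url);
    if (current.depth >= depth) {
      continue; // page is kept, but its links are not followed
    }
    for (const link of graph[current.url] || []) {
      if (queued.has(link)) {
        continue;
      }
      queued.add(link);
      queue.push({ url: link, depth: current.depth + 1 });
    }
  }
  return visited;
}

const graph = {
  '/': ['/a', '/b'],
  '/a': ['/a/1', '/'],
  '/b': ['/b/1'],
  '/a/1': ['/a/1/x']
};

console.log(crawlGraph(graph, '/', { depth: 1, maxPages: 10 })); // [ '/', '/a', '/b' ]
console.log(crawlGraph(graph, '/', { depth: 2, maxPages: 4 })); // [ '/', '/a', '/b', '/a/1' ]
```

With `depth: 1` the links on `/a` and `/b` are never followed; with `depth: 2` the page cap stops the walk before `/b/1` is reached.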
|
|
320
|
+
export async function fetchDocumentsFromUrls(urls, options, sourceType, onProgress) {
|
|
321
|
+
const fetcher = new RateLimitedFetcher({
|
|
322
|
+
timeoutMs: options.timeoutMs,
|
|
323
|
+
rateLimitMs: options.rateLimitMs,
|
|
324
|
+
userAgent: options.userAgent
|
|
325
|
+
});
|
|
326
|
+
const robotsCache = new Map();
|
|
327
|
+
const normalizedUrls = Array.from(new Set(urls.map((url) => normalizeUrl(url)))).slice(0, options.maxPages);
|
|
328
|
+
const documents = [];
|
|
329
|
+
const errors = [];
|
|
330
|
+
for (let start = 0; start < normalizedUrls.length; start += options.concurrency) {
|
|
331
|
+
const chunk = normalizedUrls.slice(start, start + options.concurrency);
|
|
332
|
+
const active = chunk.length;
|
|
333
|
+
const results = await Promise.all(chunk.map(async (url, offset) => {
|
|
334
|
+
if (reportExcluded(sourceType, url, options, documents.length, Math.max(0, normalizedUrls.length - start - offset - 1), active, onProgress)) {
|
|
335
|
+
return { url };
|
|
336
|
+
}
|
|
337
|
+
const origin = new URL(url).origin;
|
|
338
|
+
if (!robotsCache.has(origin)) {
|
|
339
|
+
robotsCache.set(origin, await fetchRobots(origin, fetcher));
|
|
340
|
+
}
|
|
341
|
+
if (!isAllowedByRobots(url, robotsCache.get(origin) || null)) {
|
|
342
|
+
return {
|
|
343
|
+
url,
|
|
344
|
+
error: `Skipped by robots.txt: ${url}`
|
|
345
|
+
};
|
|
346
|
+
}
|
|
347
|
+
onProgress?.({
|
|
348
|
+
stage: sourceType,
|
|
349
|
+
current: start + offset + 1,
|
|
350
|
+
queued: Math.max(0, normalizedUrls.length - start - offset - 1),
|
|
351
|
+
saved: documents.length,
|
|
352
|
+
active,
|
|
353
|
+
message: `Fetching ${url}`
|
|
354
|
+
});
|
|
355
|
+
try {
|
|
356
|
+
return {
|
|
357
|
+
url,
|
|
358
|
+
document: await fetchSingleUrl(url, sourceType, fetcher)
|
|
359
|
+
};
|
|
360
|
+
}
|
|
361
|
+
catch (error) {
|
|
362
|
+
return {
|
|
363
|
+
url,
|
|
364
|
+
error: error instanceof Error ? error.message : `Unknown fetch error for ${url}`
|
|
365
|
+
};
|
|
366
|
+
}
|
|
367
|
+
}));
|
|
368
|
+
for (const result of results) {
|
|
369
|
+
if (result.error) {
|
|
370
|
+
errors.push(result.error);
|
|
371
|
+
continue;
|
|
372
|
+
}
|
|
373
|
+
if (result.document) {
|
|
374
|
+
documents.push(result.document);
|
|
375
|
+
}
|
|
376
|
+
}
|
|
377
|
+
}
|
|
378
|
+
return { documents, errors };
|
|
379
|
+
}
|
|
380
|
+
//# sourceMappingURL=crawl.js.map
|
package/dist/crawl.js.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"crawl.js","sourceRoot":"","sources":["../src/crawl.ts"],"names":[],"mappings":"AAAA,OAAO,KAAK,OAAO,MAAM,SAAS,CAAC;AACnC,OAAO,KAA0C,MAAM,YAAY,CAAC;AAGpE,OAAO,EAAE,kBAAkB,EAAE,YAAY,EAAE,WAAW,EAAE,KAAK,EAAE,uBAAuB,EAAE,MAAM,YAAY,CAAC;AAE3G,MAAM,kBAAkB,GAAG,CAAC,WAAW,EAAE,uBAAuB,CAAC,CAAC;AAClE,MAAM,kBAAkB,GAAG,CAAC,YAAY,EAAE,eAAe,EAAE,iBAAiB,CAAC,CAAC;AAiC9E,MAAM,kBAAkB;IAIO;IAHrB,aAAa,GAAG,CAAC,CAAC;IAClB,WAAW,GAAkB,OAAO,CAAC,OAAO,EAAE,CAAC;IAEvD,YAA6B,OAA4B;QAA5B,YAAO,GAAP,OAAO,CAAqB;IAAG,CAAC;IAErD,KAAK,CAAC,gBAAgB;QAC5B,IAAI,MAAM,GAAG,CAAC,CAAC;QACf,MAAM,WAAW,GAAG,IAAI,CAAC,WAAW,CAAC,IAAI,CAAC,GAAG,EAAE;YAC7C,MAAM,GAAG,GAAG,IAAI,CAAC,GAAG,EAAE,CAAC;YACvB,MAAM,OAAO,GAAG,IAAI,CAAC,GAAG,CAAC,GAAG,EAAE,IAAI,CAAC,aAAa,CAAC,CAAC;YAClD,IAAI,CAAC,aAAa,GAAG,OAAO,GAAG,IAAI,CAAC,OAAO,CAAC,WAAW,CAAC;YACxD,MAAM,GAAG,IAAI,CAAC,GAAG,CAAC,CAAC,EAAE,OAAO,GAAG,GAAG,CAAC,CAAC;QACtC,CAAC,CAAC,CAAC;QAEH,IAAI,CAAC,WAAW,GAAG,WAAW,CAAC,KAAK,CAAC,GAAG,EAAE,CAAC,SAAS,CAAC,CAAC;QACtD,MAAM,WAAW,CAAC;QAElB,IAAI,MAAM,GAAG,CAAC,EAAE,CAAC;YACf,MAAM,KAAK,CAAC,MAAM,CAAC,CAAC;QACtB,CAAC;IACH,CAAC;IAED,KAAK,CAAC,KAAK,CAAC,GAAW,EAAE,OAAoB,EAAE;QAC7C,MAAM,IAAI,CAAC,gBAAgB,EAAE,CAAC;QAE9B,MAAM,UAAU,GAAG,IAAI,eAAe,EAAE,CAAC;QACzC,MAAM,OAAO,GAAG,UAAU,CAAC,GAAG,EAAE,CAAC,UAAU,CAAC,KAAK,EAAE,EAAE,IAAI,CAAC,OAAO,CAAC,SAAS,CAAC,CAAC;QAE7E,IAAI,CAAC;YACH,OAAO,MAAM,KAAK,CAAC,GAAG,EAAE;gBACtB,GAAG,IAAI;gBACP,OAAO,EAAE;oBACP,YAAY,EAAE,IAAI,CAAC,OAAO,CAAC,SAAS,IAAI,kBAAkB;oBAC1D,MAAM,EAAE,4EAA4E;oBACpF,GAAG,CAAC,IAAI,CAAC,OAAO,IAAI,EAAE,CAAC;iBACxB;gBACD,QAAQ,EAAE,QAAQ;gBAClB,MAAM,EAAE,UAAU,CAAC,MAAM;aAC1B,CAAC,CAAC;QACL,CAAC;gBAAS,CAAC;YACT,YAAY,CAAC,OAAO,CAAC,CAAC;QACxB,CAAC;IACH,CAAC;CACF;AAED,SAAS,iBAAiB,CAAC,WAAmB;IAC5C,OAAO,kBAAkB,CAAC,IAAI,CAAC,CAAC,IAAI,EAAE,EAAE,CAAC,WAAW,CAAC,QAAQ,CAAC,IAAI,CAAC,CAAC,CAAC;AACvE,CAAC;AAED,SAAS,iBAAiB,CAAC,WAAmB;IAC5C,OAAO,kBAAkB,CAAC,IAAI,CAAC,CAAC,IAAI,EAAE,EAAE,CAAC,WAAW,CAAC,QAAQ,CAAC,IAAI,CAAC,CAAC,CAAC;AACvE,CAAC;AAED,SAAS,YAAY,CAAC,IAAY,EAAE,OAA
e;IACjD,MAAM,CAAC,GAAG,OAAO,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC;IAC7B,MAAM,IAAI,GAAG,IAAI,GAAG,EAAU,CAAC;IAC/B,MAAM,KAAK,GAAqB,EAAE,CAAC;IAEnC,CAAC,CAAC,SAAS,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC,EAAE,MAAM,EAAE,EAAE;QAC9B,MAAM,IAAI,GAAG,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC,IAAI,CAAC,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,IAAI,EAAE,CAAC;QACnD,IAAI,CAAC,IAAI,IAAI,IAAI,CAAC,UAAU,CAAC,GAAG,CAAC,IAAI,IAAI,CAAC,UAAU,CAAC,SAAS,CAAC,IAAI,IAAI,CAAC,UAAU,CAAC,MAAM,CAAC,IAAI,IAAI,CAAC,UAAU,CAAC,aAAa,CAAC,EAAE,CAAC;YAC7H,OAAO;QACT,CAAC;QAED,IAAI,CAAC;YACH,MAAM,WAAW,GAAG,YAAY,CAAC,IAAI,GAAG,CAAC,IAAI,EAAE,OAAO,CAAC,CAAC,QAAQ,EAAE,CAAC,CAAC;YACpE,IAAI,IAAI,CAAC,GAAG,CAAC,WAAW,CAAC,EAAE,CAAC;gBAC1B,OAAO;YACT,CAAC;YAED,IAAI,CAAC,GAAG,CAAC,WAAW,CAAC,CAAC;YACtB,KAAK,CAAC,IAAI,CAAC,EAAE,IAAI,EAAE,WAAW,EAAE,CAAC,CAAC;QACpC,CAAC;QAAC,MAAM,CAAC;YACP,KAAK,CAAC,IAAI,CAAC,EAAE,IAAI,EAAE,CAAC,CAAC;QACvB,CAAC;IACH,CAAC,CAAC,CAAC;IAEH,OAAO,KAAK,CAAC;AACf,CAAC;AAED,SAAS,cAAc,CAAC,OAAe,EAAE,SAAiB;IACxD,MAAM,aAAa,GAAG,CAAC,SAAS,CAAC,WAAW,EAAE,EAAE,WAAW,EAAE,GAAG,CAAC,CAAC;IAClE,MAAM,KAAK,GAAG,OAAO,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC;IACrC,MAAM,KAAK,GAAiB,EAAE,CAAC;IAC/B,IAAI,OAAO,GAAG,KAAK,CAAC;IAEpB,KAAK,MAAM,OAAO,IAAI,KAAK,EAAE,CAAC;QAC5B,MAAM,IAAI,GAAG,OAAO,CAAC,OAAO,CAAC,MAAM,EAAE,EAAE,CAAC,CAAC,IAAI,EAAE,CAAC;QAChD,IAAI,CAAC,IAAI,EAAE,CAAC;YACV,SAAS;QACX,CAAC;QAED,MAAM,SAAS,GAAG,IAAI,CAAC,OAAO,CAAC,GAAG,CAAC,CAAC;QACpC,IAAI,SAAS,KAAK,CAAC,CAAC,EAAE,CAAC;YACrB,SAAS;QACX,CAAC;QAED,MAAM,GAAG,GAAG,IAAI,CAAC,KAAK,CAAC,CAAC,EAAE,SAAS,CAAC,CAAC,IAAI,EAAE,CAAC,WAAW,EAAE,CAAC;QAC1D,MAAM,KAAK,GAAG,IAAI,CAAC,KAAK,CAAC,SAAS,GAAG,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC;QAE/C,IAAI,GAAG,KAAK,YAAY,EAAE,CAAC;YACzB,OAAO,GAAG,aAAa,CAAC,IAAI,CAAC,CAAC,KAAK,EAAE,EAAE,CAAC,KAAK,CAAC,WAAW,EAAE,CAAC,QAAQ,CAAC,KAAK,CAAC,CAAC,CAAC;YAC7E,SAAS;QACX,CAAC;QAED,IAAI,CAAC,OAAO,EAAE,CAAC;YACb,SAAS;QACX,CAAC;QAED,IAAI,CAAC,GAAG,KAAK,OAAO,IAAI,GAAG,KAAK,UAAU,CAAC,IAAI,KAAK,EAAE,CAAC;YACrD,KAAK,CAAC,IAAI,CAAC,EAAE,KAAK,EAAE,GAAG,KAAK,OAAO,EAAE,IAAI,EAAE,KAAK,EAAE,CAAC,CAA
C;QACtD,CAAC;IACH,CAAC;IAED,OAAO,EAAE,KAAK,EAAE,CAAC;AACnB,CAAC;AAED,SAAS,iBAAiB,CAAC,SAAiB,EAAE,MAA2B;IACvE,IAAI,CAAC,MAAM,IAAI,MAAM,CAAC,KAAK,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;QACzC,OAAO,IAAI,CAAC;IACd,CAAC;IAED,MAAM,QAAQ,GAAG,IAAI,GAAG,CAAC,SAAS,CAAC,CAAC,QAAQ,IAAI,GAAG,CAAC;IACpD,IAAI,SAAS,GAAsB,IAAI,CAAC;IAExC,KAAK,MAAM,IAAI,IAAI,MAAM,CAAC,KAAK,EAAE,CAAC;QAChC,IAAI,CAAC,QAAQ,CAAC,UAAU,CAAC,IAAI,CAAC,IAAI,CAAC,EAAE,CAAC;YACpC,SAAS;QACX,CAAC;QAED,IAAI,CAAC,SAAS,IAAI,IAAI,CAAC,IAAI,CAAC,MAAM,IAAI,SAAS,CAAC,IAAI,CAAC,MAAM,EAAE,CAAC;YAC5D,SAAS,GAAG,IAAI,CAAC;QACnB,CAAC;IACH,CAAC;IAED,OAAO,SAAS,CAAC,CAAC,CAAC,SAAS,CAAC,KAAK,CAAC,CAAC,CAAC,IAAI,CAAC;AAC5C,CAAC;AAED,SAAS,qBAAqB,CAAC,QAAgB,EAAE,OAAe;IAC9D,IAAI,CAAC,OAAO,EAAE,CAAC;QACb,OAAO,KAAK,CAAC;IACf,CAAC;IAED,IAAI,OAAO,CAAC,QAAQ,CAAC,IAAI,CAAC,EAAE,CAAC;QAC3B,OAAO,QAAQ,CAAC,UAAU,CAAC,OAAO,CAAC,KAAK,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,CAAC,CAAC;IACnD,CAAC;IAED,IAAI,OAAO,CAAC,UAAU,CAAC,GAAG,CAAC,EAAE,CAAC;QAC5B,OAAO,QAAQ,CAAC,QAAQ,CAAC,OAAO,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC;IAC7C,CAAC;IAED,IAAI,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,EAAE,CAAC;QAC1B,OAAO,QAAQ,CAAC,UAAU,CAAC,OAAO,CAAC,KAAK,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,CAAC,CAAC;IACnD,CAAC;IAED,OAAO,QAAQ,KAAK,OAAO,CAAC;AAC9B,CAAC;AAED,SAAS,eAAe,CAAC,SAAiB,EAAE,QAAkB;IAC5D,MAAM,GAAG,GAAG,IAAI,GAAG,CAAC,SAAS,CAAC,CAAC;IAC/B,MAAM,QAAQ,GAAG,GAAG,GAAG,CAAC,QAAQ,IAAI,GAAG,GAAG,GAAG,CAAC,MAAM,IAAI,EAAE,EAAE,CAAC;IAC7D,OAAO,QAAQ,CAAC,IAAI,CAAC,CAAC,OAAO,EAAE,EAAE,CAAC,qBAAqB,CAAC,QAAQ,EAAE,OAAO,CAAC,IAAI,qBAAqB,CAAC,GAAG,CAAC,QAAQ,IAAI,GAAG,EAAE,OAAO,CAAC,CAAC;QAChI,CAAC,CAAC,QAAQ;QACV,CAAC,CAAC,IAAI,CAAC;AACX,CAAC;AAED,SAAS,cAAc,CACrB,KAA0B,EAC1B,SAAiB,EACjB,OAA8C,EAC9C,KAAa,EACb,MAAc,EACd,MAAc,EACd,UAA6C;IAE7C,MAAM,YAAY,GAAG,eAAe,CAAC,SAAS,EAAE,OAAO,CAAC,eAAe,CAAC,CAAC;IACzE,IAAI,CAAC,YAAY,EAAE,CAAC;QAClB,OAAO,KAAK,CAAC;IACf,CAAC;IAED,UAAU,EAAE,CAAC;QACX,KAAK;QACL,OAAO,EAAE,KAAK;QACd,MAAM;QACN,KAAK;QACL,MAAM;QACN,OAAO,EAAE,YAAY,YAAY,aAAa;KAC/C,CAAC,CAAC;IACH,OAAO,IAAI,CAAC;AACd,CAAC;AAED,KAAK
,UAAU,WAAW,CAAC,MAAc,EAAE,OAA2B;IACpE,IAAI,CAAC;QACH,MAAM,QAAQ,GAAG,MAAM,OAAO,CAAC,KAAK,CAAC,GAAG,MAAM,aAAa,CAAC,CAAC;QAC7D,IAAI,CAAC,QAAQ,CAAC,EAAE,EAAE,CAAC;YACjB,OAAO,IAAI,CAAC;QACd,CAAC;QAED,MAAM,OAAO,GAAG,MAAM,QAAQ,CAAC,IAAI,EAAE,CAAC;QACtC,OAAO,cAAc,CAAC,OAAO,EAAE,kBAAkB,CAAC,CAAC;IACrD,CAAC;IAAC,MAAM,CAAC;QACP,OAAO,IAAI,CAAC;IACd,CAAC;AACH,CAAC;AAED,KAAK,UAAU,cAAc,CAAC,GAAW,EAAE,UAA+B,EAAE,OAA2B;IACrG,MAAM,QAAQ,GAAG,MAAM,OAAO,CAAC,KAAK,CAAC,GAAG,CAAC,CAAC;IAC1C,IAAI,CAAC,QAAQ,CAAC,EAAE,EAAE,CAAC;QACjB,MAAM,IAAI,KAAK,CAAC,sBAAsB,GAAG,gBAAgB,QAAQ,CAAC,MAAM,EAAE,CAAC,CAAC;IAC9E,CAAC;IAED,MAAM,WAAW,GAAG,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,cAAc,CAAC,IAAI,YAAY,CAAC,CAAC,WAAW,EAAE,CAAC;IACzF,MAAM,eAAe,GAAG,YAAY,CAAC,QAAQ,CAAC,GAAG,IAAI,GAAG,CAAC,CAAC;IAC1D,MAAM,YAAY,GAAG,WAAW,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,CAAC,eAAe,CAAC,CAAC,CAAC;IAExE,IAAI,iBAAiB,CAAC,WAAW,CAAC,EAAE,CAAC;QACnC,MAAM,IAAI,GAAG,MAAM,QAAQ,CAAC,IAAI,EAAE,CAAC;QACnC,OAAO;YACL,UAAU;YACV,eAAe;YACf,GAAG,EAAE,eAAe;YACpB,kBAAkB,EAAE,uBAAuB,CAAC,eAAe,CAAC;YAC5D,WAAW;YACX,IAAI;YACJ,YAAY,EAAE,IAAI;YAClB,eAAe,EAAE,YAAY,CAAC,IAAI,EAAE,eAAe,CAAC;YACpD,YAAY;YACZ,UAAU,EAAE,QAAQ,CAAC,MAAM;SAC5B,CAAC;IACJ,CAAC;IAED,IAAI,iBAAiB,CAAC,WAAW,CAAC,EAAE,CAAC;QACnC,MAAM,IAAI,GAAG,MAAM,QAAQ,CAAC,IAAI,EAAE,CAAC;QACnC,OAAO;YACL,UAAU;YACV,eAAe;YACf,GAAG,EAAE,eAAe;YACpB,kBAAkB,EAAE,uBAAuB,CAAC,eAAe,CAAC;YAC5D,WAAW;YACX,IAAI;YACJ,eAAe,EAAE,EAAE;YACnB,YAAY;YACZ,UAAU,EAAE,QAAQ,CAAC,MAAM;SAC5B,CAAC;IACJ,CAAC;IAED,OAAO,IAAI,CAAC;AACd,CAAC;AAED,KAAK,UAAU,UAAU,CACvB,OAAqB,EACrB,UAA+B,EAC/B,OAA2B,EAC3B,KAAa,EACb,MAAc,EACd,UAA6C;IAE7C,MAAM,MAAM,GAAG,OAAO,CAAC,MAAM,CAAC;IAE9B,OAAO,OAAO,CAAC,GAAG,CAChB,OAAO,CAAC,GAAG,CAAC,KAAK,EAAE,KAAK,EAAE,KAAK,EAAE,EAAE;QACjC,UAAU,EAAE,CAAC;YACX,KAAK,EAAE,UAAU;YACjB,OAAO,EAAE,KAAK,GAAG,KAAK,GAAG,CAAC;YAC1B,MAAM;YACN,KAAK;YACL,MAAM;YACN,OAAO,EAAE,YAAY,KAAK,CAAC,GAAG,EAAE;SACjC,CAAC,CAAC;QAEH,IAAI,CAAC;YACH,MAAM,QAAQ,GAAG,MAAM,cAAc,CAAC,KAAK,CAAC,GAAG,EAAE,UAAU,EAAE,OAAO,CAAC,CAAC;YACtE,IAAI,CAAC,QAAQ,EAAE,CA
AC;gBACd,OAAO;oBACL,GAAG,EAAE,KAAK,CAAC,GAAG;oBACd,KAAK,EAAE,uCAAuC,KAAK,CAAC,GAAG,EAAE;iBAC1D,CAAC;YACJ,CAAC;YAED,OAAO;gBACL,GAAG,EAAE,KAAK,CAAC,GAAG;gBACd,QAAQ;aACT,CAAC;QACJ,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,OAAO;gBACL,GAAG,EAAE,KAAK,CAAC,GAAG;gBACd,KAAK,EAAE,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,2BAA2B,KAAK,CAAC,GAAG,EAAE;aACvF,CAAC;QACJ,CAAC;IACH,CAAC,CAAC,CACH,CAAC;AACJ,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,YAAY,CAChC,OAAe,EACf,OAAqB,EACrB,UAA6C;IAE7C,MAAM,cAAc,GAAG,YAAY,CAAC,OAAO,CAAC,CAAC;IAC7C,MAAM,IAAI,GAAG,IAAI,GAAG,CAAC,cAAc,CAAC,CAAC;IACrC,MAAM,OAAO,GAAG,IAAI,kBAAkB,CAAC;QACrC,SAAS,EAAE,OAAO,CAAC,SAAS;QAC5B,WAAW,EAAE,OAAO,CAAC,WAAW;QAChC,SAAS,EAAE,OAAO,CAAC,SAAS;KAC7B,CAAC,CAAC;IACH,MAAM,MAAM,GAAG,MAAM,WAAW,CAAC,IAAI,CAAC,MAAM,EAAE,OAAO,CAAC,CAAC;IACvD,MAAM,KAAK,GAAiB,CAAC,EAAE,GAAG,EAAE,cAAc,EAAE,KAAK,EAAE,CAAC,EAAE,CAAC,CAAC;IAChE,MAAM,MAAM,GAAG,IAAI,GAAG,CAAS,CAAC,cAAc,CAAC,CAAC,CAAC;IACjD,MAAM,IAAI,GAAG,IAAI,GAAG,EAAU,CAAC;IAC/B,MAAM,SAAS,GAAkB,EAAE,CAAC;IACpC,MAAM,MAAM,GAAa,EAAE,CAAC;IAE5B,OAAO,KAAK,CAAC,MAAM,GAAG,CAAC,IAAI,SAAS,CAAC,MAAM,GAAG,OAAO,CAAC,QAAQ,EAAE,CAAC;QAC/D,MAAM,KAAK,GAAiB,EAAE,CAAC;QAE/B,OAAO,KAAK,CAAC,MAAM,GAAG,CAAC,IAAI,KAAK,CAAC,MAAM,GAAG,OAAO,CAAC,WAAW,IAAI,SAAS,CAAC,MAAM,GAAG,KAAK,CAAC,MAAM,GAAG,OAAO,CAAC,QAAQ,EAAE,CAAC;YACpH,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,EAAE,CAAC;YAC9B,IAAI,CAAC,OAAO,IAAI,IAAI,CAAC,GAAG,CAAC,OAAO,CAAC,GAAG,CAAC,EAAE,CAAC;gBACtC,SAAS;YACX,CAAC;YAED,IAAI,CAAC,GAAG,CAAC,OAAO,CAAC,GAAG,CAAC,CAAC;YAEtB,IAAI,cAAc,CAAC,OAAO,EAAE,OAAO,CAAC,GAAG,EAAE,OAAO,EAAE,SAAS,CAAC,MAAM,EAAE,KAAK,CAAC,MAAM,EAAE,KAAK,CAAC,MAAM,EAAE,UAAU,CAAC,EAAE,CAAC;gBAC5G,SAAS;YACX,CAAC;YAED,IAAI,CAAC,iBAAiB,CAAC,OAAO,CAAC,GAAG,EAAE,MAAM,CAAC,EAAE,CAAC;gBAC5C,MAAM,CAAC,IAAI,CAAC,0BAA0B,OAAO,CAAC,GAAG,EAAE,CAAC,CAAC;gBACrD,SAAS;YACX,CAAC;YAED,KAAK,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC;QACtB,CAAC;QAED,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC,EAAE,CAAC;YACvB,SAAS;QACX,CAAC;QAED,MAAM,OAAO,GAAG,MAAM,UAAU,CAAC,KAAK,EAAE,OAAO,EAAE,OAAO,EAAE,SAAS,CAAC,MA
AM,EAAE,KAAK,CAAC,MAAM,EAAE,UAAU,CAAC,CAAC;QAEtG,KAAK,MAAM,CAAC,KAAK,EAAE,MAAM,CAAC,IAAI,OAAO,CAAC,OAAO,EAAE,EAAE,CAAC;YAChD,IAAI,MAAM,CAAC,KAAK,EAAE,CAAC;gBACjB,MAAM,CAAC,IAAI,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC;gBAC1B,UAAU,EAAE,CAAC;oBACX,KAAK,EAAE,OAAO;oBACd,OAAO,EAAE,SAAS,CAAC,MAAM;oBACzB,MAAM,EAAE,KAAK,CAAC,MAAM;oBACpB,KAAK,EAAE,SAAS,CAAC,MAAM;oBACvB,MAAM,EAAE,IAAI,CAAC,GAAG,CAAC,CAAC,EAAE,OAAO,CAAC,MAAM,GAAG,KAAK,GAAG,CAAC,CAAC;oBAC/C,OAAO,EAAE,MAAM,CAAC,KAAK;iBACtB,CAAC,CAAC;gBACH,SAAS;YACX,CAAC;YAED,MAAM,QAAQ,GAAG,MAAM,CAAC,QAAQ,CAAC;YACjC,IAAI,CAAC,QAAQ,EAAE,CAAC;gBACd,SAAS;YACX,CAAC;YAED,SAAS,CAAC,IAAI,CAAC,QAAQ,CAAC,CAAC;YACzB,MAAM,OAAO,GAAG,KAAK,CAAC,KAAK,CAAC,CAAC;YAC7B,IAAI,CAAC,OAAO,IAAI,CAAC,QAAQ,CAAC,IAAI,IAAI,OAAO,CAAC,KAAK,IAAI,OAAO,CAAC,KAAK,EAAE,CAAC;gBACjE,SAAS;YACX,CAAC;YAED,KAAK,MAAM,IAAI,IAAI,QAAQ,CAAC,eAAe,EAAE,CAAC;gBAC5C,IAAI,CAAC,IAAI,CAAC,WAAW,EAAE,CAAC;oBACtB,SAAS;gBACX,CAAC;gBAED,MAAM,cAAc,GAAG,YAAY,CAAC,IAAI,CAAC,WAAW,CAAC,CAAC;gBACtD,MAAM,MAAM,GAAG,IAAI,GAAG,CAAC,cAAc,CAAC,CAAC;gBACvC,IAAI,MAAM,CAAC,MAAM,KAAK,IAAI,CAAC,MAAM,IAAI,IAAI,CAAC,GAAG,CAAC,cAAc,CAAC,IAAI,MAAM,CAAC,GAAG,CAAC,cAAc,CAAC,EAAE,CAAC;oBAC5F,SAAS;gBACX,CAAC;gBAED,IAAI,cAAc,CAAC,OAAO,EAAE,cAAc,EAAE,OAAO,EAAE,SAAS,CAAC,MAAM,EAAE,KAAK,CAAC,MAAM,EAAE,CAAC,EAAE,UAAU,CAAC,EAAE,CAAC;oBACpG,IAAI,CAAC,GAAG,CAAC,cAAc,CAAC,CAAC;oBACzB,SAAS;gBACX,CAAC;gBAED,KAAK,CAAC,IAAI,CAAC,EAAE,GAAG,EAAE,cAAc,EAAE,KAAK,EAAE,OAAO,CAAC,KAAK,GAAG,CAAC,EAAE,CAAC,CAAC;gBAC9D,MAAM,CAAC,GAAG,CAAC,cAAc,CAAC,CAAC;YAC7B,CAAC;QACH,CAAC;IACH,CAAC;IAED,OAAO,EAAE,SAAS,EAAE,MAAM,EAAE,CAAC;AAC/B,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,sBAAsB,CAC1C,IAAc,EACd,OAAoC,EACpC,UAAqB,EACrB,UAA6C;IAE7C,MAAM,OAAO,GAAG,IAAI,kBAAkB,CAAC;QACrC,SAAS,EAAE,OAAO,CAAC,SAAS;QAC5B,WAAW,EAAE,OAAO,CAAC,WAAW;QAChC,SAAS,EAAE,OAAO,CAAC,SAAS;KAC7B,CAAC,CAAC;IAEH,MAAM,WAAW,GAAG,IAAI,GAAG,EAA+B,CAAC;IAC3D,MAAM,cAAc,GAAG,KAAK,CAAC,IAAI,CAAC,IAAI,GAAG,CAAC,IAAI,CAAC,GAAG,CAAC,CAAC,GAAG,EAAE,EAAE,CAAC,YAAY,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,C
AAC,EAAE,OAAO,CAAC,QAAQ,CAAC,CAAC;IAC5G,MAAM,SAAS,GAAkB,EAAE,CAAC;IACpC,MAAM,MAAM,GAAa,EAAE,CAAC;IAE5B,KAAK,IAAI,KAAK,GAAG,CAAC,EAAE,KAAK,GAAG,cAAc,CAAC,MAAM,EAAE,KAAK,IAAI,OAAO,CAAC,WAAW,EAAE,CAAC;QAChF,MAAM,KAAK,GAAG,cAAc,CAAC,KAAK,CAAC,KAAK,EAAE,KAAK,GAAG,OAAO,CAAC,WAAW,CAAC,CAAC;QACvE,MAAM,MAAM,GAAG,KAAK,CAAC,MAAM,CAAC;QAC5B,MAAM,OAAO,GAAG,MAAM,OAAO,CAAC,GAAG,CAC/B,KAAK,CAAC,GAAG,CAAC,KAAK,EAAE,GAAG,EAAE,MAAM,EAAE,EAAE;YAC9B,IAAI,cAAc,CAAC,UAAU,EAAE,GAAG,EAAE,OAAO,EAAE,SAAS,CAAC,MAAM,EAAE,IAAI,CAAC,GAAG,CAAC,CAAC,EAAE,cAAc,CAAC,MAAM,GAAG,KAAK,GAAG,MAAM,GAAG,CAAC,CAAC,EAAE,MAAM,EAAE,UAAU,CAAC,EAAE,CAAC;gBAC5I,OAAO,EAAE,GAAG,EAA+B,CAAC;YAC9C,CAAC;YAED,MAAM,MAAM,GAAG,IAAI,GAAG,CAAC,GAAG,CAAC,CAAC,MAAM,CAAC;YACnC,IAAI,CAAC,WAAW,CAAC,GAAG,CAAC,MAAM,CAAC,EAAE,CAAC;gBAC7B,WAAW,CAAC,GAAG,CAAC,MAAM,EAAE,MAAM,WAAW,CAAC,MAAM,EAAE,OAAO,CAAC,CAAC,CAAC;YAC9D,CAAC;YAED,IAAI,CAAC,iBAAiB,CAAC,GAAG,EAAE,WAAW,CAAC,GAAG,CAAC,MAAM,CAAC,IAAI,IAAI,CAAC,EAAE,CAAC;gBAC7D,OAAO;oBACL,GAAG;oBACH,KAAK,EAAE,0BAA0B,GAAG,EAAE;iBACV,CAAC;YACjC,CAAC;YAED,UAAU,EAAE,CAAC;gBACX,KAAK,EAAE,UAAU;gBACjB,OAAO,EAAE,KAAK,GAAG,MAAM,GAAG,CAAC;gBAC3B,MAAM,EAAE,IAAI,CAAC,GAAG,CAAC,CAAC,EAAE,cAAc,CAAC,MAAM,GAAG,KAAK,GAAG,MAAM,GAAG,CAAC,CAAC;gBAC/D,KAAK,EAAE,SAAS,CAAC,MAAM;gBACvB,MAAM;gBACN,OAAO,EAAE,YAAY,GAAG,EAAE;aAC3B,CAAC,CAAC;YAEH,IAAI,CAAC;gBACH,OAAO;oBACL,GAAG;oBACH,QAAQ,EAAE,MAAM,cAAc,CAAC,GAAG,EAAE,UAAU,EAAE,OAAO,CAAC;iBAC5B,CAAC;YACjC,CAAC;YAAC,OAAO,KAAK,EAAE,CAAC;gBACf,OAAO;oBACL,GAAG;oBACH,KAAK,EAAE,KAAK,YAAY,KAAK,CAAC,CAAC,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC,CAAC,2BAA2B,GAAG,EAAE;iBACpD,CAAC;YACjC,CAAC;QACH,CAAC,CAAC,CACH,CAAC;QAEF,KAAK,MAAM,MAAM,IAAI,OAAO,EAAE,CAAC;YAC7B,IAAI,MAAM,CAAC,KAAK,EAAE,CAAC;gBACjB,MAAM,CAAC,IAAI,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC;gBAC1B,SAAS;YACX,CAAC;YAED,IAAI,MAAM,CAAC,QAAQ,EAAE,CAAC;gBACpB,SAAS,CAAC,IAAI,CAAC,MAAM,CAAC,QAAQ,CAAC,CAAC;YAClC,CAAC;QACH,CAAC;IACH,CAAC;IAED,OAAO,EAAE,SAAS,EAAE,MAAM,EAAE,CAAC;AAC/B,CAAC"}
|