aem-live-docs-mcp 1.0.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Jigar Karangiya
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,163 @@
1
+ # AEM Live Docs MCP
2
+
3
+ An MCP (Model Context Protocol) server that gives AI assistants (Cursor, Claude, etc.) direct access to the official **Adobe Experience Manager (AEM)** documentation at [aem.live](https://www.aem.live).
4
+
5
+ Every page on aem.live is available as a native markdown file. This MCP fetches the source markdown directly and falls back to converting the HTML page only when the markdown request fails, so content normally arrives without conversion artefacts.
6
+
7
+ ---
8
+
9
+ ## What's Indexed
10
+
11
+ ~185 pages from [www.aem.live](https://www.aem.live), organised into sections:
12
+
13
+ | Section | Contents |
14
+ |---|---|
15
+ | `docs` | Main documentation — authoring, CDN setup, sidekick, redirects, metadata, security, etc. |
16
+ | `developer` | Developer guides — tutorial, block collection, anatomy of a project, markup/sections, indexing, forms, etc. |
17
+ | `blog` | AEM team blog posts |
18
+ | `business` | Business & community pages |
19
+ | `community` | Community hub |
20
+
21
+ ---
22
+
23
+ ## Tools
24
+
25
+ | Tool | Description |
26
+ |---|---|
27
+ | `search_aem_docs` | BM25 search with synonym expansion and fuzzy matching |
28
+ | `get_aem_doc_content` | Fetch full page as clean markdown |
29
+ | `get_aem_page_toc` | Table of contents (headings only) for cheap page preview |
30
+ | `get_aem_code_examples` | Extract only code blocks from a page |
31
+ | `get_related_aem_docs` | Find sibling pages in the same folder |
32
+ | `list_aem_doc_sections` | List all sections with page counts |
33
+ | `lookup_aem_topic` | Look up a topic and auto-fetch the top result |
34
+ | `multi_aem_search` | Run multiple queries at once, de-duplicated |
35
+ | `refresh_aem_index` | Force-refresh the cached page index |
36
+
37
+ ## Prompts
38
+
39
+ | Prompt | Description |
40
+ |---|---|
41
+ | `aem-site-setup` | Step-by-step guide to set up a new AEM site |
42
+ | `aem-block-creator` | Design and scaffold an AEM block/component |
43
+ | `aem-performance-guide` | Achieve a Lighthouse score of 100 |
44
+ | `aem-cdn-setup` | Configure a CDN for AEM production delivery |
45
+
46
+ ## Resources
47
+
48
+ | Resource | URI | Description |
49
+ |---|---|---|
50
+ | Sections | `aem-docs://sections` | All sections with page counts |
51
+ | Stats | `aem-docs://stats` | Server status, uptime, index size |
52
+ | Section docs | `aem-docs://docs/{section}` | Browse pages in a section |
53
+
54
+ ---
55
+
56
+ ## Quick Setup (Cursor)
57
+
58
+ ```bash
59
+ bash setup-cursor.sh
60
+ ```
61
+
62
+ Or add manually to `~/.cursor/mcp.json`:
63
+
64
+ ```json
65
+ {
66
+ "mcpServers": {
67
+ "aem-live-docs": {
68
+ "command": "npx",
69
+ "args": ["-y", "aem-live-docs-mcp"]
70
+ }
71
+ }
72
+ }
73
+ ```
74
+
75
+ ### From local build
76
+
77
+ ```bash
78
+ npm install && npm run build
79
+
80
+ # Add to ~/.cursor/mcp.json:
81
+ {
82
+ "mcpServers": {
83
+ "aem-live-docs": {
84
+ "command": "node",
85
+ "args": ["/path/to/aem-live-docs-mcp/dist/index.js"]
86
+ }
87
+ }
88
+ }
89
+ ```
90
+
91
+ ---
92
+
93
+ ## Example Prompts
94
+
95
+ ```
96
+ "How do I set up a new AEM site with Google Drive as content source?"
97
+ "Show me how to create an AEM hero block"
98
+ "How do I achieve a Lighthouse score of 100 in AEM?"
99
+ "Configure Cloudflare for AEM with push invalidation"
100
+ "What is the AEM Sidekick and how do I install it?"
101
+ "How does the AEM query-index.json work for indexing?"
102
+ "What are AEM fragments and how do I use them?"
103
+ "Explain the AEM block collection and auto-blocking"
104
+ ```
105
+
106
+ ---
107
+
108
+ ## How Content Fetching Works
109
+
110
+ AEM live serves every page as clean markdown at `<page-url>.md`. For example:
111
+
112
+ - `https://www.aem.live/docs/faq` → `https://www.aem.live/docs/faq.md`
113
+ - `https://www.aem.live/developer/tutorial` → `https://www.aem.live/developer/tutorial.md`
114
+
115
+ This means content is always fresh and authoritative — no GitHub repo mapping required.
116
+
117
+ The page index is loaded from the [query-index.json](https://www.aem.live/query-index.json) endpoint which provides `path`, `title`, `description`, and `lastModified` for all ~185 pages in a single request.
118
+
119
+ ---
120
+
121
+ ## Caching
122
+
123
+ | Cache | TTL | Location |
124
+ |---|---|---|
125
+ | Page index | 24 h | `~/.cache/aem-live-docs-mcp/index-cache.json` |
126
+ | Page content (disk) | 7 days | `~/.cache/aem-live-docs-mcp/pages/<hash>.md` |
127
+ | Page content (memory) | 1 h | In-process LRU (100 entries) |
128
+
129
+ Use `refresh_aem_index` to force a re-fetch of the page index.
130
+
131
+ ## Environment Variables
132
+
133
+ | Variable | Default | Description |
134
+ |---|---|---|
135
+ | `AEM_BASE_URL` | `https://www.aem.live` | AEM site base URL |
136
+ | `QUERY_INDEX_URL` | `https://www.aem.live/query-index.json` | Query index endpoint |
137
+ | `CACHE_DIR` | `~/.cache/aem-live-docs-mcp` | Cache directory |
138
+ | `MAX_CONTENT_LENGTH` | `15000` | Max chars per page (smart truncation at heading boundaries) |
139
+ | `PORT` | `3000` | HTTP port (when using `--http` mode) |
140
+ | `LOG_LEVEL` | `info` | Log level: debug/info/warn/error |
141
+
142
+ ## HTTP Mode
143
+
144
+ ```bash
145
+ npm run start:http
146
+ # or
147
+ node dist/index.js --http
148
+ ```
149
+
150
+ Server listens on `http://localhost:3000` with CORS enabled.
151
+
152
+ ---
153
+
154
+ ## Related MCPs
155
+
156
+ - [adobe-commerce-docs-mcp](https://github.com/jigarkkarangiya/adobe-commerce-docs-mcp) — Adobe Commerce merchant/user docs
157
+ - [adobe-commerce-dev-docs-mcp](https://github.com/jigarkkarangiya/adobe-commerce-dev-docs-mcp) — Adobe Commerce developer docs
158
+
159
+ ---
160
+
161
+ ## License
162
+
163
+ MIT © Jigar Karangiya
@@ -0,0 +1,19 @@
1
+ export declare const config: {
2
+ version: string;
3
+ baseUrl: string;
4
+ queryIndexUrl: string;
5
+ cacheDir: string;
6
+ indexCacheFile: string;
7
+ pageCacheDir: string;
8
+ indexCacheTtlMs: number;
9
+ pageCacheMemoryMax: number;
10
+ pageCacheMemoryTtlMs: number;
11
+ pageCacheDiskTtlMs: number;
12
+ maxContentLength: number;
13
+ maxConcurrentFetches: number;
14
+ httpPort: number;
15
+ logLevel: "debug" | "info" | "warn" | "error";
16
+ userAgent: string;
17
+ docPathPrefixes: string[];
18
+ };
19
+ //# sourceMappingURL=config.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AA8BA,eAAO,MAAM,MAAM;;;;;;;;;;;;;;cA0BwB,OAAO,GAAG,MAAM,GAAG,MAAM,GAAG,OAAO;;;CAM7E,CAAC"}
package/dist/config.js ADDED
@@ -0,0 +1,45 @@
1
+ import { join } from "node:path";
2
+ import { homedir } from "node:os";
3
+ function envInt(key, fallback) {
4
+ const v = process.env[key];
5
+ if (!v)
6
+ return fallback;
7
+ const n = parseInt(v, 10);
8
+ return isNaN(n) ? fallback : n;
9
+ }
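To make the fallback rules of `envInt` concrete, the sketch below restates it as a pure function over a plain object. The name `readInt`, the `env` parameter, and the sample keys are illustrative assumptions and not part of the package; only the parsing logic mirrors the function above.

```javascript
// Hypothetical pure variant of envInt: reads from an explicit `env` object
// instead of process.env so the behaviour is easy to inspect.
function readInt(env, key, fallback) {
  const v = env[key];
  if (!v) return fallback; // unset or empty string -> fallback
  const n = parseInt(v, 10);
  return Number.isNaN(n) ? fallback : n;
}

// Sample values chosen purely for illustration.
const env = { PORT: "8080", EMPTY: "", WORDS: "abc", MIXED: "15000chars", ZERO: "0" };

const results = {
  port: readInt(env, "PORT", 3000),    // parsed as 8080
  missing: readInt(env, "NOPE", 3000), // key absent, falls back to 3000
  empty: readInt(env, "EMPTY", 3000),  // empty string is falsy, falls back
  words: readInt(env, "WORDS", 3000),  // parseInt yields NaN, falls back
  mixed: readInt(env, "MIXED", 3000),  // parseInt stops at the first non-digit
  zero: readInt(env, "ZERO", 3000),    // "0" is a non-empty string, so 0 is kept
};
console.log(results);
```

Two details are easy to miss: `parseInt` accepts a numeric prefix, so a value such as `15000chars` is read as `15000`, and a literal `"0"` is honoured because the emptiness check is applied to the string, not to the parsed number.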
10
+ function envStr(key, fallback) {
11
+ return process.env[key] || fallback;
12
+ }
13
+ const cacheDir = envStr("CACHE_DIR", join(homedir(), ".cache", "aem-live-docs-mcp"));
14
+ // Path prefixes considered part of the AEM documentation.
15
+ // Pages outside these prefixes (footer, gnav, tools/organizer, experiments, etc.)
16
+ // are excluded from the index.
17
+ const docPathPrefixes = [
18
+ "/docs",
19
+ "/developer",
20
+ "/blog",
21
+ "/business",
22
+ "/community",
23
+ ];
24
+ export const config = {
25
+ version: "1.0.0",
26
+ // AEM live base URL — all page content is available as markdown at <baseUrl><path>.md
27
+ baseUrl: envStr("AEM_BASE_URL", "https://www.aem.live"),
28
+ // AEM live query-index.json endpoint — provides the full page listing with
29
+ // path, title, description, lastModified in one request.
30
+ queryIndexUrl: envStr("QUERY_INDEX_URL", "https://www.aem.live/query-index.json"),
31
+ cacheDir,
32
+ indexCacheFile: join(cacheDir, "index-cache.json"),
33
+ pageCacheDir: join(cacheDir, "pages"),
34
+ indexCacheTtlMs: envInt("INDEX_CACHE_TTL_MS", 24 * 60 * 60 * 1000),
35
+ pageCacheMemoryMax: envInt("PAGE_CACHE_MAX", 100),
36
+ pageCacheMemoryTtlMs: envInt("PAGE_CACHE_TTL_MS", 60 * 60 * 1000),
37
+ pageCacheDiskTtlMs: envInt("PAGE_DISK_CACHE_TTL_MS", 7 * 24 * 60 * 60 * 1000),
38
+ maxContentLength: envInt("MAX_CONTENT_LENGTH", 15000),
39
+ maxConcurrentFetches: envInt("MAX_CONCURRENT_FETCHES", 5),
40
+ httpPort: envInt("PORT", 3000),
41
+ logLevel: envStr("LOG_LEVEL", "info"),
42
+ userAgent: "Mozilla/5.0 (compatible; AEMLiveDocsMCP/1.0; +https://github.com/jigarkkarangiya/aem-live-docs-mcp)",
43
+ docPathPrefixes,
44
+ };
45
+ //# sourceMappingURL=config.js.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"config.js","sourceRoot":"","sources":["../src/config.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,IAAI,EAAE,MAAM,WAAW,CAAC;AACjC,OAAO,EAAE,OAAO,EAAE,MAAM,SAAS,CAAC;AAElC,SAAS,MAAM,CAAC,GAAW,EAAE,QAAgB;IAC3C,MAAM,CAAC,GAAG,OAAO,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC;IAC3B,IAAI,CAAC,CAAC;QAAE,OAAO,QAAQ,CAAC;IACxB,MAAM,CAAC,GAAG,QAAQ,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC;IAC1B,OAAO,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,QAAQ,CAAC,CAAC,CAAC,CAAC,CAAC;AACjC,CAAC;AAED,SAAS,MAAM,CAAC,GAAW,EAAE,QAAgB;IAC3C,OAAO,OAAO,CAAC,GAAG,CAAC,GAAG,CAAC,IAAI,QAAQ,CAAC;AACtC,CAAC;AAED,MAAM,QAAQ,GAAG,MAAM,CACrB,WAAW,EACX,IAAI,CAAC,OAAO,EAAE,EAAE,QAAQ,EAAE,mBAAmB,CAAC,CAC/C,CAAC;AAEF,0DAA0D;AAC1D,kFAAkF;AAClF,+BAA+B;AAC/B,MAAM,eAAe,GAAG;IACtB,OAAO;IACP,YAAY;IACZ,OAAO;IACP,WAAW;IACX,YAAY;CACb,CAAC;AAEF,MAAM,CAAC,MAAM,MAAM,GAAG;IACpB,OAAO,EAAE,OAAO;IAEhB,sFAAsF;IACtF,OAAO,EAAE,MAAM,CAAC,cAAc,EAAE,sBAAsB,CAAC;IAEvD,2EAA2E;IAC3E,yDAAyD;IACzD,aAAa,EAAE,MAAM,CACnB,iBAAiB,EACjB,uCAAuC,CACxC;IAED,QAAQ;IACR,cAAc,EAAE,IAAI,CAAC,QAAQ,EAAE,kBAAkB,CAAC;IAClD,YAAY,EAAE,IAAI,CAAC,QAAQ,EAAE,OAAO,CAAC;IAErC,eAAe,EAAE,MAAM,CAAC,oBAAoB,EAAE,EAAE,GAAG,EAAE,GAAG,EAAE,GAAG,IAAI,CAAC;IAClE,kBAAkB,EAAE,MAAM,CAAC,gBAAgB,EAAE,GAAG,CAAC;IACjD,oBAAoB,EAAE,MAAM,CAAC,mBAAmB,EAAE,EAAE,GAAG,EAAE,GAAG,IAAI,CAAC;IACjE,kBAAkB,EAAE,MAAM,CAAC,wBAAwB,EAAE,CAAC,GAAG,EAAE,GAAG,EAAE,GAAG,EAAE,GAAG,IAAI,CAAC;IAE7E,gBAAgB,EAAE,MAAM,CAAC,oBAAoB,EAAE,KAAK,CAAC;IACrD,oBAAoB,EAAE,MAAM,CAAC,wBAAwB,EAAE,CAAC,CAAC;IAEzD,QAAQ,EAAE,MAAM,CAAC,MAAM,EAAE,IAAI,CAAC;IAC9B,QAAQ,EAAE,MAAM,CAAC,WAAW,EAAE,MAAM,CAAwC;IAE5E,SAAS,EACP,qGAAqG;IAEvG,eAAe;CAChB,CAAC"}
@@ -0,0 +1,16 @@
1
+ export interface TocEntry {
2
+ level: number;
3
+ title: string;
4
+ }
5
+ export declare function clearMemoryCache(): void;
6
+ export declare function buildMarkdownUrl(pageUrl: string): string;
7
+ export declare function cleanAemMarkdown(raw: string): string;
8
+ export declare function smartTruncate(content: string, maxLen: number): string;
9
+ export declare function fetchPageContent(url: string): Promise<string>;
10
+ export declare function fetchRawContent(url: string): Promise<string>;
11
+ export declare function extractCodeExamples(markdown: string): {
12
+ language: string;
13
+ code: string;
14
+ }[];
15
+ export declare function extractPageToc(markdown: string): TocEntry[];
16
+ //# sourceMappingURL=content.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"content.d.ts","sourceRoot":"","sources":["../src/content.ts"],"names":[],"mappings":"AAOA,MAAM,WAAW,QAAQ;IACvB,KAAK,EAAE,MAAM,CAAC;IACd,KAAK,EAAE,MAAM,CAAC;CACf;AAgCD,wBAAgB,gBAAgB,IAAI,IAAI,CAEvC;AAqCD,wBAAgB,gBAAgB,CAAC,OAAO,EAAE,MAAM,GAAG,MAAM,CASxD;AA4ED,wBAAgB,gBAAgB,CAAC,GAAG,EAAE,MAAM,GAAG,MAAM,CAgBpD;AAID,wBAAgB,aAAa,CAAC,OAAO,EAAE,MAAM,EAAE,MAAM,EAAE,MAAM,GAAG,MAAM,CAgCrE;AAWD,wBAAsB,gBAAgB,CAAC,GAAG,EAAE,MAAM,GAAG,OAAO,CAAC,MAAM,CAAC,CAmBnE;AAED,wBAAsB,eAAe,CAAC,GAAG,EAAE,MAAM,GAAG,OAAO,CAAC,MAAM,CAAC,CAOlE;AAID,wBAAgB,mBAAmB,CACjC,QAAQ,EAAE,MAAM,GACf;IAAE,QAAQ,EAAE,MAAM,CAAC;IAAC,IAAI,EAAE,MAAM,CAAA;CAAE,EAAE,CAWtC;AAED,wBAAgB,cAAc,CAAC,QAAQ,EAAE,MAAM,GAAG,QAAQ,EAAE,CAW3D"}
@@ -0,0 +1,239 @@
1
+ import { readFile, writeFile, mkdir, stat } from "node:fs/promises";
2
+ import { join } from "node:path";
3
+ import { createHash } from "node:crypto";
4
+ import { config } from "./config.js";
5
+ const memoryCache = new Map();
6
+ function getFromMemoryCache(url) {
7
+ const entry = memoryCache.get(url);
8
+ if (!entry)
9
+ return null;
10
+ if (Date.now() - entry.timestamp > config.pageCacheMemoryTtlMs) {
11
+ memoryCache.delete(url);
12
+ return null;
13
+ }
14
+ // LRU: move to end
15
+ memoryCache.delete(url);
16
+ memoryCache.set(url, entry);
17
+ return entry.content;
18
+ }
19
+ function setMemoryCache(url, content) {
20
+ if (memoryCache.size >= config.pageCacheMemoryMax) {
21
+ const oldest = memoryCache.keys().next().value;
22
+ if (oldest !== undefined)
23
+ memoryCache.delete(oldest);
24
+ }
25
+ memoryCache.set(url, { content, timestamp: Date.now() });
26
+ }
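The memory cache above relies on `Map` preserving insertion order: deleting and re-inserting a key on read moves it to the end, so the first key is always the least recently used. A minimal standalone sketch of that idea follows; the class name `TinyLru` is an assumption for illustration, and TTL handling is left out.

```javascript
// Minimal LRU built on Map insertion order (illustrative class name).
class TinyLru {
  constructor(max) {
    this.max = max;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so the key becomes the most recently used entry.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    // Drop any existing copy first so an update does not evict another key.
    this.map.delete(key);
    if (this.map.size >= this.max) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
    this.map.set(key, value);
  }
}

const cache = new TinyLru(2);
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");    // touch "a" so "b" becomes the oldest entry
cache.set("c", 3); // capacity reached: evicts "b"
const keysAfter = [...cache.map.keys()];
console.log(keysAfter);
```

After the last call the cache holds `a` and `c`; `b` was evicted because reading `a` refreshed its position.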
27
+ export function clearMemoryCache() {
28
+ memoryCache.clear();
29
+ }
30
+ // --- Persistent disk page cache ---
31
+ function urlToHash(url) {
32
+ return createHash("sha256").update(url).digest("hex").slice(0, 16);
33
+ }
34
+ async function getFromDiskCache(url) {
35
+ try {
36
+ const filePath = join(config.pageCacheDir, `${urlToHash(url)}.md`);
37
+ const info = await stat(filePath);
38
+ if (Date.now() - info.mtimeMs > config.pageCacheDiskTtlMs)
39
+ return null;
40
+ return await readFile(filePath, "utf-8");
41
+ }
42
+ catch {
43
+ return null;
44
+ }
45
+ }
46
+ async function setDiskCache(url, content) {
47
+ try {
48
+ await mkdir(config.pageCacheDir, { recursive: true });
49
+ await writeFile(join(config.pageCacheDir, `${urlToHash(url)}.md`), content, "utf-8");
50
+ }
51
+ catch {
52
+ // non-critical
53
+ }
54
+ }
55
+ // --- AEM live markdown URL builder ---
56
+ //
57
+ // AEM live serves every page as clean markdown at <path>.md.
58
+ // E.g. https://www.aem.live/docs/faq → https://www.aem.live/docs/faq.md
59
+ export function buildMarkdownUrl(pageUrl) {
60
+ try {
61
+ const u = new URL(pageUrl);
62
+ // Strip trailing slash then append .md
63
+ u.pathname = u.pathname.replace(/\/+$/, "") + ".md";
64
+ return u.toString();
65
+ }
66
+ catch {
67
+ return pageUrl + ".md";
68
+ }
69
+ }
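The mapping rule in `buildMarkdownUrl` can be tried in isolation. The sketch below is a standalone copy under the illustrative name `toMarkdownUrl`; the sample URLs, including the `?ref=nav` query string, are assumptions chosen to show how trailing slashes, query strings, and unparsable input are handled.

```javascript
// Standalone copy of the URL rule for experimentation (name is illustrative).
function toMarkdownUrl(pageUrl) {
  try {
    const u = new URL(pageUrl);
    // Strip trailing slashes from the path, then append the .md suffix.
    u.pathname = u.pathname.replace(/\/+$/, "") + ".md";
    return u.toString();
  } catch {
    // Not a parsable absolute URL: fall back to plain string concatenation.
    return pageUrl + ".md";
  }
}

const samples = [
  "https://www.aem.live/docs/faq",
  "https://www.aem.live/docs/faq/",
  "https://www.aem.live/developer/tutorial?ref=nav",
  "not a url",
];
const mapped = samples.map(toMarkdownUrl);
console.log(mapped);
```

Because the suffix is applied to `pathname` only, a query string stays after the `.md` extension instead of being swallowed into it.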
70
+ // --- Fetch helpers ---
71
+ const FETCH_HEADERS = { "User-Agent": config.userAgent };
72
+ async function fetchMarkdown(pageUrl) {
73
+ const mdUrl = buildMarkdownUrl(pageUrl);
74
+ try {
75
+ const res = await fetch(mdUrl, {
76
+ headers: { ...FETCH_HEADERS, Accept: "text/plain, text/markdown" },
77
+ redirect: "follow",
78
+ });
79
+ if (!res.ok)
80
+ return null;
81
+ const text = await res.text();
82
+ return text.length > 0 ? text : null;
83
+ }
84
+ catch {
85
+ return null;
86
+ }
87
+ }
88
+ async function fetchAndParseHtml(url) {
89
+ const res = await fetch(url, {
90
+ headers: { ...FETCH_HEADERS, Accept: "text/html" },
91
+ });
92
+ if (!res.ok) {
93
+ throw new Error(`Failed to fetch page: ${res.status} ${res.statusText}`);
94
+ }
95
+ return extractMainContent(await res.text());
96
+ }
97
+ // --- HTML → Markdown fallback ---
98
+ function extractMainContent(html) {
99
+ let content = html;
100
+ const mainMatch = content.match(/<main[^>]*>([\s\S]*?)<\/main>/i) ??
101
+ content.match(/<article[^>]*>([\s\S]*?)<\/article>/i) ??
102
+ content.match(/<div[^>]*class="[^"]*content[^"]*"[^>]*>([\s\S]*?)<\/div>/i);
103
+ if (mainMatch)
104
+ content = mainMatch[1];
105
+ return content
106
+ .replace(/<script[^>]*>[\s\S]*?<\/script>/gi, "")
107
+ .replace(/<style[^>]*>[\s\S]*?<\/style>/gi, "")
108
+ .replace(/<nav[^>]*>[\s\S]*?<\/nav>/gi, "")
109
+ .replace(/<footer[^>]*>[\s\S]*?<\/footer>/gi, "")
110
+ .replace(/<header[^>]*>[\s\S]*?<\/header>/gi, "")
111
+ .replace(/<h1[^>]*>([\s\S]*?)<\/h1>/gi, "\n# $1\n")
112
+ .replace(/<h2[^>]*>([\s\S]*?)<\/h2>/gi, "\n## $1\n")
113
+ .replace(/<h3[^>]*>([\s\S]*?)<\/h3>/gi, "\n### $1\n")
114
+ .replace(/<h4[^>]*>([\s\S]*?)<\/h4>/gi, "\n#### $1\n")
115
+ .replace(/<pre[^>]*><code[^>]*>([\s\S]*?)<\/code><\/pre>/gi, "\n```\n$1\n```\n")
116
+ .replace(/<code[^>]*>([\s\S]*?)<\/code>/gi, "`$1`")
117
+ .replace(/<a[^>]*href="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi, "[$2]($1)")
118
+ .replace(/<li[^>]*>([\s\S]*?)<\/li>/gi, "- $1\n")
119
+ .replace(/<\/?[uo]l[^>]*>/gi, "\n")
120
+ .replace(/<p[^>]*>([\s\S]*?)<\/p>/gi, "\n$1\n")
121
+ .replace(/<br\s*\/?>/gi, "\n")
122
+ .replace(/<\/div>/gi, "\n")
123
+ .replace(/<[^>]+>/g, "")
124
+ .replace(/&lt;/g, "<")
125
+ .replace(/&gt;/g, ">")
126
+ .replace(/&quot;/g, '"')
127
+ .replace(/&#39;/g, "'")
128
+ .replace(/&nbsp;/g, " ")
129
+ .replace(/&amp;/g, "&") // decoded last so "&amp;lt;" yields "&lt;", not "<"
130
+ .replace(/\n{3,}/g, "\n\n")
131
+ .trim();
132
+ }
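Entity decoding in a replace chain like this is order-sensitive: if `&amp;` is decoded before `&lt;`, text that was escaped twice gets unescaped twice. A minimal sketch of a decoder that handles `&amp;` last is shown below; the helper name `decodeBasicEntities` and the sample string are assumptions for illustration, and only the six entities used above are covered.

```javascript
// Decode a small set of HTML entities, handling &amp; last (illustrative helper).
function decodeBasicEntities(text) {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
}

// The first <main> is double-escaped (documentation about markup),
// the second is escaped once (actual markup text).
const once = decodeBasicEntities("Use &amp;lt;main&amp;gt; for &lt;main&gt;");
console.log(once);
```

With this ordering the double-escaped fragment is unescaped exactly one level, which matters for documentation pages that show HTML entities as literal text.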
133
+ // --- AEM markdown cleanup ---
134
+ //
135
+ // AEM live markdown files use a table-based block syntax (not MDX) and include
136
+ // metadata rows. We normalise the content to clean, readable markdown.
137
+ export function cleanAemMarkdown(raw) {
138
+ let out = raw;
139
+ // Strip YAML-style frontmatter if present (some pages have it)
140
+ const frontmatterMatch = out.match(/^---\s*\r?\n[\s\S]*?\r?\n---\s*\r?\n/);
141
+ if (frontmatterMatch) {
142
+ out = out.slice(frontmatterMatch[0].length);
143
+ }
144
+ // AEM uses pipe-delimited Markdown tables for blocks. Keep them as-is since
145
+ // they are valid Markdown — no transformation needed.
146
+ // Collapse runs of 3+ blank lines down to 2
147
+ out = out.replace(/\n{3,}/g, "\n\n");
148
+ return out.trim();
149
+ }
150
+ // --- Smart truncation (heading-boundary aware) ---
151
+ export function smartTruncate(content, maxLen) {
152
+ if (content.length <= maxLen)
153
+ return content;
154
+ const lines = content.split("\n");
155
+ let charCount = 0;
156
+ let lastHeadingIdx = -1;
157
+ let lastBlankIdx = -1;
158
+ for (let i = 0; i < lines.length; i++) {
159
+ charCount += lines[i].length + 1;
160
+ if (charCount > maxLen)
161
+ break;
162
+ if (/^#{1,6}\s/.test(lines[i]))
163
+ lastHeadingIdx = i;
164
+ if (lines[i].trim() === "")
165
+ lastBlankIdx = i;
166
+ }
167
+ const threshold = lines.length * 0.3;
168
+ const cutIdx = lastHeadingIdx > threshold
169
+ ? lastHeadingIdx
170
+ : lastBlankIdx > threshold
171
+ ? lastBlankIdx
172
+ : -1;
173
+ if (cutIdx > 0) {
174
+ const remaining = lines.length - cutIdx;
175
+ return (lines.slice(0, cutIdx).join("\n").trim() +
176
+ `\n\n... [truncated — ${remaining} more lines]`);
177
+ }
178
+ return content.substring(0, maxLen) + "\n\n... [content truncated]";
179
+ }
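The cut-point selection in `smartTruncate` is easier to follow on a small input. The sketch below isolates that step in a hypothetical helper, `pickCutIndex`, and assumes the content is already longer than `maxLen`; the sample document is invented for illustration.

```javascript
// Illustrative helper: returns the line index at which the text would be cut,
// or -1 when no heading or blank line lies past the 30% mark.
function pickCutIndex(content, maxLen) {
  const lines = content.split("\n");
  let charCount = 0;
  let lastHeadingIdx = -1;
  let lastBlankIdx = -1;
  for (let i = 0; i < lines.length; i++) {
    charCount += lines[i].length + 1; // +1 for the newline
    if (charCount > maxLen) break;
    if (/^#{1,6}\s/.test(lines[i])) lastHeadingIdx = i;
    if (lines[i].trim() === "") lastBlankIdx = i;
  }
  const threshold = lines.length * 0.3;
  if (lastHeadingIdx > threshold) return lastHeadingIdx;
  if (lastBlankIdx > threshold) return lastBlankIdx;
  return -1;
}

const doc = [
  "# Title",
  "",
  "intro text",
  "",
  "## Section A",
  "aaaa aaaa aaaa",
  "",
  "## Section B",
  "bbbb bbbb bbbb",
].join("\n");

const cutAt = pickCutIndex(doc, 60);
// Lines before the cut index are kept, so the heading at cutAt is dropped too.
const kept = doc.split("\n").slice(0, cutAt).join("\n").trim();
console.log(cutAt, JSON.stringify(kept));
```

With a budget of 60 characters the last heading that fits is `## Section A` at index 4, so the kept text ends with `intro text`. The chosen heading itself is excluded, which means a section is never left with a title and no body.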
180
+ // --- Fetch orchestration ---
181
+ async function fetchAndClean(url) {
182
+ const md = await fetchMarkdown(url);
183
+ if (md)
184
+ return cleanAemMarkdown(md);
185
+ // Fallback to HTML parsing
186
+ return fetchAndParseHtml(url);
187
+ }
188
+ export async function fetchPageContent(url) {
189
+ const memoryCached = getFromMemoryCache(url);
190
+ if (memoryCached)
191
+ return memoryCached;
192
+ const diskCached = await getFromDiskCache(url);
193
+ if (diskCached) {
194
+ const truncated = smartTruncate(diskCached, config.maxContentLength);
195
+ const result = `Source: ${url}\n\n${truncated}`;
196
+ setMemoryCache(url, result);
197
+ return result;
198
+ }
199
+ const rawContent = await fetchAndClean(url);
200
+ await setDiskCache(url, rawContent);
201
+ const truncated = smartTruncate(rawContent, config.maxContentLength);
202
+ const result = `Source: ${url}\n\n${truncated}`;
203
+ setMemoryCache(url, result);
204
+ return result;
205
+ }
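The lookup order in `fetchPageContent` (memory, then disk, then network) can be modelled without any I/O. The sketch below is a simplified, synchronous model: `makeLayeredGetter`, `fetchRemote`, and `decorate` are hypothetical names, plain `Map` objects stand in for both cache layers, and TTL checks are omitted.

```javascript
// Simplified, synchronous model of a memory -> disk -> network lookup.
// `memory` and `disk` are plain Maps; `fetchRemote` and `decorate` are
// hypothetical callbacks standing in for the real I/O and formatting.
function makeLayeredGetter({ memory, disk, fetchRemote, decorate }) {
  return function get(url) {
    if (memory.has(url)) return memory.get(url); // fastest layer
    let raw = disk.get(url);
    if (raw === undefined) {
      raw = fetchRemote(url); // slowest layer
      disk.set(url, raw);     // persist the full text
    }
    const result = decorate(url, raw); // e.g. add a source prefix and truncate
    memory.set(url, result);           // cache the final form
    return result;
  };
}

let remoteCalls = 0;
const memory = new Map();
const disk = new Map();
const get = makeLayeredGetter({
  memory,
  disk,
  fetchRemote: (url) => {
    remoteCalls += 1;
    return "# Page body for " + url;
  },
  decorate: (url, raw) => "Source: " + url + "\n\n" + raw.slice(0, 20),
});

const first = get("https://www.aem.live/docs/faq");
const second = get("https://www.aem.live/docs/faq"); // served from memory
memory.clear(); // simulate a restart: the in-process layer is gone
const third = get("https://www.aem.live/docs/faq"); // served from disk
console.log(remoteCalls, first === second, first === third);
```

The model keeps the same split as the code above: the disk layer stores the full cleaned text, while the memory layer stores the already prefixed and truncated result, so a later change to the length limit only affects entries rebuilt from disk or network.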
206
+ export async function fetchRawContent(url) {
207
+ const diskCached = await getFromDiskCache(url);
208
+ if (diskCached)
209
+ return diskCached;
210
+ const rawContent = await fetchAndClean(url);
211
+ await setDiskCache(url, rawContent);
212
+ return rawContent;
213
+ }
214
+ // --- Content extraction helpers ---
215
+ export function extractCodeExamples(markdown) {
216
+ const examples = [];
217
+ const fenced = /```(\w*)\n([\s\S]*?)```/g;
218
+ let m;
219
+ while ((m = fenced.exec(markdown)) !== null) {
220
+ const code = m[2].trim();
221
+ if (code.length > 0) {
222
+ examples.push({ language: m[1] || "text", code });
223
+ }
224
+ }
225
+ return examples;
226
+ }
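A short usage sketch for the fenced-block extraction follows. It re-creates the same pattern under the illustrative name `listCodeBlocks`; the fence marker is assembled at runtime only so that the example can itself be shown inside a fenced block, and the sample page is invented.

```javascript
// The fence marker is built at runtime so this example can itself sit
// inside a fenced block.
const FENCE = "`".repeat(3);

// Illustrative re-creation of the extraction pattern shown above.
function listCodeBlocks(markdown) {
  const fenced = new RegExp(FENCE + "(\\w*)\\n([\\s\\S]*?)" + FENCE, "g");
  const examples = [];
  let m;
  while ((m = fenced.exec(markdown)) !== null) {
    const code = m[2].trim();
    if (code.length > 0) {
      // Blocks without a language tag are reported as "text".
      examples.push({ language: m[1] || "text", code });
    }
  }
  return examples;
}

const page = [
  "Intro paragraph.",
  FENCE + "js",
  "const a = 1;",
  FENCE,
  "",
  "More prose.",
  "",
  FENCE,
  "plain block",
  FENCE,
].join("\n");

const blocks = listCodeBlocks(page);
console.log(blocks);
```

Note that the language tag is matched with `\w*`, so only tags made of letters, digits, and underscores are recognised by this pattern.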
227
+ export function extractPageToc(markdown) {
228
+ const entries = [];
229
+ const heading = /^(#{1,6})\s+(.+)$/gm;
230
+ let m;
231
+ while ((m = heading.exec(markdown)) !== null) {
232
+ entries.push({
233
+ level: m[1].length,
234
+ title: m[2].trim().replace(/[`*_]/g, ""),
235
+ });
236
+ }
237
+ return entries;
238
+ }
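The heading scan can be exercised the same way. The sketch below copies the pattern into a hypothetical `listHeadings` function and runs it on an invented sample, mainly to show what the title clean-up does to inline markup.

```javascript
// Illustrative re-creation of the heading scan shown above.
function listHeadings(markdown) {
  const heading = /^(#{1,6})\s+(.+)$/gm;
  const entries = [];
  let m;
  while ((m = heading.exec(markdown)) !== null) {
    entries.push({
      level: m[1].length,
      // Backticks, asterisks, and underscores are removed from the title.
      title: m[2].trim().replace(/[`*_]/g, ""),
    });
  }
  return entries;
}

const sample = [
  "# AEM Live Docs MCP",
  "",
  "## *Quick* Setup",
  "Body text with a #hash that is not a heading.",
  "### search_aem_docs",
].join("\n");

const toc = listHeadings(sample);
console.log(toc);
```

Emphasis markers disappear as intended, but underscores inside identifiers are stripped as well, so a heading such as `search_aem_docs` is reported as `searchaemdocs`. The pattern also works line by line without tracking fences, so a `#` comment line inside a fenced shell block would be reported as a heading.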
239
+ //# sourceMappingURL=content.js.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"content.js","sourceRoot":"","sources":["../src/content.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,QAAQ,EAAE,SAAS,EAAE,KAAK,EAAE,IAAI,EAAE,MAAM,kBAAkB,CAAC;AACpE,OAAO,EAAE,IAAI,EAAE,MAAM,WAAW,CAAC;AACjC,OAAO,EAAE,UAAU,EAAE,MAAM,aAAa,CAAC;AACzC,OAAO,EAAE,MAAM,EAAE,MAAM,aAAa,CAAC;AAgBrC,MAAM,WAAW,GAAG,IAAI,GAAG,EAAyB,CAAC;AAErD,SAAS,kBAAkB,CAAC,GAAW;IACrC,MAAM,KAAK,GAAG,WAAW,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC;IACnC,IAAI,CAAC,KAAK;QAAE,OAAO,IAAI,CAAC;IACxB,IAAI,IAAI,CAAC,GAAG,EAAE,GAAG,KAAK,CAAC,SAAS,GAAG,MAAM,CAAC,oBAAoB,EAAE,CAAC;QAC/D,WAAW,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC;QACxB,OAAO,IAAI,CAAC;IACd,CAAC;IACD,mBAAmB;IACnB,WAAW,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC;IACxB,WAAW,CAAC,GAAG,CAAC,GAAG,EAAE,KAAK,CAAC,CAAC;IAC5B,OAAO,KAAK,CAAC,OAAO,CAAC;AACvB,CAAC;AAED,SAAS,cAAc,CAAC,GAAW,EAAE,OAAe;IAClD,IAAI,WAAW,CAAC,IAAI,IAAI,MAAM,CAAC,kBAAkB,EAAE,CAAC;QAClD,MAAM,MAAM,GAAG,WAAW,CAAC,IAAI,EAAE,CAAC,IAAI,EAAE,CAAC,KAAK,CAAC;QAC/C,IAAI,MAAM,KAAK,SAAS;YAAE,WAAW,CAAC,MAAM,CAAC,MAAM,CAAC,CAAC;IACvD,CAAC;IACD,WAAW,CAAC,GAAG,CAAC,GAAG,EAAE,EAAE,OAAO,EAAE,SAAS,EAAE,IAAI,CAAC,GAAG,EAAE,EAAE,CAAC,CAAC;AAC3D,CAAC;AAED,MAAM,UAAU,gBAAgB;IAC9B,WAAW,CAAC,KAAK,EAAE,CAAC;AACtB,CAAC;AAED,qCAAqC;AAErC,SAAS,SAAS,CAAC,GAAW;IAC5B,OAAO,UAAU,CAAC,QAAQ,CAAC,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC,MAAM,CAAC,KAAK,CAAC,CAAC,KAAK,CAAC,CAAC,EAAE,EAAE,CAAC,CAAC;AACrE,CAAC;AAED,KAAK,UAAU,gBAAgB,CAAC,GAAW;IACzC,IAAI,CAAC;QACH,MAAM,QAAQ,GAAG,IAAI,CAAC,MAAM,CAAC,YAAY,EAAE,GAAG,SAAS,CAAC,GAAG,CAAC,KAAK,CAAC,CAAC;QACnE,MAAM,IAAI,GAAG,MAAM,IAAI,CAAC,QAAQ,CAAC,CAAC;QAClC,IAAI,IAAI,CAAC,GAAG,EAAE,GAAG,IAAI,CAAC,OAAO,GAAG,MAAM,CAAC,kBAAkB;YAAE,OAAO,IAAI,CAAC;QACvE,OAAO,MAAM,QAAQ,CAAC,QAAQ,EAAE,OAAO,CAAC,CAAC;IAC3C,CAAC;IAAC,MAAM,CAAC;QACP,OAAO,IAAI,CAAC;IACd,CAAC;AACH,CAAC;AAED,KAAK,UAAU,YAAY,CAAC,GAAW,EAAE,OAAe;IACtD,IAAI,CAAC;QACH,MAAM,KAAK,CAAC,MAAM,CAAC,YAAY,EAAE,EAAE,SAAS,EAAE,IAAI,EAAE,CAAC,CAAC;QACtD,MAAM,SAAS,CACb,IAAI,CAAC,MAAM,CAAC,YAAY,EAAE,GAAG,SAAS,CAAC,GAAG,CAAC,KAAK,CAAC,EACjD,OAAO,EACP,OAAO,CACR,CAAC;IA
CJ,CAAC;IAAC,MAAM,CAAC;QACP,eAAe;IACjB,CAAC;AACH,CAAC;AAED,wCAAwC;AACxC,EAAE;AACF,6DAA6D;AAC7D,0EAA0E;AAE1E,MAAM,UAAU,gBAAgB,CAAC,OAAe;IAC9C,IAAI,CAAC;QACH,MAAM,CAAC,GAAG,IAAI,GAAG,CAAC,OAAO,CAAC,CAAC;QAC3B,uCAAuC;QACvC,CAAC,CAAC,QAAQ,GAAG,CAAC,CAAC,QAAQ,CAAC,OAAO,CAAC,MAAM,EAAE,EAAE,CAAC,GAAG,KAAK,CAAC;QACpD,OAAO,CAAC,CAAC,QAAQ,EAAE,CAAC;IACtB,CAAC;IAAC,MAAM,CAAC;QACP,OAAO,OAAO,GAAG,KAAK,CAAC;IACzB,CAAC;AACH,CAAC;AAED,wBAAwB;AAExB,MAAM,aAAa,GAAG,EAAE,YAAY,EAAE,MAAM,CAAC,SAAS,EAAE,CAAC;AAEzD,KAAK,UAAU,aAAa,CAAC,OAAe;IAC1C,MAAM,KAAK,GAAG,gBAAgB,CAAC,OAAO,CAAC,CAAC;IACxC,IAAI,CAAC;QACH,MAAM,GAAG,GAAG,MAAM,KAAK,CAAC,KAAK,EAAE;YAC7B,OAAO,EAAE,EAAE,GAAG,aAAa,EAAE,MAAM,EAAE,2BAA2B,EAAE;YAClE,QAAQ,EAAE,QAAQ;SACnB,CAAC,CAAC;QACH,IAAI,CAAC,GAAG,CAAC,EAAE;YAAE,OAAO,IAAI,CAAC;QACzB,MAAM,IAAI,GAAG,MAAM,GAAG,CAAC,IAAI,EAAE,CAAC;QAC9B,OAAO,IAAI,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC,IAAI,CAAC;IACvC,CAAC;IAAC,MAAM,CAAC;QACP,OAAO,IAAI,CAAC;IACd,CAAC;AACH,CAAC;AAED,KAAK,UAAU,iBAAiB,CAAC,GAAW;IAC1C,MAAM,GAAG,GAAG,MAAM,KAAK,CAAC,GAAG,EAAE;QAC3B,OAAO,EAAE,EAAE,GAAG,aAAa,EAAE,MAAM,EAAE,WAAW,EAAE;KACnD,CAAC,CAAC;IACH,IAAI,CAAC,GAAG,CAAC,EAAE,EAAE,CAAC;QACZ,MAAM,IAAI,KAAK,CAAC,yBAAyB,GAAG,CAAC,MAAM,IAAI,GAAG,CAAC,UAAU,EAAE,CAAC,CAAC;IAC3E,CAAC;IACD,OAAO,kBAAkB,CAAC,MAAM,GAAG,CAAC,IAAI,EAAE,CAAC,CAAC;AAC9C,CAAC;AAED,mCAAmC;AAEnC,SAAS,kBAAkB,CAAC,IAAY;IACtC,IAAI,OAAO,GAAG,IAAI,CAAC;IACnB,MAAM,SAAS,GACb,OAAO,CAAC,KAAK,CAAC,gCAAgC,CAAC;QAC/C,OAAO,CAAC,KAAK,CAAC,sCAAsC,CAAC;QACrD,OAAO,CAAC,KAAK,CAAC,4DAA4D,CAAC,CAAC;IAE9E,IAAI,SAAS;QAAE,OAAO,GAAG,SAAS,CAAC,CAAC,CAAC,CAAC;IAEtC,OAAO,OAAO;SACX,OAAO,CAAC,mCAAmC,EAAE,EAAE,CAAC;SAChD,OAAO,CAAC,iCAAiC,EAAE,EAAE,CAAC;SAC9C,OAAO,CAAC,6BAA6B,EAAE,EAAE,CAAC;SAC1C,OAAO,CAAC,mCAAmC,EAAE,EAAE,CAAC;SAChD,OAAO,CAAC,mCAAmC,EAAE,EAAE,CAAC;SAChD,OAAO,CAAC,6BAA6B,EAAE,UAAU,CAAC;SAClD,OAAO,CAAC,6BAA6B,EAAE,WAAW,CAAC;SACnD,OAAO,CAAC,6BAA6B,EAAE,YAAY,CAAC;SACpD,OAAO,CAAC,6BAA6B,EAAE,aAAa,CAAC;SACrD,OAAO,CAAC,kDAAkD,EAAE,kBAAkB,CAAC;SAC/E,OAAO,CAAC,i
CAAiC,EAAE,MAAM,CAAC;SAClD,OAAO,CAAC,8CAA8C,EAAE,UAAU,CAAC;SACnE,OAAO,CAAC,6BAA6B,EAAE,QAAQ,CAAC;SAChD,OAAO,CAAC,mBAAmB,EAAE,IAAI,CAAC;SAClC,OAAO,CAAC,2BAA2B,EAAE,QAAQ,CAAC;SAC9C,OAAO,CAAC,cAAc,EAAE,IAAI,CAAC;SAC7B,OAAO,CAAC,WAAW,EAAE,IAAI,CAAC;SAC1B,OAAO,CAAC,UAAU,EAAE,EAAE,CAAC;SACvB,OAAO,CAAC,QAAQ,EAAE,GAAG,CAAC;SACtB,OAAO,CAAC,OAAO,EAAE,GAAG,CAAC;SACrB,OAAO,CAAC,OAAO,EAAE,GAAG,CAAC;SACrB,OAAO,CAAC,SAAS,EAAE,GAAG,CAAC;SACvB,OAAO,CAAC,QAAQ,EAAE,GAAG,CAAC;SACtB,OAAO,CAAC,SAAS,EAAE,GAAG,CAAC;SACvB,OAAO,CAAC,SAAS,EAAE,MAAM,CAAC;SAC1B,IAAI,EAAE,CAAC;AACZ,CAAC;AAED,+BAA+B;AAC/B,EAAE;AACF,+EAA+E;AAC/E,uEAAuE;AAEvE,MAAM,UAAU,gBAAgB,CAAC,GAAW;IAC1C,IAAI,GAAG,GAAG,GAAG,CAAC;IAEd,+DAA+D;IAC/D,MAAM,gBAAgB,GAAG,GAAG,CAAC,KAAK,CAAC,sCAAsC,CAAC,CAAC;IAC3E,IAAI,gBAAgB,EAAE,CAAC;QACrB,GAAG,GAAG,GAAG,CAAC,KAAK,CAAC,gBAAgB,CAAC,CAAC,CAAC,CAAC,MAAM,CAAC,CAAC;IAC9C,CAAC;IAED,4EAA4E;IAC5E,sDAAsD;IAEtD,4CAA4C;IAC5C,GAAG,GAAG,GAAG,CAAC,OAAO,CAAC,SAAS,EAAE,MAAM,CAAC,CAAC;IAErC,OAAO,GAAG,CAAC,IAAI,EAAE,CAAC;AACpB,CAAC;AAED,oDAAoD;AAEpD,MAAM,UAAU,aAAa,CAAC,OAAe,EAAE,MAAc;IAC3D,IAAI,OAAO,CAAC,MAAM,IAAI,MAAM;QAAE,OAAO,OAAO,CAAC;IAE7C,MAAM,KAAK,GAAG,OAAO,CAAC,KAAK,CAAC,IAAI,CAAC,CAAC;IAClC,IAAI,SAAS,GAAG,CAAC,CAAC;IAClB,IAAI,cAAc,GAAG,CAAC,CAAC,CAAC;IACxB,IAAI,YAAY,GAAG,CAAC,CAAC,CAAC;IAEtB,KAAK,IAAI,CAAC,GAAG,CAAC,EAAE,CAAC,GAAG,KAAK,CAAC,MAAM,EAAE,CAAC,EAAE,EAAE,CAAC;QACtC,SAAS,IAAI,KAAK,CAAC,CAAC,CAAC,CAAC,MAAM,GAAG,CAAC,CAAC;QACjC,IAAI,SAAS,GAAG,MAAM;YAAE,MAAM;QAC9B,IAAI,WAAW,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;YAAE,cAAc,GAAG,CAAC,CAAC;QACnD,IAAI,KAAK,CAAC,CAAC,CAAC,CAAC,IAAI,EAAE,KAAK,EAAE;YAAE,YAAY,GAAG,CAAC,CAAC;IAC/C,CAAC;IAED,MAAM,SAAS,GAAG,KAAK,CAAC,MAAM,GAAG,GAAG,CAAC;IACrC,MAAM,MAAM,GACV,cAAc,GAAG,SAAS;QACxB,CAAC,CAAC,cAAc;QAChB,CAAC,CAAC,YAAY,GAAG,SAAS;YACxB,CAAC,CAAC,YAAY;YACd,CAAC,CAAC,CAAC,CAAC,CAAC;IAEX,IAAI,MAAM,GAAG,CAAC,EAAE,CAAC;QACf,MAAM,SAAS,GAAG,KAAK,CAAC,MAAM,GAAG,MAAM,CAAC;QACxC,OAAO,CACL,KAAK,CAAC,KAAK,CAAC,CAAC,EAAE,MAAM,CAAC,CAAC,IAAI,CAAC,IAAI,CAAC,CAAC,IAAI,
EAAE;YACxC,wBAAwB,SAAS,cAAc,CAChD,CAAC;IACJ,CAAC;IAED,OAAO,OAAO,CAAC,SAAS,CAAC,CAAC,EAAE,MAAM,CAAC,GAAG,6BAA6B,CAAC;AACtE,CAAC;AAED,8BAA8B;AAE9B,KAAK,UAAU,aAAa,CAAC,GAAW;IACtC,MAAM,EAAE,GAAG,MAAM,aAAa,CAAC,GAAG,CAAC,CAAC;IACpC,IAAI,EAAE;QAAE,OAAO,gBAAgB,CAAC,EAAE,CAAC,CAAC;IACpC,2BAA2B;IAC3B,OAAO,iBAAiB,CAAC,GAAG,CAAC,CAAC;AAChC,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,gBAAgB,CAAC,GAAW;IAChD,MAAM,YAAY,GAAG,kBAAkB,CAAC,GAAG,CAAC,CAAC;IAC7C,IAAI,YAAY;QAAE,OAAO,YAAY,CAAC;IAEtC,MAAM,UAAU,GAAG,MAAM,gBAAgB,CAAC,GAAG,CAAC,CAAC;IAC/C,IAAI,UAAU,EAAE,CAAC;QACf,MAAM,SAAS,GAAG,aAAa,CAAC,UAAU,EAAE,MAAM,CAAC,gBAAgB,CAAC,CAAC;QACrE,MAAM,MAAM,GAAG,WAAW,GAAG,OAAO,SAAS,EAAE,CAAC;QAChD,cAAc,CAAC,GAAG,EAAE,MAAM,CAAC,CAAC;QAC5B,OAAO,MAAM,CAAC;IAChB,CAAC;IAED,MAAM,UAAU,GAAG,MAAM,aAAa,CAAC,GAAG,CAAC,CAAC;IAC5C,MAAM,YAAY,CAAC,GAAG,EAAE,UAAU,CAAC,CAAC;IAEpC,MAAM,SAAS,GAAG,aAAa,CAAC,UAAU,EAAE,MAAM,CAAC,gBAAgB,CAAC,CAAC;IACrE,MAAM,MAAM,GAAG,WAAW,GAAG,OAAO,SAAS,EAAE,CAAC;IAChD,cAAc,CAAC,GAAG,EAAE,MAAM,CAAC,CAAC;IAC5B,OAAO,MAAM,CAAC;AAChB,CAAC;AAED,MAAM,CAAC,KAAK,UAAU,eAAe,CAAC,GAAW;IAC/C,MAAM,UAAU,GAAG,MAAM,gBAAgB,CAAC,GAAG,CAAC,CAAC;IAC/C,IAAI,UAAU;QAAE,OAAO,UAAU,CAAC;IAElC,MAAM,UAAU,GAAG,MAAM,aAAa,CAAC,GAAG,CAAC,CAAC;IAC5C,MAAM,YAAY,CAAC,GAAG,EAAE,UAAU,CAAC,CAAC;IACpC,OAAO,UAAU,CAAC;AACpB,CAAC;AAED,qCAAqC;AAErC,MAAM,UAAU,mBAAmB,CACjC,QAAgB;IAEhB,MAAM,QAAQ,GAAyC,EAAE,CAAC;IAC1D,MAAM,MAAM,GAAG,0BAA0B,CAAC;IAC1C,IAAI,CAAC,CAAC;IACN,OAAO,CAAC,CAAC,GAAG,MAAM,CAAC,IAAI,CAAC,QAAQ,CAAC,CAAC,KAAK,IAAI,EAAE,CAAC;QAC5C,MAAM,IAAI,GAAG,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC;QACzB,IAAI,IAAI,CAAC,MAAM,GAAG,CAAC,EAAE,CAAC;YACpB,QAAQ,CAAC,IAAI,CAAC,EAAE,QAAQ,EAAE,CAAC,CAAC,CAAC,CAAC,IAAI,MAAM,EAAE,IAAI,EAAE,CAAC,CAAC;QACpD,CAAC;IACH,CAAC;IACD,OAAO,QAAQ,CAAC;AAClB,CAAC;AAED,MAAM,UAAU,cAAc,CAAC,QAAgB;IAC7C,MAAM,OAAO,GAAe,EAAE,CAAC;IAC/B,MAAM,OAAO,GAAG,qBAAqB,CAAC;IACtC,IAAI,CAAC,CAAC;IACN,OAAO,CAAC,CAAC,GAAG,OAAO,CAAC,IAAI,CAAC,QAAQ,CAAC,CAAC,KAAK,IAAI,EAAE,CAAC;QAC7C,OAAO,CAAC,IAAI,CAAC;YACX,KAAK,EAAE,CAAC,CAAC,CAAC,CA
AC,CAAC,MAAM;YAClB,KAAK,EAAE,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,EAAE,CAAC,OAAO,CAAC,QAAQ,EAAE,EAAE,CAAC;SACzC,CAAC,CAAC;IACL,CAAC;IACD,OAAO,OAAO,CAAC;AACjB,CAAC"}
@@ -0,0 +1,3 @@
1
+ #!/usr/bin/env node
2
+ export {};
3
+ //# sourceMappingURL=index.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":""}