mdzilla 0.1.0 → 0.2.0

package/README.md CHANGED
@@ -23,11 +23,11 @@ Works best with [Docus](https://docus.dev)/[Undocs](https://undocs.pages.dev/) d
23
23
  ## Quick Start
24
24
 
25
25
  ```sh
26
- npx mdzilla <dir> # Browse local docs directory
27
- npx mdzilla <file.md> # Render a single markdown file
28
- npx mdzilla gh:owner/repo # Browse GitHub repo docs
29
- npx mdzilla npm:package-name # Browse npm package docs
30
- npx mdzilla https://example.com # Browse remote docs via HTTP
26
+ npx mdzilla <source> # Open docs in browser
27
+ npx mdzilla <source> <path> # Render a specific page
28
+ npx mdzilla <source> <query> # Search docs
29
+ npx mdzilla <file.md> # Render a single markdown file
30
+ npx mdzilla <source> --export <outdir> # Export docs to flat .md files
31
31
  ```
32
32
 
33
33
  ## Agent Skill
@@ -58,18 +58,18 @@ Flatten any docs source into plain `.md` files:
58
58
  npx mdzilla <source> --export <outdir>
59
59
  ```
60
60
 
61
- ### Single Page
61
+ ### Smart Resolve
62
62
 
63
- Print a specific page by path and exit:
63
+ The second positional argument is smart-resolved: if it matches a navigation path, the page is rendered; otherwise it's treated as a search query.
64
64
 
65
65
  ```sh
66
- npx mdzilla gh:nuxt/nuxt --page /getting-started/seo-meta
67
- npx mdzilla gh:nuxt/nuxt --plain --page /getting-started/seo-meta
66
+ npx mdzilla gh:unjs/h3 /guide/basics # Render a specific page
67
+ npx mdzilla gh:unjs/h3 router # Search for 'router'
68
68
  ```
69
69
 
70
- ### Headless Mode
70
+ ### Plain Mode
71
71
 
72
- Use `--plain` (or `--headless`) for non-interactive output — works like `cat` but for rendered markdown. Auto-enabled when piping output or when called by AI agents.
72
+ Use `--plain` for plain text output. Auto-enabled when piping output or when called by AI agents.
73
73
 
74
74
  ```sh
75
75
  npx mdzilla README.md --plain # Pretty-print a markdown file
@@ -77,69 +77,56 @@ npx mdzilla README.md | head # Auto-plain when piped (no TTY)
77
77
  npx mdzilla gh:unjs/h3 --plain # List all pages in plain text
78
78
  ```
79
79
 
80
- ### Keyboard Controls
81
-
82
- <details>
83
- <summary><strong>Browse mode</strong></summary>
84
-
85
- | Key | Action |
86
- | :-------------------- | :------------------- |
87
- | `↑` `↓` / `j` `k` | Navigate entries |
88
- | `Enter` / `Tab` / `→` | Focus content |
89
- | `Space` / `PgDn` | Page down |
90
- | `b` / `PgUp` | Page up |
91
- | `g` / `G` | Jump to first / last |
92
- | `/` | Search |
93
- | `t` | Toggle sidebar |
94
- | `q` | Quit |
80
+ ## Programmatic API
95
81
 
96
- </details>
82
+ ### Export Docs
97
83
 
98
- <details>
99
- <summary><strong>Content mode</strong></summary>
100
-
101
- | Key | Action |
102
- | :------------------ | :-------------------- |
103
- | `↑` `↓` / `j` `k` | Scroll |
104
- | `Space` / `PgDn` | Page down |
105
- | `b` / `PgUp` | Page up |
106
- | `g` / `G` | Jump to top / bottom |
107
- | `/` | Search in page |
108
- | `n` / `N` | Next / previous match |
109
- | `Tab` / `Shift+Tab` | Cycle links |
110
- | `Enter` | Open link |
111
- | `Backspace` / `Esc` | Back to nav |
112
- | `q` | Quit |
84
+ One-call export — resolves source, loads, and writes flat `.md` files:
113
85
 
114
- </details>
86
+ ```js
87
+ import { exportSource } from "mdzilla";
115
88
 
116
- <details>
117
- <summary><strong>Search mode</strong></summary>
89
+ await exportSource("./docs", "./dist/docs", {
90
+ title: "My Docs",
91
+ filter: (e) => !e.entry.path.startsWith("/blog"),
92
+ });
118
93
 
119
- | Key | Action |
120
- | :------ | :--------------- |
121
- | _Type_ | Filter results |
122
- | `↑` `↓` | Navigate results |
123
- | `Enter` | Confirm |
124
- | `Esc` | Cancel |
94
+ // Works with any source
95
+ await exportSource("gh:unjs/h3", "./dist/h3-docs");
96
+ await exportSource("npm:h3", "./dist/h3-docs", { plainText: true });
97
+ await exportSource("https://h3.unjs.io", "./dist/h3-docs");
98
+ ```
125
99
 
126
- </details>
100
+ ### Collection
127
101
 
128
- ## Programmatic API
102
+ `Collection` is the main class for working with documentation programmatically — browse the nav tree, read page content, search, and filter entries.
129
103
 
130
104
  ```js
131
- import { DocsManager, DocsSourceFS } from "mdzilla";
105
+ import { Collection, resolveSource } from "mdzilla";
132
106
 
133
- const docs = new DocsManager(new DocsSourceFS("./docs"));
107
+ const docs = new Collection(resolveSource("./docs"));
134
108
  await docs.load();
135
109
 
136
- // Browse the navigation tree
137
- console.log(docs.tree);
110
+ docs.tree; // NavEntry[] nested navigation tree
111
+ docs.flat; // FlatEntry[] — flattened list with depth info
112
+ docs.pages; // FlatEntry[] — only navigable pages (no directory stubs)
113
+
114
+ // Read page content
115
+ const page = docs.findByPath("/guide/installation");
116
+ const content = await docs.getContent(page);
138
117
 
139
- // Get page content
140
- const content = await docs.getContent(docs.flat[0]);
118
+ // Resolve a page flexibly (exact match, prefix stripping, direct fetch)
119
+ const { entry, raw } = await docs.resolvePage("/docs/guide/installation");
120
+
121
+ // Fuzzy search
122
+ const results = docs.filter("instal"); // sorted by match score
123
+
124
+ // Substring match (returns indices into docs.flat)
125
+ const indices = docs.matchIndices("getting started");
141
126
  ```
142
127
 
128
+ `resolveSource` auto-detects the source type from the input string (`gh:`, `npm:`, `https://`, or local path). You can also use specific source classes directly (`FSSource`, `GitSource`, `NpmSource`, `HTTPSource`).
129
+
143
130
  ## Development
144
131
 
145
132
  <details>
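
Review note: the Smart Resolve rule in the README hunk above (a second positional argument that matches a navigation path renders that page, anything else becomes a search query) can be illustrated with a minimal sketch. The `smartResolve` helper, its return shape, and the sample `pages` array are hypothetical and are not part of mdzilla; the slash normalization is an assumption, and the real CLI may match paths differently.

```js
// Hypothetical helper illustrating the README's "smart resolve" rule.
// This is not mdzilla's implementation.
function smartResolve(pages, arg) {
  // Assumption: treat "guide/basics", "/guide/basics" and "/guide/basics/" alike.
  const normalized = "/" + arg.replace(/^\/+|\/+$/g, "");
  const page = pages.find((p) => p.path === normalized);
  // A navigation-path match wins; otherwise fall back to a search query.
  if (page) return { kind: "page", path: page.path };
  return { kind: "search", query: arg };
}

// Sample navigation entries (made up for this sketch).
const pages = [{ path: "/guide/basics" }, { path: "/guide/router" }];

const a = smartResolve(pages, "/guide/basics");
const b = smartResolve(pages, "router");
console.log(a, b);
```

With this sample data, `/guide/basics` resolves to a page while `router` falls through to a search, mirroring the two README commands for `gh:unjs/h3`.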
@@ -1,10 +1,10 @@
1
+ import { parseMeta, renderToMarkdown, renderToText } from "md4x";
1
2
  import { mkdir, readFile, readdir, stat, writeFile } from "node:fs/promises";
2
3
  import { basename, dirname, extname, join } from "node:path";
3
- import { parseMeta, renderToMarkdown, renderToText } from "md4x";
4
4
  import { existsSync } from "node:fs";
5
5
  import { tmpdir } from "node:os";
6
- //#region src/docs/manager.ts
7
- var DocsManager = class {
6
+ //#region src/collection.ts
7
+ var Collection = class {
8
8
  source;
9
9
  tree = [];
10
10
  flat = [];
@@ -36,10 +36,51 @@ var DocsManager = class {
36
36
  invalidate(filePath) {
37
37
  this._contentCache.delete(filePath);
38
38
  }
39
- /** Fuzzy filter flat entries by query string. */
39
+ /** Fuzzy filter flat entries by query string (title and path only). */
40
40
  filter(query) {
41
41
  return fuzzyFilter(this.flat, query, ({ entry }) => [entry.title, entry.path]);
42
42
  }
43
+ /** Search flat entries by query string, including page contents. Yields scored results as found. */
44
+ async *search(query) {
45
+ if (!query) return;
46
+ const lower = query.toLowerCase();
47
+ const terms = lower.split(/\s+/).filter(Boolean);
48
+ const matchAll = (text) => terms.every((t) => text.includes(t));
49
+ const seen = /* @__PURE__ */ new Set();
50
+ for (const flat of this.flat) {
51
+ if (flat.entry.page === false) continue;
52
+ if (seen.has(flat.entry.path)) continue;
53
+ seen.add(flat.entry.path);
54
+ const titleLower = flat.entry.title.toLowerCase();
55
+ const titleMatch = matchAll(titleLower) || matchAll(flat.entry.path.toLowerCase());
56
+ const content = await this.getContent(flat);
57
+ const contentLower = content?.toLowerCase();
58
+ const contentHit = contentLower ? matchAll(contentLower) : false;
59
+ if (!titleMatch && !contentHit) continue;
60
+ let score = 300;
61
+ let heading;
62
+ if (titleMatch) score = titleLower === lower ? 0 : 100;
63
+ else if (content) {
64
+ const meta = parseMeta(content);
65
+ for (const h of meta.headings || []) {
66
+ const hLower = h.text.toLowerCase();
67
+ if (matchAll(hLower)) {
68
+ score = hLower === lower ? 150 : 200;
69
+ heading = h.text;
70
+ break;
71
+ }
72
+ }
73
+ }
74
+ const contentMatches = content ? findMatchLines(content, lower) : [];
75
+ yield {
76
+ flat,
77
+ score,
78
+ titleMatch,
79
+ heading,
80
+ contentMatches
81
+ };
82
+ }
83
+ }
43
84
  /** Flat entries that are navigable pages (excludes directory stubs). */
44
85
  get pages() {
45
86
  return this.flat.filter((f) => f.entry.page !== false);
@@ -79,23 +120,18 @@ var DocsManager = class {
79
120
  if (raw) return { raw };
80
121
  return {};
81
122
  }
82
- /** Return indices of matching flat entries (case-insensitive substring). */
83
- matchIndices(query) {
84
- if (!query) return [];
85
- const lower = query.toLowerCase();
86
- const matched = /* @__PURE__ */ new Set();
87
- for (let i = 0; i < this.flat.length; i++) {
88
- const { entry } = this.flat[i];
89
- if (entry.title.toLowerCase().includes(lower) || entry.path.toLowerCase().includes(lower)) {
90
- matched.add(i);
91
- const parentDepth = this.flat[i].depth;
92
- for (let j = i + 1; j < this.flat.length; j++) {
93
- if (this.flat[j].depth <= parentDepth) break;
94
- matched.add(j);
95
- }
96
- }
123
+ /** Suggest related pages for a query (fuzzy + keyword fallback). */
124
+ suggest(query, max = 5) {
125
+ let results = this.filter(query);
126
+ if (results.length > 0) return results.slice(0, max);
127
+ const segments = query.replace(/^\/+/, "").split("/").filter(Boolean);
128
+ const lastSegment = segments.at(-1);
129
+ if (lastSegment && lastSegment !== query) {
130
+ results = this.filter(lastSegment);
131
+ if (results.length > 0) return results.slice(0, max);
97
132
  }
98
- return [...matched].sort((a, b) => a - b);
133
+ const keywords = segments.flatMap((s) => s.split("-")).filter(Boolean);
134
+ return this.pages.filter((f) => keywords.some((kw) => f.entry.title.toLowerCase().includes(kw) || f.entry.path.toLowerCase().includes(kw))).slice(0, max);
99
135
  }
100
136
  };
101
137
  function flattenTree(entries, depth, fileMap) {
@@ -144,7 +180,7 @@ function fuzzyFilter(items, query, getText) {
144
180
  let best = Infinity;
145
181
  for (const text of getText(item)) {
146
182
  const s = fuzzyMatch(query, text);
147
- if (s >= 0 && s < best) best = s;
183
+ if (s !== -1 && s < best) best = s;
148
184
  }
149
185
  if (best < Infinity) scored.push({
150
186
  item,
@@ -154,11 +190,52 @@ function fuzzyFilter(items, query, getText) {
154
190
  scored.sort((a, b) => a.score - b.score);
155
191
  return scored.map((s) => s.item);
156
192
  }
193
+ function findMatchLines(content, lowerQuery, contextLines = 1) {
194
+ const matches = [];
195
+ const lines = content.split("\n");
196
+ for (let i = 0; i < lines.length; i++) if (lines[i].toLowerCase().includes(lowerQuery)) {
197
+ const context = [];
198
+ for (let j = Math.max(0, i - contextLines); j <= Math.min(lines.length - 1, i + contextLines); j++) if (j !== i) context.push(lines[j].trim());
199
+ matches.push({
200
+ line: i + 1,
201
+ text: lines[i].trim(),
202
+ context
203
+ });
204
+ }
205
+ return matches;
206
+ }
157
207
  //#endregion
158
- //#region src/docs/sources/_base.ts
159
- var DocsSource = class {};
208
+ //#region src/utils.ts
209
+ /** Extract short text snippets around matching terms. */
210
+ function extractSnippets(content, terms, opts = {}) {
211
+ const { maxSnippets = 3, radius = 80 } = opts;
212
+ const lower = content.toLowerCase();
213
+ const positions = [];
214
+ for (const term of terms) {
215
+ let idx = lower.indexOf(term);
216
+ while (idx !== -1 && positions.length < maxSnippets * 2) {
217
+ positions.push(idx);
218
+ idx = lower.indexOf(term, idx + term.length);
219
+ }
220
+ }
221
+ positions.sort((a, b) => a - b);
222
+ const snippets = [];
223
+ let prevEnd = -1;
224
+ for (const pos of positions) {
225
+ if (snippets.length >= maxSnippets) break;
226
+ const start = Math.max(0, pos - radius);
227
+ const end = Math.min(content.length, pos + radius);
228
+ if (start <= prevEnd) continue;
229
+ prevEnd = end;
230
+ let snippet = content.slice(start, end).trim().replaceAll(/\s+/g, " ");
231
+ if (start > 0) snippet = "…" + snippet;
232
+ if (end < content.length) snippet = snippet + "…";
233
+ snippets.push(snippet);
234
+ }
235
+ return snippets;
236
+ }
160
237
  //#endregion
161
- //#region src/docs/nav.ts
238
+ //#region src/nav.ts
162
239
  /**
163
240
  * Parse a numbered filename/dirname like "1.guide" or "3.middleware.md"
164
241
  * into { order, slug }. Also strips `.draft` suffix.
@@ -301,8 +378,11 @@ async function _scanNav(dirPath, parentPath, options) {
301
378
  return entries;
302
379
  }
303
380
  //#endregion
304
- //#region src/docs/sources/fs.ts
305
- var DocsSourceFS = class extends DocsSource {
381
+ //#region src/sources/_base.ts
382
+ var Source = class {};
383
+ //#endregion
384
+ //#region src/sources/fs.ts
385
+ var FSSource = class extends Source {
306
386
  dir;
307
387
  constructor(dir) {
308
388
  super();
@@ -369,8 +449,8 @@ function reorderTree(entries, manifest) {
369
449
  }
370
450
  }
371
451
  //#endregion
372
- //#region src/docs/sources/git.ts
373
- var DocsSourceGit = class extends DocsSource {
452
+ //#region src/sources/git.ts
453
+ var GitSource = class extends Source {
374
454
  src;
375
455
  options;
376
456
  _fs;
@@ -398,16 +478,16 @@ var DocsSourceGit = class extends DocsSource {
398
478
  break;
399
479
  }
400
480
  }
401
- this._fs = new DocsSourceFS(docsDir);
481
+ this._fs = new FSSource(docsDir);
402
482
  return this._fs.load();
403
483
  }
404
484
  async readContent(filePath) {
405
- if (!this._fs) throw new Error("DocsSourceGit: call load() before readContent()");
485
+ if (!this._fs) throw new Error("GitSource: call load() before readContent()");
406
486
  return this._fs.readContent(filePath);
407
487
  }
408
488
  };
409
489
  //#endregion
410
- //#region src/docs/sources/_npm.ts
490
+ //#region src/sources/_npm.ts
411
491
  /**
412
492
  * Parse an npm package spec: `[@scope/]name[@version][/subdir]`
413
493
  */
@@ -465,8 +545,8 @@ function parseNpmURL(url) {
465
545
  if (shortMatch && !/^(package|settings|signup|login|org|search)$/.test(shortMatch[1])) return shortMatch[1];
466
546
  }
467
547
  //#endregion
468
- //#region src/docs/sources/http.ts
469
- var DocsSourceHTTP = class extends DocsSource {
548
+ //#region src/sources/http.ts
549
+ var HTTPSource = class extends Source {
470
550
  url;
471
551
  options;
472
552
  _contentCache = /* @__PURE__ */ new Map();
@@ -727,8 +807,8 @@ function _resolveHref(href, baseURL) {
727
807
  }
728
808
  }
729
809
  //#endregion
730
- //#region src/docs/sources/npm.ts
731
- var DocsSourceNpm = class extends DocsSource {
810
+ //#region src/sources/npm.ts
811
+ var NpmSource = class extends Source {
732
812
  src;
733
813
  options;
734
814
  _fs;
@@ -758,11 +838,11 @@ var DocsSourceNpm = class extends DocsSource {
758
838
  break;
759
839
  }
760
840
  }
761
- this._fs = new DocsSourceFS(docsDir);
841
+ this._fs = new FSSource(docsDir);
762
842
  return this._fs.load();
763
843
  }
764
844
  async readContent(filePath) {
765
- if (!this._fs) throw new Error("DocsSourceNpm: call load() before readContent()");
845
+ if (!this._fs) throw new Error("NpmSource: call load() before readContent()");
766
846
  return this._fs.readContent(filePath);
767
847
  }
768
848
  };
@@ -777,31 +857,65 @@ async function npmProvider(input) {
777
857
  };
778
858
  }
779
859
  //#endregion
780
- //#region src/docs/exporter.ts
860
+ //#region src/source.ts
861
+ /**
862
+ * Resolve a source string to the appropriate Source instance.
863
+ *
864
+ * Supports: local paths, `gh:owner/repo`, `npm:package`, `http(s)://...`
865
+ */
866
+ function resolveSource(input) {
867
+ if (input.startsWith("http://") || input.startsWith("https://")) return new HTTPSource(input);
868
+ if (input.startsWith("gh:")) return new GitSource(input);
869
+ if (input.startsWith("npm:")) return new NpmSource(input);
870
+ return new FSSource(input);
871
+ }
872
+ //#endregion
873
+ //#region src/exporter.ts
874
+ /**
875
+ * High-level export: resolve source, load, and export in one call.
876
+ *
877
+ * ```ts
878
+ * await exportSource("./docs", "./dist/docs");
879
+ * await exportSource("gh:unjs/h3", "./dist/h3-docs");
880
+ * await exportSource("npm:h3", "./dist/h3-docs", { plainText: true });
881
+ * await exportSource("https://h3.unjs.io", "./dist/h3-docs");
882
+ * ```
883
+ */
884
+ async function exportSource(input, dir, options = {}) {
885
+ const collection = new Collection(typeof input === "string" ? resolveSource(input) : input);
886
+ await collection.load();
887
+ await mkdir(dir, { recursive: true });
888
+ await writeCollection(collection, dir, options);
889
+ return collection;
890
+ }
781
891
  /** Paths to skip during export (generated by source, not actual docs) */
782
892
  const IGNORED_PATHS = new Set(["/llms.txt", "/llms-full.txt"]);
783
893
  /**
784
894
  * Export documentation entries to a local filesystem directory as flat `.md` files.
785
895
  *
786
- * Each entry is written to `<dir>/<path>.md` (or `<dir>/<path>/index.md` for directory
787
- * index pages). Navigation order is preserved via `order` frontmatter in pages and
788
- * `.navigation.yml` files in directories.
896
+ * Each entry is written to `<dir>/<prefix>.<slug>.md` (or `<dir>/<prefix>.<slug>/index.md`
897
+ * for directory index pages). Navigation order is preserved via numeric prefixes on
898
+ * directories and files (e.g., `1.guide/`, `2.getting-started.md`) so the nav scanner
899
+ * can infer order without additional metadata files.
789
900
  *
790
901
  * A `README.md` table of contents is generated at the root of the output directory.
791
902
  */
792
- async function exportDocsToFS(manager, dir, options = {}) {
793
- const rootEntry = manager.flat.find((f) => f.entry.path === "/");
903
+ async function writeCollection(collection, dir, options = {}) {
904
+ const rootEntry = collection.flat.find((f) => f.entry.path === "/");
794
905
  const tocLines = [`# ${options.title ?? rootEntry?.entry.title ?? "Table of Contents"}`, ""];
795
906
  const writtenFiles = /* @__PURE__ */ new Set();
907
+ const pathMap = /* @__PURE__ */ new Map();
908
+ buildNumberedPaths(collection.tree, "", pathMap);
796
909
  const dirPaths = /* @__PURE__ */ new Set();
797
- collectDirPaths(manager.tree, dirPaths);
798
- for (const flat of manager.flat) {
910
+ collectDirPaths(collection.tree, dirPaths);
911
+ for (const flat of collection.flat) {
799
912
  if (options.filter ? !options.filter(flat) : flat.entry.page === false) continue;
800
913
  if (IGNORED_PATHS.has(flat.entry.path)) continue;
801
- let content = await manager.getContent(flat);
914
+ let content = await collection.getContent(flat);
802
915
  if (content === void 0) continue;
803
916
  const cleanContent = options.plainText ? renderToText(content) : renderToMarkdown(content);
804
- const filePath = flat.entry.path === "/" || dirPaths.has(flat.entry.path) ? flat.entry.path === "/" ? "/index.md" : `${flat.entry.path}/index.md` : flat.entry.path.endsWith(".md") ? flat.entry.path : `${flat.entry.path}.md`;
917
+ const numberedPath = pathMap.get(flat.entry.path) ?? flat.entry.path;
918
+ const filePath = flat.entry.path === "/" || dirPaths.has(flat.entry.path) ? flat.entry.path === "/" ? "/index.md" : `${numberedPath}/index.md` : numberedPath.endsWith(".md") ? numberedPath : `${numberedPath}.md`;
805
919
  const dest = join(dir, filePath);
806
920
  await mkdir(dirname(dest), { recursive: true });
807
921
  await writeFile(dest, cleanContent, "utf8");
@@ -813,7 +927,6 @@ async function exportDocsToFS(manager, dir, options = {}) {
813
927
  let tocFile = options.tocFile ?? "README.md";
814
928
  if (writtenFiles.has(tocFile)) tocFile = `_${tocFile}`;
815
929
  await writeFile(join(dir, tocFile), tocLines.join("\n") + "\n", "utf8");
816
- await writeFile(join(dir, "_navigation.json"), JSON.stringify(manager.tree, null, 2) + "\n", "utf8");
817
930
  }
818
931
  /** Collect all paths that are directories (have children in the tree). */
819
932
  function collectDirPaths(entries, set) {
@@ -822,5 +935,16 @@ function collectDirPaths(entries, set) {
822
935
  collectDirPaths(entry.children, set);
823
936
  }
824
937
  }
938
+ /**
939
+ * Build a map from nav path → numbered filesystem path.
940
+ * Uses sibling index as the numeric prefix (e.g., `/guide` → `/1.guide`).
941
+ */
942
+ function buildNumberedPaths(entries, parentPath, map) {
943
+ for (const [i, entry] of entries.entries()) {
944
+ const numbered = `${parentPath}/${i}.${entry.slug || "index"}`;
945
+ map.set(entry.path, numbered);
946
+ if (entry.children?.length) buildNumberedPaths(entry.children, numbered, map);
947
+ }
948
+ }
825
949
  //#endregion
826
- export { DocsSourceFS as a, DocsSourceGit as i, DocsSourceNpm as n, DocsSource as o, DocsSourceHTTP as r, DocsManager as s, exportDocsToFS as t };
950
+ export { HTTPSource as a, Source as c, NpmSource as i, extractSnippets as l, writeCollection as n, GitSource as o, resolveSource as r, FSSource as s, exportSource as t, Collection as u };
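
Review note: the new `Collection.search()` in this diff is an async generator that yields results in `flat` order, each with a `score` where lower is better (0 for an exact title match, 100 for a title or path match, 150 or 200 for a heading match, 300 for a content-only match). A caller that wants ranked output therefore has to drain and sort the results itself. The sketch below shows one way to do that; `fakeSearch` is a stand-in generator and `collectRanked` is a hypothetical helper, neither of which is part of mdzilla.

```js
// Stand-in for `collection.search(query)`: yields objects shaped like the
// results in the diff, in an unsorted order.
async function* fakeSearch() {
  yield { flat: { entry: { path: "/guide/basics", title: "Basics" } }, score: 300, titleMatch: false };
  yield { flat: { entry: { path: "/guide/router", title: "Router" } }, score: 0, titleMatch: true };
  yield { flat: { entry: { path: "/guide/handlers", title: "Handlers" } }, score: 200, titleMatch: false, heading: "Router handlers" };
}

// Hypothetical helper: drain an async iterable and rank results, lowest score first.
async function collectRanked(results, max = 10) {
  const all = [];
  for await (const r of results) all.push(r);
  all.sort((x, y) => x.score - y.score);
  return all.slice(0, max);
}

const rankedPromise = collectRanked(fakeSearch());
rankedPromise.then((ranked) => {
  for (const r of ranked) console.log(r.score, r.flat.entry.path);
});
```

Sorting after draining trades streaming output for a stable ranking; a CLI that prints results as they arrive would skip the sort and rely on the yield order instead.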