@kreuzberg/html-to-markdown-wasm 2.19.0-rc.1

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright 2024-2025 Na'aman Hirschfeld
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,569 @@
1
+ # @kreuzberg/html-to-markdown-wasm
2
+
3
+ > **npm package:** `@kreuzberg/html-to-markdown-wasm` (this README).
4
+ > Use [`@kreuzberg/html-to-markdown-node`](https://www.npmjs.com/package/@kreuzberg/html-to-markdown-node) when you only target Node.js or Bun and want native performance.
5
+
6
+ Universal HTML to Markdown converter using WebAssembly.
7
+
8
+ Powered by the same Rust engine as the Node.js, Python, Ruby, and PHP bindings, so Markdown output stays identical regardless of runtime.
9
+
10
+ Runs anywhere: Node.js, Deno, Bun, browsers, and edge runtimes.
11
+
12
+ [![Crates.io](https://img.shields.io/crates/v/html-to-markdown-rs.svg?logo=rust&label=crates.io)](https://crates.io/crates/html-to-markdown-rs)
13
+ [![npm (node)](https://img.shields.io/npm/v/%40kreuzberg%2Fhtml-to-markdown-node.svg?logo=npm)](https://www.npmjs.com/package/@kreuzberg/html-to-markdown-node)
14
+ [![npm (wasm)](https://img.shields.io/npm/v/%40kreuzberg%2Fhtml-to-markdown-wasm.svg?logo=npm)](https://www.npmjs.com/package/@kreuzberg/html-to-markdown-wasm)
15
+ [![PyPI](https://img.shields.io/pypi/v/html-to-markdown.svg?logo=pypi)](https://pypi.org/project/html-to-markdown/)
16
+ [![Packagist](https://img.shields.io/packagist/v/goldziher/html-to-markdown.svg)](https://packagist.org/packages/goldziher/html-to-markdown)
17
+ [![RubyGems](https://badge.fury.io/rb/html-to-markdown.svg)](https://rubygems.org/gems/html-to-markdown)
18
+ [![NuGet](https://img.shields.io/nuget/v/Goldziher.HtmlToMarkdown.svg)](https://www.nuget.org/packages/Goldziher.HtmlToMarkdown/)
19
+ [![Maven Central](https://img.shields.io/maven-central/v/io.github.goldziher/html-to-markdown.svg)](https://central.sonatype.com/artifact/io.github.goldziher/html-to-markdown)
20
+ [![Go Reference](https://pkg.go.dev/badge/github.com/kreuzberg-dev/html-to-markdown/packages/go/v2/htmltomarkdown.svg)](https://pkg.go.dev/github.com/kreuzberg-dev/html-to-markdown/packages/go/v2/htmltomarkdown)
21
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://github.com/kreuzberg-dev/html-to-markdown/blob/main/LICENSE)
22
+
23
+ ## Performance
24
+
25
+ Universal WebAssembly bindings with **excellent performance** across all JavaScript runtimes.
26
+
27
+ ### Benchmark Results (Apple M4)
28
+
29
+ | Document Type | ops/sec | Notes |
30
+ | -------------------------- | ---------- | ------------------ |
31
+ | **Small (5 paragraphs)** | **70,300** | Simple documents |
32
+ | **Medium (25 paragraphs)** | **15,282** | Nested formatting |
33
+ | **Large (100 paragraphs)** | **3,836** | Complex structures |
34
+ | **Tables (20 tables)** | **3,748** | Table processing |
35
+ | **Lists (500 items)** | **1,391** | Nested lists |
36
+ | **Wikipedia (129KB)** | **1,022** | Real-world content |
37
+ | **Wikipedia (653KB)** | **147** | Large documents |
38
+
39
+ **Average: ~13,675 ops/sec** (arithmetic mean of the seven workloads above).
40
+
41
+ ### Comparison
42
+
43
+ - **vs Native NAPI**: ~1.17× slower (WASM has minimal overhead)
44
+ - **vs Python**: ~6.3× faster (no FFI overhead)
45
+ - **Best for**: Universal deployment (browsers, Deno, edge runtimes, cross-platform apps)
46
+
47
+ ### Benchmark Fixtures (Apple M4)
48
+
49
+ Numbers captured via the shared fixture harness in `tools/benchmark-harness`:
50
+
51
+ | Document | Size | ops/sec (WASM) |
52
+ | ---------------------- | ------ | -------------- |
53
+ | Lists (Timeline) | 129 KB | 882 |
54
+ | Tables (Countries) | 360 KB | 242 |
55
+ | Medium (Python) | 657 KB | 121 |
56
+ | Large (Rust) | 567 KB | 124 |
57
+ | Small (Intro) | 463 KB | 163 |
58
+ | hOCR German PDF | 44 KB | 1,637 |
59
+ | hOCR Invoice | 4 KB | 7,775 |
60
+ | hOCR Embedded Tables | 37 KB | 1,667 |
61
+
62
+ > Expect slightly higher numbers in long-lived browser/Deno workers once the WASM module is warm.
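
The warm-up effect mentioned above can be reproduced with a small timing loop. The following is a minimal sketch, not the project's benchmark harness: `measureOpsPerSec` is a hypothetical helper, and the regex-based workload is only a stand-in so the snippet runs without the package installed. In practice `fn` would wrap a call such as `() => convert(html)`.

```typescript
// Hypothetical helper: measures throughput of any synchronous function.
function measureOpsPerSec(fn: () => void, warmupRuns = 100, budgetMs = 200): number {
  // Warm-up: give the engine (and, in real use, the WASM module) time to settle.
  for (let i = 0; i < warmupRuns; i++) fn();

  let ops = 0;
  let elapsedMs = 0;
  const start = Date.now();
  // Run until the time budget is used up, counting completed calls.
  // Date.now() is called on every iteration, which adds a small overhead.
  do {
    fn();
    ops++;
    elapsedMs = Date.now() - start;
  } while (elapsedMs < budgetMs);

  // elapsedMs >= budgetMs here, so the division is safe for budgetMs > 0.
  return ops / (elapsedMs / 1000);
}

// Stand-in workload (assumption): strips tags with a regex instead of calling convert().
const sampleHtml = "<p>hello</p>".repeat(50);
const opsPerSec = measureOpsPerSec(() => {
  sampleHtml.replace(/<[^>]+>/g, "");
});
console.log(`${Math.round(opsPerSec)} ops/sec`);
```

Numbers from a loop like this depend heavily on the machine and runtime, so treat them as relative indicators rather than comparable to the tables above.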
63
+
64
+ ## Installation
65
+
66
+ ### npm / Yarn / pnpm
67
+
68
+ ```bash
69
+ npm install @kreuzberg/html-to-markdown-wasm
70
+ # or
71
+ yarn add @kreuzberg/html-to-markdown-wasm
72
+ # or
73
+ pnpm add @kreuzberg/html-to-markdown-wasm
74
+ ```
75
+
76
+ ### Deno
77
+
78
+ ```typescript
79
+ // Via npm specifier
80
+ import { convert } from "npm:@kreuzberg/html-to-markdown-wasm";
81
+ ```
82
+
83
+ ## Usage
84
+
85
+ ### Basic Conversion
86
+
87
+ ```javascript
88
+ import { convert } from '@kreuzberg/html-to-markdown-wasm';
89
+
90
+ const html = '<h1>Hello World</h1><p>This is <strong>fast</strong>!</p>';
91
+ const markdown = convert(html);
92
+ console.log(markdown);
93
+ // # Hello World
94
+ //
95
+ // This is **fast**!
96
+ ```
97
+
98
+ > **Heads up for edge runtimes:** Cloudflare Workers, Vite dev servers, and other environments that instantiate `.wasm` files asynchronously must call `await initWasm()` (or `await wasmReady`) once during startup before invoking `convert`. Traditional bundlers (Webpack, Rollup) and Deno/Node imports continue to work without manual initialization.
99
+
100
+ **Working Examples:**
101
+ - [**Browser with Rollup**](https://github.com/kreuzberg-dev/html-to-markdown/tree/main/examples/wasm-rollup) - Using dist-web target in browser
102
+ - [**Node.js**](https://github.com/kreuzberg-dev/html-to-markdown/tree/main/examples/wasm-node) - Using dist-node target
103
+ - [**Cloudflare Workers**](https://github.com/kreuzberg-dev/html-to-markdown/tree/main/examples/wasm-cloudflare) - Using bundler target with Wrangler
104
+
105
+ ### Reusing Options Handles
106
+
107
+ ```ts
108
+ import {
109
+ convertWithOptionsHandle,
110
+ createConversionOptionsHandle,
111
+ } from '@kreuzberg/html-to-markdown-wasm';
112
+
113
+ const handle = createConversionOptionsHandle({ hocrSpatialTables: false });
114
+ const markdown = convertWithOptionsHandle('<h1>Reusable</h1>', handle);
115
+ ```
116
+
117
+ ### Byte-Based Input (Buffers / Uint8Array)
118
+
119
+ When you already have raw bytes (e.g., `fs.readFileSync`, Fetch API responses), skip re-encoding with `TextDecoder` by calling the byte-friendly helpers:
120
+
121
+ ```ts
122
+ import {
123
+ convertBytes,
124
+ convertBytesWithOptionsHandle,
125
+ createConversionOptionsHandle,
126
+ convertBytesWithInlineImages,
127
+ } from '@kreuzberg/html-to-markdown-wasm';
128
+ import { readFileSync } from 'node:fs';
129
+
130
+ const htmlBytes = readFileSync('input.html'); // Buffer -> Uint8Array
131
+ const markdown = convertBytes(htmlBytes);
132
+
133
+ const handle = createConversionOptionsHandle({ headingStyle: 'atx' });
134
+ const markdownFromHandle = convertBytesWithOptionsHandle(htmlBytes, handle);
135
+
136
+ const inlineExtraction = convertBytesWithInlineImages(htmlBytes, null, {
137
+ maxDecodedSizeBytes: 5 * 1024 * 1024,
138
+ });
139
+ ```
140
+
141
+ ### With Options
142
+
143
+ ```typescript
144
+ import { convert } from '@kreuzberg/html-to-markdown-wasm';
145
+
146
+ const markdown = convert(html, {
147
+ headingStyle: 'atx',
148
+ codeBlockStyle: 'backticks',
149
+ listIndentWidth: 2,
150
+ bullets: '-',
151
+ wrap: true,
152
+ wrapWidth: 80
153
+ });
154
+ ```
155
+
156
+ ### Preserve Complex HTML (NEW in v2.5)
157
+
158
+ ```typescript
159
+ import { convert } from '@kreuzberg/html-to-markdown-wasm';
160
+
161
+ const html = `
162
+ <h1>Report</h1>
163
+ <table>
164
+ <tr><th>Name</th><th>Value</th></tr>
165
+ <tr><td>Foo</td><td>Bar</td></tr>
166
+ </table>
167
+ `;
168
+
169
+ const markdown = convert(html, {
170
+ preserveTags: ['table'] // Keep tables as HTML
171
+ });
172
+ ```
173
+
174
+ ### Deno
175
+
176
+ ```typescript
177
+ import { convert } from "npm:@kreuzberg/html-to-markdown-wasm";
178
+
179
+ const html = await Deno.readTextFile("input.html");
180
+ const markdown = convert(html, { headingStyle: "atx" });
181
+ await Deno.writeTextFile("output.md", markdown);
182
+ ```
183
+
184
+ > **Performance Tip:** For Node.js/Bun, use [@kreuzberg/html-to-markdown-node](https://www.npmjs.com/package/@kreuzberg/html-to-markdown-node); its native bindings deliver roughly 1.17× the throughput of the WASM build.
185
+
186
+ ### Browser (ESM)
187
+
188
+ ```html
189
+ <!DOCTYPE html>
190
+ <html>
191
+ <head>
192
+ <title>HTML to Markdown</title>
193
+ </head>
194
+ <body>
195
+ <script type="module">
196
+ import init, { convert } from 'https://unpkg.com/@kreuzberg/html-to-markdown-wasm/dist-web/html_to_markdown_wasm.js';
197
+
198
+ // Initialize WASM module
199
+ await init();
200
+
201
+ const html = '<h1>Hello World</h1><p>This runs in the <strong>browser</strong>!</p>';
202
+ const markdown = convert(html, { headingStyle: 'atx' });
203
+
204
+ console.log(markdown);
205
+ document.body.innerHTML = `<pre>${markdown}</pre>`;
206
+ </script>
207
+ </body>
208
+ </html>
209
+ ```
210
+
211
+ ### Vite / Webpack / Bundlers
212
+
213
+ ```typescript
214
+ import { convert } from '@kreuzberg/html-to-markdown-wasm';
215
+
216
+ const markdown = convert('<h1>Hello</h1>', {
217
+ headingStyle: 'atx',
218
+ codeBlockStyle: 'backticks'
219
+ });
220
+ ```
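
For environments that instantiate `.wasm` asynchronously (the Vite dev server is named in the note under Basic Conversion), one option is to hide the initialization step behind a wrapper. This is a sketch under assumptions: `createReadyConverter` is a hypothetical helper, and the stubs below stand in for the package's `initWasm` and `convert` so the snippet runs on its own.

```typescript
type Convert = (html: string) => string;

// Hypothetical wrapper: runs `init` at most once, then delegates to `convert`.
function createReadyConverter(init: () => Promise<unknown>, convert: Convert) {
  let ready: Promise<unknown> | undefined;
  return async (html: string): Promise<string> => {
    // The first caller starts initialization; later callers reuse the same promise.
    if (!ready) ready = init();
    await ready;
    return convert(html);
  };
}

// Stubs (assumptions) in place of the real `initWasm` and `convert`.
let initCalls = 0;
const fakeInit = async (): Promise<void> => {
  initCalls++;
};
const fakeConvert: Convert = (html) => html.replace(/<[^>]+>/g, "");

const convertWhenReady = createReadyConverter(fakeInit, fakeConvert);
```

With the real package this would presumably be wired up as `createReadyConverter(initWasm, (html) => convert(html, { headingStyle: 'atx' }))`. Note that a failed initialization is cached too, so a production version may want to reset `ready` on rejection.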
221
+
222
+ ### Cloudflare Workers
223
+
224
+ ```typescript
225
+ import { convert, initWasm, wasmReady } from '@kreuzberg/html-to-markdown-wasm';
226
+
227
+ // Cloudflare Workers / other edge runtimes instantiate WASM asynchronously.
228
+ // Kick off initialization once at module scope.
229
+ const ready = wasmReady ?? initWasm();
230
+
231
+ export default {
232
+ async fetch(request: Request): Promise<Response> {
233
+ await ready;
234
+ const html = await request.text();
235
+ const markdown = convert(html, { headingStyle: 'atx' });
236
+
237
+ return new Response(markdown, {
238
+ headers: { 'Content-Type': 'text/markdown' }
239
+ });
240
+ }
241
+ };
242
+ ```
243
+
244
+ > See the full [Cloudflare Workers example](https://github.com/kreuzberg-dev/html-to-markdown/tree/main/examples/wasm-cloudflare) with Wrangler configuration.
245
+
246
+ ## TypeScript
247
+
248
+ Full TypeScript support with type definitions:
249
+
250
+ ```typescript
251
+ import {
252
+ convert,
253
+ convertWithInlineImages,
254
+ WasmInlineImageConfig,
255
+ type WasmConversionOptions
256
+ } from '@kreuzberg/html-to-markdown-wasm';
257
+
258
+ const options: WasmConversionOptions = {
259
+ headingStyle: 'atx',
260
+ codeBlockStyle: 'backticks',
261
+ listIndentWidth: 2,
262
+ wrap: true,
263
+ wrapWidth: 80
264
+ };
265
+
266
+ const markdown = convert('<h1>Hello</h1>', options);
267
+ ```
268
+
269
+ ## Inline Images
270
+
271
+ Extract and decode inline images (data URIs, SVG):
272
+
273
+ ```typescript
274
+ import { convertWithInlineImages, WasmInlineImageConfig } from '@kreuzberg/html-to-markdown-wasm';
275
+
276
+ const html = '<img src="data:image/png;base64,iVBORw0..." alt="Logo">';
277
+
278
+ const config = new WasmInlineImageConfig(5 * 1024 * 1024); // 5MB max
279
+ config.inferDimensions = true;
280
+ config.filenamePrefix = 'img_';
281
+ config.captureSvg = true;
282
+
283
+ const result = convertWithInlineImages(html, null, config);
284
+
285
+ console.log(result.markdown);
286
+ console.log(`Extracted ${result.inlineImages.length} images`);
287
+
288
+ for (const img of result.inlineImages) {
289
+ console.log(`${img.filename}: ${img.format}, ${img.data.length} bytes`);
290
+ // img.data is a Uint8Array - save to file or upload
291
+ }
292
+ ```
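
Once images are extracted, a quick per-format size summary can help decide whether to store or discard them. The sketch below is illustrative only: the `ExtractedImage` interface is an assumed shape based on the fields read in the loop above (`filename`, `format`, `data`), `bytesByFormat` is a hypothetical helper, and the sample entries are made up in place of `result.inlineImages`.

```typescript
// Assumed shape, based only on the fields the example above reads.
interface ExtractedImage {
  filename: string;
  format: string;
  data: Uint8Array;
}

// Hypothetical helper: total decoded byte size per image format.
function bytesByFormat(images: ExtractedImage[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const img of images) {
    totals[img.format] = (totals[img.format] ?? 0) + img.data.length;
  }
  return totals;
}

// Made-up sample data standing in for `result.inlineImages`.
const sampleImages: ExtractedImage[] = [
  { filename: "img_1.png", format: "png", data: new Uint8Array(10) },
  { filename: "img_2.png", format: "png", data: new Uint8Array(5) },
  { filename: "img_3.svg", format: "svg", data: new Uint8Array(7) },
];

const formatTotals = bytesByFormat(sampleImages);
console.log(formatTotals); // { png: 15, svg: 7 }
```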
293
+
294
+ ## Metadata Extraction
295
+
296
+ Extract document metadata (headers, links, images, structured data) alongside Markdown conversion:
297
+
298
+ ```typescript
299
+ import { convertWithMetadata, WasmMetadataConfig } from '@kreuzberg/html-to-markdown-wasm';
300
+
301
+ const html = `
302
+ <html lang="en">
303
+ <head><title>My Article</title></head>
304
+ <body>
305
+ <h1>Main Title</h1>
306
+ <p>Content with <a href="https://example.com">a link</a></p>
307
+ <img src="https://example.com/image.jpg" alt="Example image">
308
+ </body>
309
+ </html>
310
+ `;
311
+
312
+ const config = new WasmMetadataConfig();
313
+ config.extractHeaders = true;
314
+ config.extractLinks = true;
315
+ config.extractImages = true;
316
+ config.extractStructuredData = true;
317
+ config.maxStructuredDataSize = 1_000_000; // 1MB limit
318
+
319
+ const result = convertWithMetadata(html, null, config);
320
+
321
+ console.log(result.markdown);
322
+ console.log('Document metadata:', result.metadata.document);
323
+ // {
324
+ // title: 'My Article',
325
+ // language: 'en',
326
+ // ...
327
+ // }
328
+
329
+ console.log('Headers:', result.metadata.headers);
330
+ // [
331
+ // { level: 1, text: 'Main Title', id: undefined, depth: 0, htmlOffset: ... }
332
+ // ]
333
+
334
+ console.log('Links:', result.metadata.links);
335
+ // [
336
+ // {
337
+ // href: 'https://example.com',
338
+ // text: 'a link',
339
+ // linkType: 'external',
340
+ // rel: [],
341
+ // ...
342
+ // }
343
+ // ]
344
+
345
+ console.log('Images:', result.metadata.images);
346
+ // [
347
+ // {
348
+ // src: 'https://example.com/image.jpg',
349
+ // alt: 'Example image',
350
+ // imageType: 'external',
351
+ // ...
352
+ // }
353
+ // ]
354
+ ```
355
+
356
+ ### Metadata Configuration
357
+
358
+ The `WasmMetadataConfig` class controls what metadata is extracted:
359
+
360
+ ```typescript
361
+ import { WasmMetadataConfig } from '@kreuzberg/html-to-markdown-wasm';
362
+
363
+ const config = new WasmMetadataConfig();
364
+
365
+ // Enable/disable extraction types
366
+ config.extractHeaders = true; // h1-h6 elements
367
+ config.extractLinks = true; // <a> elements with link type classification
368
+ config.extractImages = true; // <img> and <svg> elements
369
+ config.extractStructuredData = true; // JSON-LD, Microdata, RDFa
370
+
371
+ // Limit structured data size to prevent memory exhaustion
372
+ config.maxStructuredDataSize = 1_000_000; // 1MB default
373
+ ```
374
+
375
+ ### Metadata Structure
376
+
377
+ The returned metadata object includes:
378
+
379
+ - **document**: Document-level metadata (title, description, keywords, language, OG tags, Twitter cards, etc.)
380
+ - **headers**: Array of header elements with level, text, id, and document position
381
+ - **links**: Array of links with href, text, type (anchor/internal/external/email/phone), and rel attributes
382
+ - **images**: Array of images with src, alt text, dimensions, and type classification (dataUri/external/relative/svg)
383
+ - **structuredData**: Array of JSON-LD, Microdata, and RDFa blocks
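
As an illustration of working with the `links` array, the sketch below collects the unique external URLs. It rests on assumptions: `LinkEntry` models only the `href`, `text`, and `linkType` fields shown in the example output earlier, the type values are assumed to be plain strings such as `'external'` and `'anchor'`, and `externalHrefs` is a hypothetical helper rather than part of the package.

```typescript
// Assumed subset of a link entry; the real type may carry more fields.
interface LinkEntry {
  href: string;
  text: string;
  linkType: string;
}

// Hypothetical helper: unique hrefs of links classified as external.
function externalHrefs(links: LinkEntry[]): string[] {
  const seen = new Set<string>();
  for (const link of links) {
    if (link.linkType === "external") seen.add(link.href);
  }
  return Array.from(seen);
}

// Made-up sample data standing in for `result.metadata.links`.
const sampleLinks: LinkEntry[] = [
  { href: "https://example.com", text: "a link", linkType: "external" },
  { href: "#intro", text: "Intro", linkType: "anchor" },
  { href: "https://example.com", text: "same link again", linkType: "external" },
];

const hrefs = externalHrefs(sampleLinks);
console.log(hrefs); // [ 'https://example.com' ]
```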
384
+
385
+ ### Byte-Based Input
386
+
387
+ Convert bytes directly with metadata extraction:
388
+
389
+ ```typescript
390
+ import { convertBytesWithMetadata, WasmMetadataConfig } from '@kreuzberg/html-to-markdown-wasm';
391
+ import { readFileSync } from 'node:fs';
392
+
393
+ const htmlBytes = readFileSync('article.html');
394
+ const config = new WasmMetadataConfig();
395
+
396
+ const result = convertBytesWithMetadata(htmlBytes, null, config);
397
+ console.log(result.markdown);
398
+ console.log(result.metadata);
399
+ ```
400
+
401
+ ## Build Targets
402
+
403
+ Three build targets are provided for different environments:
404
+
405
+ | Target | Path | Use Case |
406
+ | ----------- | --------------------------------- | ------------------------------ |
407
+ | **Bundler** | `@kreuzberg/html-to-markdown-wasm` | Webpack, Vite, Rollup, esbuild |
408
+ | **Node.js** | `@kreuzberg/html-to-markdown-wasm/dist-node` | Node.js, Bun (CommonJS/ESM) |
409
+ | **Web** | `@kreuzberg/html-to-markdown-wasm/dist-web` | Direct browser ESM imports |
410
+
411
+ ## Runtime Compatibility
412
+
413
+ | Runtime | Support | Package |
414
+ | ------------------------- | ---------------------------- | -------------- |
415
+ | ✅ **Node.js** 18+ | Full support | `dist-node` |
416
+ | ✅ **Deno** | Full support | npm: specifier |
417
+ | ✅ **Bun** | Full support (prefer native) | Default export |
418
+ | ✅ **Browsers** | Full support | `dist-web` |
419
+ | ✅ **Cloudflare Workers** | Full support | Default export |
420
+ | ✅ **Deno Deploy** | Full support | npm: specifier |
421
+
422
+ ## When to Use
423
+
424
+ Choose `@kreuzberg/html-to-markdown-wasm` when:
425
+
426
+ - 🌐 Running in browsers or edge runtimes
427
+ - 🦕 Using Deno
428
+ - ☁️ Deploying to Cloudflare Workers or Deno Deploy
429
+ - 📦 Building universal libraries
430
+ - 🔄 Needing consistent behavior across all platforms
431
+
432
+ Use [@kreuzberg/html-to-markdown-node](https://www.npmjs.com/package/@kreuzberg/html-to-markdown-node) for:
433
+
434
+ - ⚡ Maximum performance in Node.js/Bun (~1.17× faster, per the benchmarks above)
435
+ - 🖥️ Server-side only applications
436
+
437
+ ## Configuration Options
438
+
439
+ See the [TypeScript definitions](./dist-node/html_to_markdown_wasm.d.ts) for all available options:
440
+
441
+ - Heading styles (atx, underlined, atxClosed)
442
+ - Code block styles (indented, backticks, tildes)
443
+ - List formatting (indent width, bullet characters)
444
+ - Text escaping and formatting
445
+ - Tag preservation (`preserveTags`) and stripping (`stripTags`)
446
+ - Preprocessing for web scraping
447
+ - hOCR table extraction
448
+ - And more...
449
+
450
+ ## Examples
451
+
452
+ ### Preserving HTML Tags
453
+
454
+ Keep specific HTML tags in their original form:
455
+
456
+ ```typescript
457
+ import { convert } from '@kreuzberg/html-to-markdown-wasm';
458
+
459
+ const html = `
460
+ <p>Before table</p>
461
+ <table class="data">
462
+ <tr><th>Name</th><th>Value</th></tr>
463
+ <tr><td>Item 1</td><td>100</td></tr>
464
+ </table>
465
+ <p>After table</p>
466
+ `;
467
+
468
+ const markdown = convert(html, {
469
+ preserveTags: ['table']
470
+ });
471
+
472
+ // Result includes the table as HTML
473
+ ```
474
+
475
+ Combine with `stripTags`:
476
+
477
+ ```typescript
478
+ const markdown = convert(html, {
479
+ preserveTags: ['table', 'form'], // Keep as HTML
480
+ stripTags: ['script', 'style'] // Remove entirely
481
+ });
482
+ ```
483
+
484
+ ### Deno Web Server
485
+
486
+ ```typescript
487
+ import { convert } from "npm:@kreuzberg/html-to-markdown-wasm";
488
+
489
+ Deno.serve(async (req) => {
490
+ const url = new URL(req.url);
491
+
492
+ if (url.pathname === "/convert" && req.method === "POST") {
493
+ const html = await req.text();
494
+ const markdown = convert(html, { headingStyle: "atx" });
495
+
496
+ return new Response(markdown, {
497
+ headers: { "Content-Type": "text/markdown" }
498
+ });
499
+ }
500
+
501
+ return new Response("Not found", { status: 404 });
502
+ });
503
+ ```
504
+
505
+ ### Browser File Conversion
506
+
507
+ ```html
508
+ <input type="file" id="htmlFile" accept=".html">
509
+ <button onclick="convertFile()">Convert to Markdown</button>
510
+ <pre id="output"></pre>
511
+
512
+ <script type="module">
513
+ import init, { convert } from 'https://unpkg.com/@kreuzberg/html-to-markdown-wasm/dist-web/html_to_markdown_wasm.js';
514
+
515
+ await init();
516
+
517
+ window.convertFile = async () => {
518
+ const file = document.getElementById('htmlFile').files[0];
519
+ const html = await file.text();
520
+ const markdown = convert(html, { headingStyle: 'atx' });
521
+ document.getElementById('output').textContent = markdown;
522
+ };
523
+ </script>
524
+ ```
525
+
526
+ ### Web Scraping (Deno)
527
+
528
+ ```typescript
529
+ import { convert } from "npm:@kreuzberg/html-to-markdown-wasm";
530
+
531
+ const response = await fetch("https://example.com");
532
+ const html = await response.text();
533
+
534
+ const markdown = convert(html, {
535
+ preprocessing: {
536
+ enabled: true,
537
+ preset: "aggressive",
538
+ removeNavigation: true,
539
+ removeForms: true
540
+ },
541
+ headingStyle: "atx",
542
+ codeBlockStyle: "backticks"
543
+ });
544
+
545
+ console.log(markdown);
546
+ ```
547
+
548
+ ## Other Runtimes
549
+
550
+ The same Rust engine ships as native bindings for other ecosystems:
551
+
552
+ - 🖥️ Node.js / Bun: [`@kreuzberg/html-to-markdown-node`](https://www.npmjs.com/package/@kreuzberg/html-to-markdown-node)
553
+ - 🐍 Python: [`html-to-markdown`](https://pypi.org/project/html-to-markdown/)
554
+ - 💎 Ruby: [`html-to-markdown`](https://rubygems.org/gems/html-to-markdown)
555
+ - 🐘 PHP: [`goldziher/html-to-markdown`](https://packagist.org/packages/goldziher/html-to-markdown)
556
+ - 🦀 Rust crate & CLI: [`html-to-markdown-rs`](https://crates.io/crates/html-to-markdown-rs)
557
+
558
+ ## Links
559
+
560
+ - [GitHub Repository](https://github.com/kreuzberg-dev/html-to-markdown)
561
+ - [Full Documentation](https://github.com/kreuzberg-dev/html-to-markdown/blob/main/README.md)
562
+ - [Native Node Package](https://www.npmjs.com/package/@kreuzberg/html-to-markdown-node)
563
+ - [Python Package](https://pypi.org/project/html-to-markdown/)
564
+ - [PHP Extension & Helpers](https://packagist.org/packages/goldziher/html-to-markdown)
565
+ - [Rust Crate](https://crates.io/crates/html-to-markdown-rs)
566
+
567
+ ## License
568
+
569
+ MIT
package/dist/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright 2024-2025 Na'aman Hirschfeld
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.