html-to-markdown 2.7.0__cp310-abi3-win_amd64.whl → 2.8.3__cp310-abi3-win_amd64.whl
- html_to_markdown/__init__.py +1 -1
- html_to_markdown/_html_to_markdown.pyd +0 -0
- html_to_markdown/bin/html-to-markdown.exe +0 -0
- {html_to_markdown-2.7.0.data → html_to_markdown-2.8.3.data}/scripts/html-to-markdown.exe +0 -0
- {html_to_markdown-2.7.0.dist-info → html_to_markdown-2.8.3.dist-info}/METADATA +25 -3
- {html_to_markdown-2.7.0.dist-info → html_to_markdown-2.8.3.dist-info}/RECORD +8 -8
- {html_to_markdown-2.7.0.dist-info → html_to_markdown-2.8.3.dist-info}/WHEEL +0 -0
- {html_to_markdown-2.7.0.dist-info → html_to_markdown-2.8.3.dist-info}/licenses/LICENSE +0 -0
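The RECORD file listed above stores one `path,sha256=<digest>,<size>` entry per installed file, as the RECORD hunk later in this diff shows. In the wheel format the digest is the URL-safe base64 encoding of the SHA-256 hash with the trailing `=` padding removed. The sketch below reproduces one entry using only the standard library; `record_hash` and `record_entry` are hypothetical helper names chosen for illustration, not part of html-to-markdown.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """Return a RECORD-style digest: URL-safe base64 of SHA-256, padding stripped.

    Hypothetical helper for illustration; not part of the package.
    """
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def record_entry(path: str, data: bytes) -> str:
    """Format one RECORD line as path,hash,size-in-bytes."""
    return f"{path},{record_hash(data)},{len(data)}"


# py.typed is an empty marker file (size 0 in RECORD), so its entry
# can be reproduced without unpacking the wheel.
print(record_entry("html_to_markdown/py.typed", b""))
# html_to_markdown/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
```

The same function applied to the bytes of any other file in the unpacked wheel should match that file's RECORD entry, which is one way to confirm that an unchanged hash between versions really means unchanged contents.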
html_to_markdown/__init__.py
CHANGED
Binary file
Binary file
Binary file
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: html-to-markdown
-Version: 2.
+Version: 2.8.3
 Classifier: Development Status :: 5 - Production/Stable
 Classifier: Environment :: Console
 Classifier: Intended Audience :: Developers
@@ -35,13 +35,18 @@ Project-URL: Repository, https://github.com/Goldziher/html-to-markdown.git
 
 High-performance HTML to Markdown converter with a clean Python API (powered by a Rust core). The same engine also drives the Node.js, Ruby, PHP, and WebAssembly bindings, so rendered Markdown stays identical across runtimes. Wheels are published for Linux, macOS, and Windows.
 
-[](https://crates.io/crates/html-to-markdown)
 [](https://www.npmjs.com/package/html-to-markdown-node)
 [](https://www.npmjs.com/package/html-to-markdown-wasm)
 [](https://pypi.org/project/html-to-markdown/)
 [](https://packagist.org/packages/goldziher/html-to-markdown)
 [](https://rubygems.org/gems/html-to-markdown)
+[](https://hex.pm/packages/html_to_markdown)
+[](https://www.nuget.org/packages/HtmlToMarkdown/)
+[](https://central.sonatype.com/artifact/io.github.goldziher/html-to-markdown)
+[](https://pkg.go.dev/github.com/Goldziher/html-to-markdown/packages/go/htmltomarkdown)
 [](https://github.com/Goldziher/html-to-markdown/blob/main/LICENSE)
+[](https://discord.gg/pXxagNK2zN)
 
 ## Installation
 
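The README hunk that follows adds a benchmark table giving document size and ops/sec per fixture, next to a note quoting V1 throughput in MB/s. As a rough way to put the two on the same scale, the sketch below multiplies size by ops/sec. It assumes 1 MB = 1000 KB and uses the rounded sizes from the table, so the results are approximations only; `throughput_mb_per_s` is a hypothetical helper, not part of the package or its benchmark tooling.

```python
# Rows copied from the benchmark table in the next hunk:
# (document, size in KB, ops/sec for the Python binding).
FIXTURES = [
    ("Lists (Timeline)", 129, 1405),
    ("Tables (Countries)", 360, 352),
    ("Medium (Python)", 657, 158),
    ("Large (Rust)", 567, 183),
    ("Small (Intro)", 463, 223),
    ("hOCR German PDF", 44, 2991),
    ("hOCR Invoice", 4, 23500),
    ("hOCR Embedded Tables", 37, 3464),
]


def throughput_mb_per_s(size_kb: float, ops_per_sec: float) -> float:
    """Approximate throughput in MB/s, assuming 1 MB = 1000 KB."""
    return size_kb * ops_per_sec / 1000


for name, size_kb, ops in FIXTURES:
    print(f"{name:<22} {throughput_mb_per_s(size_kb, ops):7.1f} MB/s")
```

For example, the Lists (Timeline) row works out to 129 × 1,405 / 1000 ≈ 181 MB/s under these assumptions.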
@@ -61,6 +66,23 @@ Apple M4 • Real Wikipedia documents • `convert()` (Python)
 
 > V1 averaged ~2.5 MB/s (Python/BeautifulSoup). V2's Rust engine delivers 60–80× higher throughput.
 
+### Benchmark Fixtures (Apple M4)
+
+Pulled directly from `tools/runtime-bench` (`task bench:bindings -- --language python`) so they stay in lockstep with the Rust core:
+
+| Document               | Size   | ops/sec (Python) |
+| ---------------------- | ------ | ---------------- |
+| Lists (Timeline)       | 129 KB | 1,405            |
+| Tables (Countries)     | 360 KB | 352              |
+| Medium (Python)        | 657 KB | 158              |
+| Large (Rust)           | 567 KB | 183              |
+| Small (Intro)          | 463 KB | 223              |
+| hOCR German PDF        | 44 KB  | 2,991            |
+| hOCR Invoice           | 4 KB   | 23,500           |
+| hOCR Embedded Tables   | 37 KB  | 3,464            |
+
+> Re-run locally with `task bench:bindings -- --language python --output tmp.json` to compare against CI history.
+
 ## Quick Start
 
 ```python
@@ -233,7 +255,7 @@ The v1 compatibility layer creates extra Python objects and performs additional
 
 A compatibility layer is provided to ease migration from v1.x:
 
-- **Compat shim**: `html_to_markdown.v1_compat` exposes `convert_to_markdown`, `convert_to_markdown_stream`, and `markdownify`. Keyword mappings are listed in the [changelog](CHANGELOG.md#v200).
+- **Compat shim**: `html_to_markdown.v1_compat` exposes `convert_to_markdown`, `convert_to_markdown_stream`, and `markdownify`. Keyword mappings are listed in the [changelog](https://github.com/Goldziher/html-to-markdown/blob/main/CHANGELOG.md#v200).
 - **⚠️ Performance warning**: These compatibility functions add 77% overhead. Migrate to v2 API as soon as possible.
 - **CLI**: The Rust CLI replaces the old Python script. New flags are documented via `html-to-markdown --help`.
 - **Removed options**: `code_language_callback`, `strip`, and streaming APIs were removed; use `ConversionOptions`, `PreprocessingOptions`, and the inline-image helpers instead.
@@ -1,17 +1,17 @@
-html_to_markdown-2.
-html_to_markdown-2.
-html_to_markdown-2.
-html_to_markdown-2.
-html_to_markdown/__init__.py,sha256
+html_to_markdown-2.8.3.data/scripts/html-to-markdown.exe,sha256=yyoFKWSdtkTQ8REbHd234P5vJVQoHYY8QBw64Xs7gTw,3386368
+html_to_markdown-2.8.3.dist-info/METADATA,sha256=hdOPePogSGM3QQJEv-GlihxZvYy3rGIIMCzwlJPJrDo,11887
+html_to_markdown-2.8.3.dist-info/WHEEL,sha256=G3JyZRtw6x7sQDM5feqT5IDYMcac7O2Ec3LW6k1bFXE,96
+html_to_markdown-2.8.3.dist-info/licenses/LICENSE,sha256=QhKFMkQLa4mSUlOsyG9VElzC7GYbAKtiS_EwOCyH-b4,1107
+html_to_markdown/__init__.py,sha256=-ELgRSD8CCY5B_Es1irMAL6lcpI1bx1V7T8Iqz96YFU,1564
 html_to_markdown/__main__.py,sha256=5objj9lB7hhpSpZsDok5tv9o9yztVR63Ccww-pXsAyY,343
-html_to_markdown/_html_to_markdown.pyd,sha256=
+html_to_markdown/_html_to_markdown.pyd,sha256=BX12mw3f9r2Yn9F8IKh3cvQ2XkwL-oY0P0A-PeBIhIg,3142656
 html_to_markdown/_html_to_markdown.pyi,sha256=lh2hj6GyGx71fJzZPD5giZbO6XQYYBIlfQUJq4MwVPQ,878
 html_to_markdown/api.py,sha256=xxdVbIZjuSewhsgntdfY5DFJaYIEZITz2TBieqUCR3A,5241
-html_to_markdown/bin/html-to-markdown.exe,sha256=
+html_to_markdown/bin/html-to-markdown.exe,sha256=yyoFKWSdtkTQ8REbHd234P5vJVQoHYY8QBw64Xs7gTw,3386368
 html_to_markdown/cli.py,sha256=z59l8sF8wIRRzJtUd-tXgqiC0WTqkTjzl-df8Ey_oQ0,67
 html_to_markdown/cli_proxy.py,sha256=Y0Z98U0EMDqIRtdEkcHa1dVntWkw69maczeksr-Cq28,4000
 html_to_markdown/exceptions.py,sha256=31VqpPi4JLGv7lI2481Z4f2s5ejYmq97c3s-WFFkXVU,2443
 html_to_markdown/options.py,sha256=iDEIfxxZlSHDM3V-Sr-XVxYLC1mzvuic56jSycYvQvY,5224
 html_to_markdown/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
 html_to_markdown/v1_compat.py,sha256=qBfWRsXxox4I4Mm2kzvxEvqEKZ8DwYMQK-bbLHTUk-A,8253
-html_to_markdown-2.
+html_to_markdown-2.8.3.dist-info/RECORD,,
File without changes
File without changes