html-to-markdown 1.9.0__py3-none-any.whl → 1.10.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: html-to-markdown
- Version: 1.9.0
+ Version: 1.10.0
  Summary: A modern, type-safe Python library for converting HTML to Markdown with comprehensive tag support and customizable options
  Author-email: Na'aman Hirschfeld <nhirschfeld@gmail.com>
  License: MIT
@@ -30,10 +30,10 @@ Classifier: Typing :: Typed
  Requires-Python: >=3.10
  Description-Content-Type: text/markdown
  License-File: LICENSE
- Requires-Dist: beautifulsoup4>=4.13.4
+ Requires-Dist: beautifulsoup4>=4.13.5
  Requires-Dist: nh3>=0.3
  Provides-Extra: lxml
- Requires-Dist: lxml>=6; extra == "lxml"
+ Requires-Dist: lxml>=6.0.1; extra == "lxml"
  Dynamic: license-file

  # html-to-markdown
@@ -42,20 +42,31 @@ A modern, fully typed Python library for converting HTML to Markdown. This libra
  of [markdownify](https://pypi.org/project/markdownify/) with a modernized codebase, strict type safety and support for
  Python 3.9+.

+ ## Support This Project
+
+ If you find html-to-markdown useful, please consider sponsoring the development:
+
+ <a href="https://github.com/sponsors/Goldziher"><img src="https://img.shields.io/badge/Sponsor-%E2%9D%A4-pink?logo=github-sponsors" alt="Sponsor on GitHub" height="32"></a>
+
+ Your support helps maintain and improve this library for the community.
+
  ## Features

  - **Full HTML5 Support**: Comprehensive support for all modern HTML5 elements including semantic, form, table, ruby, interactive, structural, SVG, and math elements
- - **Enhanced Table Support**: Advanced handling of merged cells with rowspan/colspan support for better table representation
+ - **Table Support**: Advanced handling of complex tables with rowspan/colspan support
  - **Type Safety**: Strict MyPy adherence with comprehensive type hints
  - **Metadata Extraction**: Automatic extraction of document metadata (title, meta tags) as comment headers
  - **Streaming Support**: Memory-efficient processing for large documents with progress callbacks
  - **Highlight Support**: Multiple styles for highlighted text (`<mark>` elements)
  - **Task List Support**: Converts HTML checkboxes to GitHub-compatible task list syntax
- - **Flexible Configuration**: 20+ configuration options for customizing conversion behavior
- - **CLI Tool**: Full-featured command-line interface with all API options exposed
+ - **Flexible Configuration**: Comprehensive configuration options for customizing conversion behavior
+ - **CLI Tool**: Full-featured command-line interface with complete API parity
  - **Custom Converters**: Extensible converter system for custom HTML tag handling
+ - **List Formatting**: Configurable list indentation with Discord/Slack compatibility
+ - **HTML Preprocessing**: Clean messy HTML with configurable aggressiveness levels
+ - **Whitespace Control**: Normalized or strict whitespace preservation modes
  - **BeautifulSoup Integration**: Support for pre-configured BeautifulSoup instances
- - **Comprehensive Test Coverage**: 91%+ test coverage with 623+ comprehensive tests
+ - **Robustly Tested**: Comprehensive unit tests and integration tests covering all conversion scenarios

  ## Installation

@@ -71,19 +82,9 @@ For improved performance, you can install with the optional lxml parser:
  pip install html-to-markdown[lxml]
  ```

- The lxml parser offers:
+ The lxml parser offers faster HTML parsing and better handling of malformed HTML compared to the default html.parser.

- - **~30% faster HTML parsing** compared to the default html.parser
- - Better handling of malformed HTML
- - More robust parsing for complex documents
-
- Once installed, lxml is automatically used by default for better performance. You can explicitly specify a parser if needed:
-
- ```python
- result = convert_to_markdown(html) # Auto-detects: uses lxml if available, otherwise html.parser
- result = convert_to_markdown(html, parser="lxml") # Force lxml (requires installation)
- result = convert_to_markdown(html, parser="html.parser") # Force built-in parser
- ```
+ The library automatically uses lxml when available. You can explicitly specify a parser using the `parser` parameter.

  ## Quick Start

@@ -148,123 +149,176 @@ soup = BeautifulSoup(html, "lxml") # Note: lxml requires additional installatio
  markdown = convert_to_markdown(soup)
  ```

- ## Advanced Usage
+ ## Common Use Cases

- ### Customizing Conversion Options
+ ### Discord/Slack Compatible Lists

- The library offers extensive customization through various options:
+ Discord and Slack require 2-space indentation for nested lists:
+
+ **Python:**

  ```python
  from html_to_markdown import convert_to_markdown

- html = "<div>Your content here...</div>"
- markdown = convert_to_markdown(
-     html,
-     # Document processing
-     extract_metadata=True, # Extract metadata as comment header
-     convert_as_inline=False, # Treat as block-level content
-     strip_newlines=False, # Preserve original newlines
-     # Formatting options
-     heading_style="atx", # Use # style headers
-     strong_em_symbol="*", # Use * for bold/italic
-     bullets="*+-", # Define bullet point characters
-     highlight_style="double-equal", # Use == for highlighted text
-     # Text processing
-     wrap=True, # Enable text wrapping
-     wrap_width=100, # Set wrap width
-     escape_asterisks=True, # Escape * characters
-     escape_underscores=True, # Escape _ characters
-     escape_misc=True, # Escape other special characters
-     # Code blocks
-     code_language="python", # Default code block language
-     # Streaming for large documents
-     stream_processing=False, # Enable for memory efficiency
-     chunk_size=1024, # Chunk size for streaming
- )
+ html = "<ul><li>Item 1<ul><li>Nested item</li></ul></li></ul>"
+ markdown = convert_to_markdown(html, list_indent_width=2)
+ # Output: * Item 1\n  + Nested item
  ```

- ### Custom Converters
+ **CLI:**

- You can provide your own conversion functions for specific HTML tags:
+ ```shell
+ html_to_markdown --list-indent-width 2 input.html
+ ```
+
+ ### Cleaning Web-Scraped HTML
+
+ Remove navigation, advertisements, and forms from scraped content:
+
+ **Python:**

  ```python
- from bs4.element import Tag
- from html_to_markdown import convert_to_markdown
+ markdown = convert_to_markdown(html, preprocess_html=True, preprocessing_preset="aggressive")
+ ```

- # Define a custom converter for the <b> tag
- def custom_bold_converter(*, tag: Tag, text: str, **kwargs) -> str:
-     return f"IMPORTANT: {text}"
+ **CLI:**

- html = "<p>This is a <b>bold statement</b>.</p>"
- markdown = convert_to_markdown(html, custom_converters={"b": custom_bold_converter})
- print(markdown)
- # Output: This is a IMPORTANT: bold statement.
+ ```shell
+ html_to_markdown --preprocess-html --preprocessing-preset aggressive input.html
+ ```
+
+ ### Preserving Whitespace for Documentation
+
+ Maintain exact whitespace for code documentation or technical content:
+
+ **Python:**
+
+ ```python
+ markdown = convert_to_markdown(html, whitespace_mode="strict")
+ ```
+
+ **CLI:**
+
+ ```shell
+ html_to_markdown --whitespace-mode strict input.html
  ```

- Custom converters take precedence over the built-in converters and can be used alongside other configuration options.
+ ### Using Tabs for List Indentation

- ### Enhanced Table Support
+ Some editors and platforms prefer tab-based indentation:

- The library now provides better handling of complex tables with merged cells:
+ **Python:**
+
+ ```python
+ markdown = convert_to_markdown(html, list_indent_type="tabs")
+ ```
+
+ **CLI:**
+
+ ```shell
+ html_to_markdown --list-indent-type tabs input.html
+ ```
+
+ ## Advanced Usage
+
+ ### Configuration Example

  ```python
  from html_to_markdown import convert_to_markdown

- # HTML table with merged cells
- html = """
- <table>
- <tr>
- <th rowspan="2">Category</th>
- <th colspan="2">Sales Data</th>
- </tr>
- <tr>
- <th>Q1</th>
- <th>Q2</th>
- </tr>
- <tr>
- <td>Product A</td>
- <td>$100K</td>
- <td>$150K</td>
- </tr>
- </table>
- """
+ markdown = convert_to_markdown(
+     html,
+     # Headers and formatting
+     heading_style="atx",
+     strong_em_symbol="*",
+     bullets="*+-",
+     highlight_style="double-equal",
+     # List indentation
+     list_indent_type="spaces",
+     list_indent_width=4,
+     # Whitespace handling
+     whitespace_mode="normalized",
+     # HTML preprocessing
+     preprocess_html=True,
+     preprocessing_preset="standard",
+ )
+ ```

- markdown = convert_to_markdown(html)
+ ### Custom Converters
+
+ Custom converters allow you to override the default conversion behavior for any HTML tag. This is particularly useful for customizing header formatting or implementing domain-specific conversion rules.
+
+ #### Basic Example: Custom Header Formatting
+
+ ```python
+ from bs4.element import Tag
+ from html_to_markdown import convert_to_markdown
+
+ def custom_h1_converter(*, tag: Tag, text: str, **kwargs) -> str:
+     """Convert h1 tags with custom formatting."""
+     return f"### {text.upper()} ###\n\n"
+
+ def custom_h2_converter(*, tag: Tag, text: str, **kwargs) -> str:
+     """Convert h2 tags with underline."""
+     return f"{text}\n{'=' * len(text)}\n\n"
+
+ html = "<h1>Title</h1><h2>Subtitle</h2><p>Content</p>"
+ markdown = convert_to_markdown(html, custom_converters={"h1": custom_h1_converter, "h2": custom_h2_converter})
  print(markdown)
+ # Output:
+ # ### TITLE ###
+ #
+ # Subtitle
+ # ========
+ #
+ # Content
  ```

- Output:
+ #### Advanced Example: Context-Aware Link Conversion

- ```markdown
- | Category | Sales Data | |
- | --- | --- | --- |
- | | Q1 | Q2 |
- | Product A | $100K | $150K |
+ ```python
+ def smart_link_converter(*, tag: Tag, text: str, **kwargs) -> str:
+     """Convert links based on their attributes."""
+     href = tag.get("href", "")
+     title = tag.get("title", "")
+
+     # Handle different link types
+     if href.startswith("http"):
+         # External link
+         return f"[{text}]({href} \"{title or 'External link'}\")"
+     elif href.startswith("#"):
+         # Anchor link
+         return f"[{text}]({href})"
+     elif href.startswith("mailto:"):
+         # Email link
+         return f"[{text}]({href})"
+     else:
+         # Relative link
+         return f"[{text}]({href})"
+
+ html = '<a href="https://example.com">External</a> <a href="#section">Anchor</a>'
+ markdown = convert_to_markdown(html, custom_converters={"a": smart_link_converter})
  ```

- The library handles:
-
- - **Rowspan**: Inserts empty cells in subsequent rows
- - **Colspan**: Properly manages column spanning
- - **Clean output**: Removes `<colgroup>` and `<col>` elements that have no Markdown equivalent
+ #### Converter Function Signature

- ### Key Configuration Options
+ All converter functions must follow this signature:

- | Option | Type | Default | Description |
- | ------------------- | ---- | ---------------- | --------------------------------------------------------------- |
- | `extract_metadata` | bool | `True` | Extract document metadata as comment header |
- | `convert_as_inline` | bool | `False` | Treat content as inline elements only |
- | `heading_style` | str | `'underlined'` | Header style (`'underlined'`, `'atx'`, `'atx_closed'`) |
- | `highlight_style` | str | `'double-equal'` | Highlight style (`'double-equal'`, `'html'`, `'bold'`) |
- | `stream_processing` | bool | `False` | Enable streaming for large documents |
- | `parser` | str | auto-detect | BeautifulSoup parser (auto-detects `'lxml'` or `'html.parser'`) |
- | `autolinks` | bool | `True` | Auto-convert URLs to Markdown links |
- | `bullets` | str | `'*+-'` | Characters to use for bullet points |
- | `escape_asterisks` | bool | `True` | Escape * characters |
- | `wrap` | bool | `False` | Enable text wrapping |
- | `wrap_width` | int | `80` | Text wrap width |
+ ```python
+ def converter(*, tag: Tag, text: str, **kwargs) -> str:
+     """
+     Args:
+         tag: BeautifulSoup Tag object with access to all HTML attributes
+         text: Pre-processed text content of the tag
+         **kwargs: Additional context passed through from conversion
+
+     Returns:
+         Markdown formatted string
+     """
+     pass
+ ```

- For a complete list of all 20+ options, see the [Configuration Reference](#configuration-reference) section below.
+ Custom converters take precedence over built-in converters and can be used alongside other configuration options.

  ## CLI Usage

@@ -280,51 +334,30 @@ cat input.html | html_to_markdown > output.md
  # Use custom options
  html_to_markdown --heading-style atx --wrap --wrap-width 100 input.html > output.md

- # Advanced options
+ # Discord-compatible lists with HTML preprocessing
  html_to_markdown \
-     --no-extract-metadata \
-     --convert-as-inline \
-     --highlight-style html \
-     --stream-processing \
-     --show-progress \
+     --list-indent-width 2 \
+     --preprocess-html \
+     --preprocessing-preset aggressive \
      input.html > output.md
  ```

  ### Key CLI Options

- ```shell
- # Content processing
- --convert-as-inline # Treat content as inline elements
- --no-extract-metadata # Disable metadata extraction
- --strip-newlines # Remove newlines from input
-
- # Formatting
- --heading-style {atx,atx_closed,underlined}
- --highlight-style {double-equal,html,bold}
- --strong-em-symbol {*,_}
- --bullets CHARS # e.g., "*+-"
-
- # Text escaping
- --no-escape-asterisks # Disable * escaping
- --no-escape-underscores # Disable _ escaping
- --no-escape-misc # Disable misc character escaping
-
- # Large document processing
- --stream-processing # Enable streaming mode
- --chunk-size SIZE # Set chunk size (default: 1024)
- --show-progress # Show progress for large files
-
- # Text wrapping
- --wrap # Enable text wrapping
- --wrap-width WIDTH # Set wrap width (default: 80)
- ```
-
- View all available options:
+ **Most Common Options:**

  ```shell
- html_to_markdown --help
+ --list-indent-width WIDTH # Spaces per indent (default: 4, use 2 for Discord)
+ --list-indent-type {spaces,tabs} # Indentation type (default: spaces)
+ --preprocess-html # Enable HTML cleaning for web scraping
+ --whitespace-mode {normalized,strict} # Whitespace handling (default: normalized)
+ --heading-style {atx,atx_closed,underlined} # Header style
+ --no-extract-metadata # Disable metadata extraction
  ```

+ **All Available Options:**
+ The CLI supports all Python API parameters. Use `html_to_markdown --help` to see the complete list.
+
  ## Migration from Markdownify

  For existing projects using Markdownify, a compatibility layer is provided:
@@ -343,27 +376,17 @@ The `markdownify` function is an alias for `convert_to_markdown` and provides id

  ## Configuration Reference

- Complete list of all configuration options:
+ ### Most Common Parameters

- ### Document Processing
-
- - `extract_metadata` (bool, default: `True`): Extract document metadata (title, meta tags) as comment header
- - `convert_as_inline` (bool, default: `False`): Treat content as inline elements only (no block elements)
- - `strip_newlines` (bool, default: `False`): Remove newlines from HTML input before processing
- - `convert` (list, default: `None`): List of HTML tags to convert (None = all supported tags)
- - `strip` (list, default: `None`): List of HTML tags to remove from output
- - `custom_converters` (dict, default: `None`): Mapping of HTML tag names to custom converter functions
-
- ### Streaming Support
-
- - `stream_processing` (bool, default: `False`): Enable streaming processing for large documents
- - `chunk_size` (int, default: `1024`): Size of chunks when using streaming processing
- - `chunk_callback` (callable, default: `None`): Callback function called with each processed chunk
- - `progress_callback` (callable, default: `None`): Callback function called with (processed_bytes, total_bytes)
+ - `list_indent_width` (int, default: `4`): Number of spaces per indentation level (use 2 for Discord/Slack)
+ - `list_indent_type` (str, default: `'spaces'`): Use `'spaces'` or `'tabs'` for list indentation
+ - `heading_style` (str, default: `'underlined'`): Header style (`'underlined'`, `'atx'`, `'atx_closed'`)
+ - `whitespace_mode` (str, default: `'normalized'`): Whitespace handling (`'normalized'` or `'strict'`)
+ - `preprocess_html` (bool, default: `False`): Enable HTML preprocessing to clean messy HTML
+ - `extract_metadata` (bool, default: `True`): Extract document metadata as comment header

  ### Text Formatting

- - `heading_style` (str, default: `'underlined'`): Header style (`'underlined'`, `'atx'`, `'atx_closed'`)
  - `highlight_style` (str, default: `'double-equal'`): Style for highlighted text (`'double-equal'`, `'html'`, `'bold'`)
  - `strong_em_symbol` (str, default: `'*'`): Symbol for strong/emphasized text (`'*'` or `'_'`)
  - `bullets` (str, default: `'*+-'`): Characters to use for bullet points in lists
@@ -371,6 +394,21 @@ Complete list of all configuration options:
  - `sub_symbol` (str, default: `''`): Custom symbol for subscript text
  - `sup_symbol` (str, default: `''`): Custom symbol for superscript text

+ ### Parser Options
+
+ - `parser` (str, default: auto-detect): BeautifulSoup parser to use (`'lxml'`, `'html.parser'`, `'html5lib'`)
+ - `preprocessing_preset` (str, default: `'standard'`): Preprocessing level (`'minimal'`, `'standard'`, `'aggressive'`)
+ - `remove_forms` (bool, default: `True`): Remove form elements during preprocessing
+ - `remove_navigation` (bool, default: `True`): Remove navigation elements during preprocessing
+
+ ### Document Processing
+
+ - `convert_as_inline` (bool, default: `False`): Treat content as inline elements only
+ - `strip_newlines` (bool, default: `False`): Remove newlines from HTML input before processing
+ - `convert` (list, default: `None`): List of HTML tags to convert (None = all supported tags)
+ - `strip` (list, default: `None`): List of HTML tags to remove from output
+ - `custom_converters` (dict, default: `None`): Mapping of HTML tag names to custom converter functions
+
  ### Text Escaping

  - `escape_asterisks` (bool, default: `True`): Escape `*` characters to prevent unintended formatting
@@ -393,6 +431,15 @@ Complete list of all configuration options:
  - `wrap` (bool, default: `False`): Enable text wrapping
  - `wrap_width` (int, default: `80`): Width for text wrapping

+ ### HTML Processing
+
+ - `parser` (str, default: auto-detect): BeautifulSoup parser to use (`'lxml'`, `'html.parser'`, `'html5lib'`)
+ - `whitespace_mode` (str, default: `'normalized'`): How to handle whitespace (`'normalized'` intelligently cleans whitespace, `'strict'` preserves original)
+ - `preprocess_html` (bool, default: `False`): Enable HTML preprocessing to clean messy HTML
+ - `preprocessing_preset` (str, default: `'standard'`): Preprocessing aggressiveness (`'minimal'` for basic cleaning, `'standard'` for balanced, `'aggressive'` for heavy cleaning)
+ - `remove_forms` (bool, default: `True`): Remove form elements during preprocessing
+ - `remove_navigation` (bool, default: `True`): Remove navigation elements during preprocessing
+
  ## Contribution

  This library is open to contribution. Feel free to open issues or submit PRs. Its better to discuss issues before
@@ -450,17 +497,6 @@ uv run python -m html_to_markdown input.html
  uv build
  ```

- ## Performance
-
- The library is optimized for performance with several key features:
-
- - **Efficient ancestor caching**: Reduces repeated DOM traversals using context-aware caching
- - **Streaming support**: Process large documents in chunks to minimize memory usage
- - **Optional lxml parser**: ~30% faster parsing for complex HTML documents
- - **Optimized string operations**: Minimizes string concatenations in hot paths
-
- Typical throughput: ~2 MB/s for regular processing on modern hardware.
-
  ## License

  This library uses the MIT license.
@@ -504,42 +540,6 @@ This library provides comprehensive support for all modern HTML5 elements:

  - `<math>` (MathML support)

- ## Advanced Table Support
-
- The library provides sophisticated handling of complex HTML tables, including merged cells and proper structure conversion:
-
- ```python
- from html_to_markdown import convert_to_markdown
-
- # Complex table with merged cells
- html = """
- <table>
- <caption>Sales Report</caption>
- <tr>
- <th rowspan="2">Product</th>
- <th colspan="2">Quarterly Sales</th>
- </tr>
- <tr>
- <th>Q1</th>
- <th>Q2</th>
- </tr>
- <tr>
- <td>Widget A</td>
- <td>$50K</td>
- <td>$75K</td>
- </tr>
- </table>
- """
-
- result = convert_to_markdown(html)
- ```
-
- **Features:**
-
- - **Merged cell support**: Handles `rowspan` and `colspan` attributes intelligently
- - **Clean output**: Automatically removes table styling elements that don't translate to Markdown
- - **Structure preservation**: Maintains table hierarchy and relationships
-
  ## Acknowledgments

  Special thanks to the original [markdownify](https://pypi.org/project/markdownify/) project creators and contributors.
@@ -0,0 +1,17 @@
+ html_to_markdown/__init__.py,sha256=TzZzhZDJHeXW_3B9zceYehz2zlttqdLsDr5un8stZLM,653
+ html_to_markdown/__main__.py,sha256=E9d62nVceR_5TUWgVu5L5CnSZxKcnT_7a6ScWZUGE-s,292
+ html_to_markdown/cli.py,sha256=ilnrJN2XMhPDQ4UkkG4cjLXTvglu_ZJj-bBsohVF3fw,8541
+ html_to_markdown/constants.py,sha256=CKFVHjUZKgi8-lgU6AHPic7X5ChlTkbZt4Jv6VaVjjs,665
+ html_to_markdown/converters.py,sha256=ewdKUwkQXuwgzwCBhxZ1AJufX90jR_aGLr02GkdB2So,32443
+ html_to_markdown/exceptions.py,sha256=YjfwVCWE_oZakr9iy0E-_aPSYHNaocJZgWeQ9Enty7Q,1212
+ html_to_markdown/preprocessor.py,sha256=acmuJJvx1RaXE3c0F_aWsartQE0cEpa3AOnJYGnPzqw,9708
+ html_to_markdown/processing.py,sha256=tqrBfXKqbN_rQbFOY4pGhDjY9fHyj_E1gOlhqE1ywK0,34214
+ html_to_markdown/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ html_to_markdown/utils.py,sha256=4Vzk2cCjxN0LAZ1DXQCufYtxE7a6739TYgPbje-VM_E,1086
+ html_to_markdown/whitespace.py,sha256=b8Vf_AWhIvGFqka4Au0GsxsOYeYRO9XBpD4DxW99Pg0,7806
+ html_to_markdown-1.10.0.dist-info/licenses/LICENSE,sha256=3J_HR5BWvUM1mlIrlkF32-uC1FM64gy8JfG17LBuheQ,1122
+ html_to_markdown-1.10.0.dist-info/METADATA,sha256=LlFYc0EDFdfapqLacVQ9Da12SjEWKExW-L-5j55bicM,17797
+ html_to_markdown-1.10.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ html_to_markdown-1.10.0.dist-info/entry_points.txt,sha256=xmFijrTfgYW7lOrZxZGRPciicQHa5KiXKkUhBCmICtQ,116
+ html_to_markdown-1.10.0.dist-info/top_level.txt,sha256=Ev6djb1c4dSKr_-n4K-FpEGDkzBigXY6LuZ5onqS7AE,17
+ html_to_markdown-1.10.0.dist-info/RECORD,,
@@ -1,16 +0,0 @@
- html_to_markdown/__init__.py,sha256=TzZzhZDJHeXW_3B9zceYehz2zlttqdLsDr5un8stZLM,653
- html_to_markdown/__main__.py,sha256=DJyJX7NIK0BVPNS2r3BYJ0Ci_lKHhgVOpw7ZEqACH3c,323
- html_to_markdown/cli.py,sha256=8xlgSEcnqsSM_dr1TCSgPDAo09YvUtO78PvDFivFFdg,6973
- html_to_markdown/constants.py,sha256=8vqANd-7wYvDzBm1VXZvdIxS4Xom4Ov_Yghg6jvmyio,584
- html_to_markdown/converters.py,sha256=ESOZQSW8qGAG1S9f_iDpPUirKIc9MGz_G0_rqbTCJ30,50018
- html_to_markdown/exceptions.py,sha256=s1DaG6A23rOurF91e4jryuUzplWcC_JIAuK9_bw_4jQ,1558
- html_to_markdown/preprocessor.py,sha256=S4S1ZfLC_hkJVgmA5atImTyWQDOxfHctPbaep2QtyrQ,11248
- html_to_markdown/processing.py,sha256=iUVZfDG_QmFsY32O3mJZEuyxS2m8cjZaNnsstx2RkQo,40544
- html_to_markdown/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
- html_to_markdown/utils.py,sha256=QgWPzmpZKFd6wDTe8IY3gbVT3xNzoGV3PBgd17J0O-w,2066
- html_to_markdown-1.9.0.dist-info/licenses/LICENSE,sha256=3J_HR5BWvUM1mlIrlkF32-uC1FM64gy8JfG17LBuheQ,1122
- html_to_markdown-1.9.0.dist-info/METADATA,sha256=Rptd2quL9YEGi7Bmh-pgbdPGx-8Ud8EZeZZLQNIMEik,18450
- html_to_markdown-1.9.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- html_to_markdown-1.9.0.dist-info/entry_points.txt,sha256=xmFijrTfgYW7lOrZxZGRPciicQHa5KiXKkUhBCmICtQ,116
- html_to_markdown-1.9.0.dist-info/top_level.txt,sha256=Ev6djb1c4dSKr_-n4K-FpEGDkzBigXY6LuZ5onqS7AE,17
- html_to_markdown-1.9.0.dist-info/RECORD,,
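
The RECORD entries in the two hunks above pair each file path with a hash and a byte size. As a rough sketch of how one such entry could be checked against file contents, assuming the standard wheel RECORD encoding (SHA-256 digest, URL-safe base64, trailing `=` padding removed): the helper names `record_digest` and `matches_record_line` are hypothetical and are not part of html-to-markdown.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest: 'sha256=' + URL-safe base64 without '=' padding."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def matches_record_line(line: str, data: bytes) -> bool:
    """Check file bytes against one 'path,hash,size' RECORD line.

    Entries with empty hash and size fields (such as the RECORD file's
    own line, which ends in ',,') are not handled by this sketch.
    """
    _path, expected_hash, expected_size = line.rsplit(",", 2)
    return expected_hash == record_digest(data) and int(expected_size) == len(data)


# py.typed is listed with size 0 in both versions, so its entry
# should match the empty byte string.
entry = "html_to_markdown/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0"
print(matches_record_line(entry, b""))
```

Running the same check over the other entries requires the unpacked wheel contents, which this diff does not include.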