smol-html 0.1.3__tar.gz → 0.1.4__tar.gz

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: smol-html
- Version: 0.1.3
+ Version: 0.1.4
  Summary: Small, dependable HTML cleaner/minifier with sensible defaults
  Project-URL: Homepage, https://github.com/NosibleAI/smol-html
  Project-URL: Repository, https://github.com/NosibleAI/smol-html
@@ -28,7 +28,7 @@ Requires-Dist: lxml[html-clean]>=1.3.2
  Requires-Dist: minify-html>=0.2.6
  Description-Content-Type: text/markdown
 
- ![smol](smol.png)
+ ![smol](https://github.com/NosibleAI/smol-html/blob/main/smol.png?raw=true)
 
 
  # smol-html
@@ -39,7 +39,7 @@ Small, dependable HTML cleaner/minifier with sensible defaults.
 
  Nosible is a search engine, which means we need to store and process a very large number of webpages. To make this tractable, we strip out visual chrome and other non-essential components that don’t matter for downstream tasks (indexing, ranking, retrieval, and LLM pipelines) while preserving the important content and structure. This package cleans and minifies HTML, greatly reducing size on disk; combined with Brotli compression (by Google), the savings are even larger.
 
- ![Cleaning impact on size](eval.png)
+ ![Cleaning impact on size](https://github.com/NosibleAI/smol-html/blob/main/eval.png?raw=true)
 
  ### 📦 Installation
 
@@ -53,6 +53,15 @@ pip install smol-html
  uv pip install smol-html
  ```
 
+ ### Requirements
+
+ - Python: 3.9
+ - Dependencies:
+ - beautifulsoup4>=4.0.1
+ - brotli>=0.5.2
+ - lxml[html-clean]>=1.3.2
+ - minify-html>=0.2.6
+
  ## Quick Start
 
  Clean an HTML string (or page contents):
@@ -1,4 +1,4 @@
- ![smol](smol.png)
+ ![smol](https://github.com/NosibleAI/smol-html/blob/main/smol.png?raw=true)
 
 
  # smol-html
@@ -9,7 +9,7 @@ Small, dependable HTML cleaner/minifier with sensible defaults.
 
  Nosible is a search engine, which means we need to store and process a very large number of webpages. To make this tractable, we strip out visual chrome and other non-essential components that don’t matter for downstream tasks (indexing, ranking, retrieval, and LLM pipelines) while preserving the important content and structure. This package cleans and minifies HTML, greatly reducing size on disk; combined with Brotli compression (by Google), the savings are even larger.
 
- ![Cleaning impact on size](eval.png)
+ ![Cleaning impact on size](https://github.com/NosibleAI/smol-html/blob/main/eval.png?raw=true)
 
  ### 📦 Installation
 
@@ -17,13 +17,22 @@ Nosible is a search engine, which means we need to store and process a very larg
  pip install smol-html
  ```
 
- ### ⚡ Installing with uv
-
- ```bash
- uv pip install smol-html
- ```
-
- ## Quick Start
+ ### ⚡ Installing with uv
+
+ ```bash
+ uv pip install smol-html
+ ```
+
+ ### Requirements
+
+ - Python: 3.9
+ - Dependencies:
+ - beautifulsoup4>=4.0.1
+ - brotli>=0.5.2
+ - lxml[html-clean]>=1.3.2
+ - minify-html>=0.2.6
+
+ ## Quick Start
 
  Clean an HTML string (or page contents):
 
@@ -1,6 +1,6 @@
  [project]
  name = "smol-html"
- version = "0.1.3"
+ version = "0.1.4"
  description = "Small, dependable HTML cleaner/minifier with sensible defaults"
  readme = { file = "README.md", content-type = "text/markdown" }
  requires-python = ">=3.9"
File without changes
File without changes
File without changes
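
The README image changes in this release all follow one pattern: a relative target such as `smol.png` becomes an absolute GitHub URL ending in `?raw=true`. A likely motivation is that a relative path has nothing to resolve against once the README is rendered outside the repository, though the diff itself does not state this. The sketch below is not part of the package; `absolutize_images`, `BASE_URL`, and `IMAGE_RE` are hypothetical names used only to illustrate the rewrite, with the base URL copied from the `+` lines above.

```python
import re

# Assumption: base URL taken from the "+" lines in this diff.
BASE_URL = "https://github.com/NosibleAI/smol-html/blob/main/"

# Matches Markdown images of the form ![alt](target), target without spaces.
IMAGE_RE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")

# Matches a leading URI scheme such as "https:" or "data:".
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def absolutize_images(markdown, base_url=BASE_URL):
    """Rewrite relative image targets to absolute URLs ending in ?raw=true."""

    def rewrite(match):
        alt, target = match.group(1), match.group(2)
        # Leave targets that are already absolute or protocol-relative.
        if SCHEME_RE.match(target) or target.startswith("//"):
            return match.group(0)
        # Drop a single leading "./" so the path joins cleanly.
        if target.startswith("./"):
            target = target[2:]
        return f"![{alt}]({base_url}{target}?raw=true)"

    return IMAGE_RE.sub(rewrite, markdown)


if __name__ == "__main__":
    print(absolutize_images("![smol](smol.png)"))
```

Applied to the two `-` image lines in the README hunks, this produces the corresponding `+` lines; targets that are already absolute pass through unchanged, so running it twice is harmless.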