loofah 2.2.3 → 2.19.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +221 -31
- data/README.md +18 -24
- data/lib/loofah/elements.rb +79 -75
- data/lib/loofah/helpers.rb +18 -7
- data/lib/loofah/html/document.rb +1 -0
- data/lib/loofah/html/document_fragment.rb +4 -2
- data/lib/loofah/html5/libxml2_workarounds.rb +8 -7
- data/lib/loofah/html5/safelist.rb +1042 -0
- data/lib/loofah/html5/scrub.rb +150 -55
- data/lib/loofah/instance_methods.rb +14 -8
- data/lib/loofah/metahelpers.rb +2 -1
- data/lib/loofah/scrubber.rb +12 -7
- data/lib/loofah/scrubbers.rb +21 -19
- data/lib/loofah/version.rb +5 -0
- data/lib/loofah/xml/document.rb +1 -0
- data/lib/loofah/xml/document_fragment.rb +2 -1
- data/lib/loofah.rb +35 -18
- metadata +52 -138
- data/.gemtest +0 -0
- data/Gemfile +0 -22
- data/Manifest.txt +0 -40
- data/Rakefile +0 -79
- data/benchmark/benchmark.rb +0 -149
- data/benchmark/fragment.html +0 -96
- data/benchmark/helper.rb +0 -73
- data/benchmark/www.slashdot.com.html +0 -2560
- data/lib/loofah/html5/whitelist.rb +0 -186
- data/test/assets/msword.html +0 -63
- data/test/assets/testdata_sanitizer_tests1.dat +0 -502
- data/test/helper.rb +0 -18
- data/test/html5/test_sanitizer.rb +0 -382
- data/test/integration/test_ad_hoc.rb +0 -204
- data/test/integration/test_helpers.rb +0 -43
- data/test/integration/test_html.rb +0 -72
- data/test/integration/test_scrubbers.rb +0 -400
- data/test/integration/test_xml.rb +0 -55
- data/test/unit/test_api.rb +0 -142
- data/test/unit/test_encoding.rb +0 -20
- data/test/unit/test_helpers.rb +0 -62
- data/test/unit/test_scrubber.rb +0 -229
- data/test/unit/test_scrubbers.rb +0 -14
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bd3edb0acdf2359d82564aca0bc13710d9f6c49157963d18953ff55bd7c14413
+  data.tar.gz: 3a6e11b7deb9cfb469aaf6ec919062687bd4215ef11980bded72ca298807610c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4970a6aa72265f60556dd6fd254375c86d3f83be23f3bbcc8b04df00ce0e801e8ef9e67d0a77ca6a21915be89226131c16a7f3540f02538cc2b9a369950dfebf
+  data.tar.gz: 27e3a06cc391ec3d9e3c966efdb6b4ce58e98c397ec87490d418406c17757e5cb0193edabaced30a9f24320c729e6730308e346610859f9f7c6d5fcc6f72cd56
data/CHANGELOG.md
CHANGED
@@ -1,12 +1,202 @@
 # Changelog
 
+## 2.19.1 / 2022-12-13
+
+### Security
+
+* Address CVE-2022-23514, inefficient regular expression complexity. See [GHSA-486f-hjj9-9vhh](https://github.com/flavorjones/loofah/security/advisories/GHSA-486f-hjj9-9vhh) for more information.
+* Address CVE-2022-23515, improper neutralization of data URIs. See [GHSA-228g-948r-83gx](https://github.com/flavorjones/loofah/security/advisories/GHSA-228g-948r-83gx) for more information.
+* Address CVE-2022-23516, uncontrolled recursion. See [GHSA-3x8r-x6xp-q4vm](https://github.com/flavorjones/loofah/security/advisories/GHSA-3x8r-x6xp-q4vm) for more information.
+
+
+## 2.19.0 / 2022-09-14
+
+### Features
+
+* Allow SVG 1.0 color keyword names in CSS attributes. These colors are part of the [CSS Color Module Level 3](https://www.w3.org/TR/css-color-3/#svg-color) recommendation released 2022-01-18. [[#243](https://github.com/flavorjones/loofah/issues/243)]
+
+
+## 2.18.0 / 2022-05-11
+
+### Features
+
+* Allow CSS property `aspect-ratio`. [[#236](https://github.com/flavorjones/loofah/issues/236)] (Thanks, [@louim](https://github.com/louim)!)
+
+
+## 2.17.0 / 2022-04-28
+
+### Features
+
+* Allow ARIA attributes. [[#232](https://github.com/flavorjones/loofah/issues/232), [#233](https://github.com/flavorjones/loofah/issues/233)] (Thanks, [@nick-desteffen](https://github.com/nick-desteffen)!)
+
+
+## 2.16.0 / 2022-04-01
+
+### Features
+
+* Allow MathML elements `menclose` and `ms`, and MathML attributes `dir`, `href`, `lquote`, `mathsize`, `notation`, and `rquote`. [[#231](https://github.com/flavorjones/loofah/issues/231)] (Thanks, [@nick-desteffen](https://github.com/nick-desteffen)!)
+
+
+## 2.15.0 / 2022-03-14
+
+### Features
+
+* Expand set of allowed protocols to include `sms:`. [[#228](https://github.com/flavorjones/loofah/issues/228)] (Thanks, [@brendon](https://github.com/brendon)!)
+
+
+## 2.14.0 / 2022-02-11
+
+### Features
+
+* The `#to_text` method on `Loofah::HTML::{Document,DocumentFragment}` replaces `<br>` line break elements with a newline. [[#225](https://github.com/flavorjones/loofah/issues/225)]
+
+
+## 2.13.0 / 2021-12-10
+
+### Bug fixes
+
+* Loofah::HTML::DocumentFragment#text no longer serializes top-level comment children. [[#221](https://github.com/flavorjones/loofah/issues/221)]
+
+
+## 2.12.0 / 2021-08-11
+
+### Features
+
+* Support empty HTML5 data attributes. [[#215](https://github.com/flavorjones/loofah/issues/215)]
+
+
+## 2.11.0 / 2021-07-31
+
+### Features
+
+* Allow HTML5 element `wbr`.
+* Allow all CSS property values for `border-collapse`. [[#201](https://github.com/flavorjones/loofah/issues/201)]
+
+
+### Changes
+
+* Deprecating `Loofah::HTML5::SafeList::VOID_ELEMENTS` which is not a canonical list of void HTML4 or HTML5 elements.
+* Removed some elements from `Loofah::HTML5::SafeList::VOID_ELEMENTS` that either are not acceptable elements or aren't considered "void" by libxml2.
+
+
+## 2.10.0 / 2021-06-06
+
+### Features
+
+* Allow CSS properties `overflow-x` and `overflow-y`. [[#206](https://github.com/flavorjones/loofah/issues/206)] (Thanks, [@sampokuokkanen](https://github.com/sampokuokkanen)!)
+
+
+## 2.9.1 / 2021-04-07
+
+### Bug fixes
+
+* Fix a regression in v2.9.0 which inappropriately removed CSS properties with quoted string values. [[#202](https://github.com/flavorjones/loofah/issues/202)]
+
+
+## 2.9.0 / 2021-01-14
+
+### Features
+
+* Handle CSS functions in a CSS shorthand property (like `background`). [[#199](https://github.com/flavorjones/loofah/issues/199), [#200](https://github.com/flavorjones/loofah/issues/200)]
+
+
+## 2.8.0 / 2020-11-25
+
+### Features
+
+* Allow CSS properties `order`, `flex-direction`, `flex-grow`, `flex-wrap`, `flex-shrink`, `flex-flow`, `flex-basis`, `flex`, `justify-content`, `align-self`, `align-items`, and `align-content`. [[#197](https://github.com/flavorjones/loofah/issues/197)] (Thanks, [@miguelperez](https://github.com/miguelperez)!)
+
+
+## 2.7.0 / 2020-08-26
+
+### Features
+
+* Allow CSS properties `page-break-before`, `page-break-inside`, and `page-break-after`. [[#190](https://github.com/flavorjones/loofah/issues/190)] (Thanks, [@ahorek](https://github.com/ahorek)!)
+
+
+### Fixes
+
+* Don't drop the `!important` rule from some CSS properties. [[#191](https://github.com/flavorjones/loofah/issues/191)] (Thanks, [@b7kich](https://github.com/b7kich)!)
+
+
+## 2.6.0 / 2020-06-16
+
+### Features
+
+* Allow CSS `border-style` keywords. [[#188](https://github.com/flavorjones/loofah/issues/188)] (Thanks, [@tarcisiozf](https://github.com/tarcisiozf)!)
+
+
+## 2.5.0 / 2020-04-05
+
+### Features
+
+* Allow more CSS length units: "ch", "vw", "vh", "Q", "lh", "vmin", "vmax". [[#178](https://github.com/flavorjones/loofah/issues/178)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas)!)
+
+
+### Fixes
+
+* Remove comments from `Loofah::HTML::Document`s that exist outside the `html` element. [[#80](https://github.com/flavorjones/loofah/issues/80)]
+
+
+### Other changes
+
+* Gem metadata being set [[#181](https://github.com/flavorjones/loofah/issues/181)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas)!)
+* Test files removed from gem file [[#180](https://github.com/flavorjones/loofah/issues/180),[#166](https://github.com/flavorjones/loofah/issues/166),[#159](https://github.com/flavorjones/loofah/issues/159)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas) and [@greysteil](https://github.com/greysteil)!)
+
+
+## 2.4.0 / 2019-11-25
+
+### Features
+
+* Allow CSS property `max-width` [[#175](https://github.com/flavorjones/loofah/issues/175)] (Thanks, [@bchaney](https://github.com/bchaney)!)
+* Allow CSS sizes expressed in `rem` [[#176](https://github.com/flavorjones/loofah/issues/176), [#177](https://github.com/flavorjones/loofah/issues/177)]
+* Add `frozen_string_literal: true` magic comment to all `lib` files. [[#118](https://github.com/flavorjones/loofah/issues/118)]
+
+
+## 2.3.1 / 2019-10-22
+
+### Security
+
+Address CVE-2019-15587: Unsanitized JavaScript may occur in sanitized output when a crafted SVG element is republished.
+
+This CVE's public notice is at [#171](https://github.com/flavorjones/loofah/issues/171)
+
+
+## 2.3.0 / 2019-09-28
+
+### Features
+
+* Expand set of allowed protocols to include `tel:` and `line:`. [[#104](https://github.com/flavorjones/loofah/issues/104), [#147](https://github.com/flavorjones/loofah/issues/147)]
+* Expand set of allowed CSS functions. [related to [#122](https://github.com/flavorjones/loofah/issues/122)]
+* Allow greater precision in shorthand CSS values. [[#149](https://github.com/flavorjones/loofah/issues/149)] (Thanks, [@danfstucky](https://github.com/danfstucky)!)
+* Allow CSS property `list-style` [[#162](https://github.com/flavorjones/loofah/issues/162)] (Thanks, [@jaredbeck](https://github.com/jaredbeck)!)
+* Allow CSS keywords `thick` and `thin` [[#168](https://github.com/flavorjones/loofah/issues/168)] (Thanks, [@georgeclaghorn](https://github.com/georgeclaghorn)!)
+* Allow HTML property `contenteditable` [[#167](https://github.com/flavorjones/loofah/issues/167)] (Thanks, [@andreynering](https://github.com/andreynering)!)
+
+
+### Bug fixes
+
+* CSS hex values are no longer limited to lowercase hex. Previously uppercase hex were scrubbed. [[#165](https://github.com/flavorjones/loofah/issues/165)] (Thanks, [@asok](https://github.com/asok)!)
+
+
+### Deprecations / Name Changes
+
+The following method and constants are hereby deprecated, and will be completely removed in a future release:
+
+* Deprecate `Loofah::Helpers::ActionView.white_list_sanitizer`, please use `Loofah::Helpers::ActionView.safe_list_sanitizer` instead.
+* Deprecate `Loofah::Helpers::ActionView::WhiteListSanitizer`, please use `Loofah::Helpers::ActionView::SafeListSanitizer` instead.
+* Deprecate `Loofah::HTML5::WhiteList`, please use `Loofah::HTML5::SafeList` instead.
+
+Thanks to [@JuanitoFatas](https://github.com/JuanitoFatas) for submitting these changes in [#164](https://github.com/flavorjones/loofah/issues/164) and for making the language used in Loofah more inclusive.
+
+
 ## 2.2.3 / 2018-10-30
 
 ### Security
 
 Address CVE-2018-16468: Unsanitized JavaScript may occur in sanitized output when a crafted SVG element is republished.
 
-This CVE's public notice is at https://github.com/flavorjones/loofah/issues/154
+This CVE's public notice is at [#154](https://github.com/flavorjones/loofah/issues/154)
 
 
 ## Meta / 2018-10-27
@@ -33,76 +223,76 @@ attribute scrubbers should they need to address CVE-2018-8048.
 
 Addresses CVE-2018-8048. Loofah allowed non-whitelisted attributes to be present in sanitized output when input with specially-crafted HTML fragments.
 
-This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
+This CVE's public notice is at [#144](https://github.com/flavorjones/loofah/issues/144)
 
 
 ## 2.2.0 / 2018-02-11
 
 ### Features:
 
-* Support HTML5 `<main>` tag. #133 (Thanks, @MothOnMars!)
-* Recognize HTML5 block elements. #136 (Thanks, @MothOnMars!)
-* Support SVG `<symbol>` tag. #131 (Thanks, @baopham!)
-* Support for whitelisting CSS functions, initially just `calc` and `rgb`. #122
-* Whitelist CSS property `list-style-type`. #68
+* Support HTML5 `<main>` tag. [#133](https://github.com/flavorjones/loofah/issues/133) (Thanks, [@MothOnMars](https://github.com/MothOnMars)!)
+* Recognize HTML5 block elements. [#136](https://github.com/flavorjones/loofah/issues/136) (Thanks, [@MothOnMars](https://github.com/MothOnMars)!)
+* Support SVG `<symbol>` tag. [#131](https://github.com/flavorjones/loofah/issues/131) (Thanks, [@baopham](https://github.com/baopham)!)
+* Support for whitelisting CSS functions, initially just `calc` and `rgb`. [#122](https://github.com/flavorjones/loofah/issues/122)/[#123](https://github.com/flavorjones/loofah/issues/123)/[#129](https://github.com/flavorjones/loofah/issues/129) (Thanks, [@NikoRoberts](https://github.com/NikoRoberts)!)
+* Whitelist CSS property `list-style-type`. [#68](https://github.com/flavorjones/loofah/issues/68)/[#137](https://github.com/flavorjones/loofah/issues/137)/[#142](https://github.com/flavorjones/loofah/issues/142) (Thanks, [@andela-ysanni](https://github.com/andela-ysanni) and [@NikoRoberts](https://github.com/NikoRoberts)!)
 
 ### Bugfixes:
 
-* Properly handle nested `script` tags. #127.
+* Properly handle nested `script` tags. [#127](https://github.com/flavorjones/loofah/issues/127).
 
 
 ## 2.1.1 / 2017-09-24
 
 ### Bugfixes:
 
-* Removed warning for unused variable. #124 (Thanks, @y-yagi!)
+* Removed warning for unused variable. [#124](https://github.com/flavorjones/loofah/issues/124) (Thanks, [@y-yagi](https://github.com/y-yagi)!)
 
 
 ## 2.1.0 / 2017-09-24
 
 ### Notes:
 
-* Re-implemented CSS parsing and sanitization using the [crass](https://github.com/rgrove/crass) library. #91
+* Re-implemented CSS parsing and sanitization using the [crass](https://github.com/rgrove/crass) library. [#91](https://github.com/flavorjones/loofah/issues/91)
 
 
 ### Features:
 
-* Added :noopener HTML scrubber (Thanks, @tastycode!)
-* Support `data` URIs with the following media types: text/plain, text/css, image/png, image/gif, image/jpeg, image/svg+xml. #101, #120. (Thanks, @mrpasquini!)
+* Added :noopener HTML scrubber (Thanks, [@tastycode](https://github.com/tastycode)!)
+* Support `data` URIs with the following media types: text/plain, text/css, image/png, image/gif, image/jpeg, image/svg+xml. [#101](https://github.com/flavorjones/loofah/issues/101), [#120](https://github.com/flavorjones/loofah/issues/120). (Thanks, [@mrpasquini](https://github.com/mrpasquini)!)
 
 
 ### Bugfixes:
 
-* The :unprintable scrubber now scrubs unprintable characters in CDATA nodes (like `<script>`). #124
-* Allow negative values in CSS properties. Restores functionality that was reverted in v2.0.3. #91
+* The :unprintable scrubber now scrubs unprintable characters in CDATA nodes (like `<script>`). [#124](https://github.com/flavorjones/loofah/issues/124)
+* Allow negative values in CSS properties. Restores functionality that was reverted in v2.0.3. [#91](https://github.com/flavorjones/loofah/issues/91)
 
 
 ## 2.0.3 / 2015-08-17
 
 ### Bug fixes:
 
-* Revert support for negative values in CSS properties due to slow performance. #90 (Related to #85.)
+* Revert support for negative values in CSS properties due to slow performance. [#90](https://github.com/flavorjones/loofah/issues/90) (Related to [#85](https://github.com/flavorjones/loofah/issues/85).)
 
 
 ## 2.0.2 / 2015-05-05
 
 ### Bug fixes:
 
-* Fix error with `#to_text` when Loofah::Helpers hadn't been required. #75
-* Allow multi-word data attributes. #84 (Thanks, @jstorimer!)
-* Allow negative values in CSS properties. #85 (Thanks, @siddhartham!)
+* Fix error with `#to_text` when Loofah::Helpers hadn't been required. [#75](https://github.com/flavorjones/loofah/issues/75)
+* Allow multi-word data attributes. [#84](https://github.com/flavorjones/loofah/issues/84) (Thanks, [@jstorimer](https://github.com/jstorimer)!)
+* Allow negative values in CSS properties. [#85](https://github.com/flavorjones/loofah/issues/85) (Thanks, [@siddhartham](https://github.com/siddhartham)!)
 
 
 ## 2.0.1 / 2014-08-21
 
 ### Bug fixes:
 
-* Load RR correctly when running test files directly. (Thanks, @ktdreyer!)
+* Load RR correctly when running test files directly. (Thanks, [@ktdreyer](https://github.com/ktdreyer)!)
 
 
 ### Notes:
 
-* Extracted HTML5::Scrub#scrub_css_attribute to accommodate the Rails integration work. (Thanks, @kaspth!)
+* Extracted HTML5::Scrub#scrub_css_attribute to accommodate the Rails integration work. (Thanks, [@kaspth](https://github.com/kaspth)!)
 
 
 ## 2.0.0 / 2014-05-09
@@ -118,19 +308,19 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
 * tags: `article`, `aside`, `bdi`, `bdo`, `canvas`, `command`, `datalist`, `details`, `figcaption`, `figure`, `footer`, `header`, `mark`, `meter`, `nav`, `output`, `section`, `summary`, `time`
 * attributes: `data-*` (Thanks, Rafael Franca!)
 * URI attributes: `poster` and `preload`
-* Addition of the `:unprintable` scrubber to remove unprintable characters from text nodes. #65 (Thanks, Matt Swanson!)
-* `Loofah.fragment` accepts an optional encoding argument, compatible with `Nokogiri::HTML::DocumentFragment.parse`. #62 (Thanks, Ben Atkins!)
+* Addition of the `:unprintable` scrubber to remove unprintable characters from text nodes. [#65](https://github.com/flavorjones/loofah/issues/65) (Thanks, Matt Swanson!)
+* `Loofah.fragment` accepts an optional encoding argument, compatible with `Nokogiri::HTML::DocumentFragment.parse`. [#62](https://github.com/flavorjones/loofah/issues/62) (Thanks, Ben Atkins!)
 * HTML5 sanitizers now remove attributes without values. (Thanks, Kasper Timm Hansen!)
 
 ### Bug fixes:
 
 * HTML5 sanitizers' CSS keyword check now actually works (broken in v2.0). Additional regression tests added. (Thanks, Kasper Timm Hansen!)
-* HTML5 sanitizers now allow negative arguments to CSS. #64 (Thanks, Jon Calhoun!)
+* HTML5 sanitizers now allow negative arguments to CSS. [#64](https://github.com/flavorjones/loofah/issues/64) (Thanks, Jon Calhoun!)
 
 
 ## 1.2.1 (2012-04-14)
 
-* Declaring encoding in html5/scrub.rb. Without this, use of the ruby -KU option would cause havoc. (#32)
+* Declaring encoding in html5/scrub.rb. Without this, use of the ruby -KU option would cause havoc. ([#32](https://github.com/flavorjones/loofah/issues/32))
 
 
 ## 1.2.0 (2011-08-08)
@@ -148,7 +338,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
 * Additional HTML5lib whitelist elements (from html5lib 1524:80b5efe26230).
 Up to date with HTML5lib ruby code as of 1723:7ee6a0331856.
 * Whitelists (which are not part of the public API) are now Sets (were previously Arrays).
-* Don't explode when encountering UTF-8 URIs. (#25, #29)
+* Don't explode when encountering UTF-8 URIs. ([#25](https://github.com/flavorjones/loofah/issues/25), [#29](https://github.com/flavorjones/loofah/issues/29))
 
 
 ## 1.0.0 (2010-10-26)
@@ -166,7 +356,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
 * New methods Loofah::HTML::Document#to_text and
 Loofah::HTML::DocumentFragment#to_text do the right thing with
 whitespace. Note that these methods are significantly slower than
-#text. GH #12
+#text. GH [#12](https://github.com/flavorjones/loofah/issues/12)
 * Loofah::Elements::BLOCK_LEVEL contains a canonical list of HTML4 block-level4 elements.
 * Loofah::HTML::Document#text and Loofah::HTML::DocumentFragment#text
 will return unescaped HTML entities by passing :encode_special_chars => false.
@@ -180,7 +370,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
 
 ### Bug fixes:
 
-* Loofah::XssFoliate was not properly escaping HTML entities when implicitly scrubbing a string attribute. GH #17
+* Loofah::XssFoliate was not properly escaping HTML entities when implicitly scrubbing a string attribute. GH [#17](https://github.com/flavorjones/loofah/issues/17)
 
 
 ## 0.4.3 (2010-01-29)
@@ -208,7 +398,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
 
 ### Bug fixes:
 
-* Supporting Rails apps that aren't loading ActiveRecord. GH #10
+* Supporting Rails apps that aren't loading ActiveRecord. GH [#10](https://github.com/flavorjones/loofah/issues/10)
 
 ### Miscellaneous:
 
@@ -269,13 +459,13 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
 ### Enhancements:
 
 * when loaded in a Rails app, automatically extend ActiveRecord::Base
-with html_fragment and html_document. GH #6 (Thanks Josh Nichols!)
+with html_fragment and html_document. GH [#6](https://github.com/flavorjones/loofah/issues/6) (Thanks Josh Nichols!)
 
 ### Bugfixes:
 
 * ActiveRecord scrubbing should generate strings instead of Document or
-DocumentFragment objects. GH #5
-* init.rb fixed to support installation as a Rails plugin. GH #6
+DocumentFragment objects. GH [#5](https://github.com/flavorjones/loofah/issues/5)
+* init.rb fixed to support installation as a Rails plugin. GH [#6](https://github.com/flavorjones/loofah/issues/6)
 (Thanks Josh Nichols!)
 
 
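The 2.3.0 changelog entry in this diff deprecates the `WhiteList` names in favor of `SafeList`. As a rough illustration of how a rename like that can keep old callers working, here is a minimal sketch built on Ruby's `Module#deprecate_constant`. The `Sanitizer` module and its protocol list are hypothetical; this diff does not show how Loofah itself wires up the alias, so treat this as one plausible approach rather than the library's implementation.

```ruby
require "set"

# Hypothetical module, for illustration only (not Loofah's real code).
module Sanitizer
  module SafeList
    # Abbreviated, made-up protocol list.
    ALLOWED_PROTOCOLS = Set.new(%w[http https mailto tel sms])
  end

  # Keep the old name as an alias so existing callers keep working...
  WhiteList = SafeList
  # ...and ask Ruby to flag references to the old name as deprecated.
  deprecate_constant :WhiteList
end

# New code uses the new name.
Sanitizer::SafeList::ALLOWED_PROTOCOLS.include?("sms")

# Old code still resolves to the same module. Depending on the Ruby
# version and warning settings, this may print a deprecation warning.
Sanitizer::WhiteList.equal?(Sanitizer::SafeList)
```

Aliasing instead of copying means both names point at one module, so later additions to the safelist are visible under either name.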
data/README.md
CHANGED
@@ -1,36 +1,27 @@
 # Loofah
 
 * https://github.com/flavorjones/loofah
-* Docs: http://rubydoc.info/github/flavorjones/loofah/
+* Docs: http://rubydoc.info/github/flavorjones/loofah/main/frames
 * Mailing list: [loofah-talk@googlegroups.com](https://groups.google.com/forum/#!forum/loofah-talk)
 
 ## Status
 
-
-
-| Concourse | [![Concourse CI](https://ci.nokogiri.org/api/v1/teams/nokogiri-core/pipelines/loofah/jobs/ruby-2.5/badge)](https://ci.nokogiri.org/teams/nokogiri-core/pipelines/loofah?groups=master) |
-| Code Climate | [![Code Climate](https://codeclimate.com/github/flavorjones/loofah.svg)](https://codeclimate.com/github/flavorjones/loofah) |
-| Version Eye | [![Version Eye](https://www.versioneye.com/ruby/loofah/badge.png)](https://www.versioneye.com/ruby/loofah) |
+[![ci](https://github.com/flavorjones/loofah/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/flavorjones/loofah/actions/workflows/ci.yml)
+[![Tidelift dependencies](https://tidelift.com/badges/package/rubygems/loofah)](https://tidelift.com/subscription/pkg/rubygems-loofah?utm_source=rubygems-loofah&utm_medium=referral&utm_campaign=readme)
 
 
 ## Description
 
-Loofah is a general library for manipulating and transforming HTML/XML
-documents and fragments. It's built on top of Nokogiri and libxml2, so
-it's fast and has a nice API.
+Loofah is a general library for manipulating and transforming HTML/XML documents and fragments, built on top of Nokogiri.
 
-Loofah excels at HTML sanitization (XSS prevention). It includes some
-nice HTML sanitizers, which are based on HTML5lib's whitelist, so it
-most likely won't make your codes less secure. (These statements have
-not been evaluated by Netexperts.)
+Loofah excels at HTML sanitization (XSS prevention). It includes some nice HTML sanitizers, which are based on HTML5lib's safelist, so it most likely won't make your codes less secure. (These statements have not been evaluated by Netexperts.)
 
-ActiveRecord extensions for sanitization are available in the
-[`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).
+ActiveRecord extensions for sanitization are available in the [`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).
 
 
 ## Features
 
-* Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's
+* Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's safelists).
 * Common HTML sanitizing tasks are built-in:
   * _Strip_ unsafe tags, leaving behind only the inner text.
   * _Prune_ unsafe tags and their subtrees, removing all traces that they ever existed.
@@ -142,13 +133,12 @@ and `text` to return plain text:
 doc.text # => "ohai! div is safe "
 ```
 
-Also, `to_text` is available, which does the right thing with
-whitespace around block-level elements.
+Also, `to_text` is available, which does the right thing with whitespace around block-level and line break elements.
 
 ``` ruby
-doc = Loofah.fragment("<h1>Title</h1><div>Content</div>")
-doc.text # => "
-doc.to_text # => "\nTitle\n\nContent\n"
+doc = Loofah.fragment("<h1>Title</h1><div>Content<br>Next line</div>")
+doc.text # => "TitleContentNext line" # probably not what you want
+doc.to_text # => "\nTitle\n\nContent\nNext line\n" # better
 ```
 
 ### Loofah::XML::Document and Loofah::XML::DocumentFragment
@@ -219,10 +209,10 @@ end
 Loofah.xml_document(File.read('plague.xml')).scrub!(bring_out_your_dead)
 ```
 
-
+### Built-In HTML Scrubbers
 
 Loofah comes with a set of sanitizing scrubbers that use HTML5lib's
-
+safelist algorithm:
 
 ``` ruby
 doc.scrub!(:strip) # replaces unknown/unsafe tags with their inner text
@@ -308,6 +298,10 @@ And the mailing list is on Google Groups:
 
 And the IRC channel is \#loofah on freenode.
 
+Consider subscribing to [Tidelift][tidelift] which provides license assurances and timely security notifications for your open source dependencies, including Loofah. [Tidelift][tidelift] subscriptions also help the Loofah maintainers fund our [automated testing](https://ci.nokogiri.org) which in turn allows us to ship releases, bugfixes, and security updates more often.
+
+[tidelift]: https://tidelift.com/subscription/pkg/rubygems-loofah?utm_source=undefined&utm_medium=referral&utm_campaign=enterprise
+
 
 ## Security
 
@@ -354,7 +348,7 @@ And a big shout-out to Corey Innis for the name, and feedback on the API.
 
 ## Thank You
 
-The following people have generously
+The following people have generously funded Loofah:
 
 * Bill Harding
 
data/lib/loofah/elements.rb
CHANGED
@@ -1,91 +1,95 @@
-
+# frozen_string_literal: true
+require "set"
 
 module Loofah
   module Elements
     STRICT_BLOCK_LEVEL_HTML4 = Set.new %w[
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+      address
+      blockquote
+      center
+      dir
+      div
+      dl
+      fieldset
+      form
+      h1
+      h2
+      h3
+      h4
+      h5
+      h6
+      hr
+      isindex
+      menu
+      noframes
+      noscript
+      ol
+      p
+      pre
+      table
+      ul
+    ]
 
     # https://developer.mozilla.org/en-US/docs/Web/HTML/Block-level_elements
     STRICT_BLOCK_LEVEL_HTML5 = Set.new %w[
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-    STRICT_BLOCK_LEVEL = STRICT_BLOCK_LEVEL_HTML4 + STRICT_BLOCK_LEVEL_HTML5
+      address
+      article
+      aside
+      blockquote
+      canvas
+      dd
+      div
+      dl
+      dt
+      fieldset
+      figcaption
+      figure
+      footer
+      form
+      h1
+      h2
+      h3
+      h4
+      h5
+      h6
+      header
+      hgroup
+      hr
+      li
+      main
+      nav
+      noscript
+      ol
+      output
+      p
+      pre
+      section
+      table
+      tfoot
+      ul
+      video
+    ]
 
     # The following elements may also be considered block-level
     # elements since they may contain block-level elements
     LOOSE_BLOCK_LEVEL = Set.new %w[dd
-
-
-
-
-
-
-
-
-
-
+      dt
+      frameset
+      li
+      tbody
+      td
+      tfoot
+      th
+      thead
+      tr
+    ]
 
+    # Elements that aren't block but should generate a newline in #to_text
+    INLINE_LINE_BREAK = Set.new(["br"])
+
+    STRICT_BLOCK_LEVEL = STRICT_BLOCK_LEVEL_HTML4 + STRICT_BLOCK_LEVEL_HTML5
     BLOCK_LEVEL = STRICT_BLOCK_LEVEL + LOOSE_BLOCK_LEVEL
+    LINEBREAKERS = BLOCK_LEVEL + INLINE_LINE_BREAK
   end
 
   ::Loofah::MetaHelpers.add_downcased_set_members_to_all_set_constants ::Loofah::Elements
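The `elements.rb` change above builds `LINEBREAKERS` as the union of the block-level sets and the new `INLINE_LINE_BREAK` set. The sketch below shows that composition in isolation, using abbreviated stand-in sets. The `ElementsSketch` module and the `linebreaker?` helper are hypothetical names introduced for illustration; how `#to_text` actually consults `LINEBREAKERS` is not visible in this part of the diff.

```ruby
require "set"

# Hypothetical, abbreviated stand-ins for the constants in Loofah::Elements.
module ElementsSketch
  BLOCK_LEVEL = Set.new(%w[div h1 p])
  INLINE_LINE_BREAK = Set.new(["br"])

  # Set#+ is a union, so LINEBREAKERS holds every member of both sets.
  LINEBREAKERS = BLOCK_LEVEL + INLINE_LINE_BREAK

  # Hypothetical helper: should this tag produce a newline in plain text?
  def self.linebreaker?(tag_name)
    LINEBREAKERS.include?(tag_name.to_s.downcase)
  end
end

ElementsSketch.linebreaker?("br")   # true: inline, but it breaks the line
ElementsSketch.linebreaker?("span") # false
```

Keeping `br` in its own set leaves `BLOCK_LEVEL` unchanged for callers that only care about block-level elements, while text extraction can use the combined set.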