loofah 2.2.3 → 2.19.1

Files changed (42)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +221 -31
  3. data/README.md +18 -24
  4. data/lib/loofah/elements.rb +79 -75
  5. data/lib/loofah/helpers.rb +18 -7
  6. data/lib/loofah/html/document.rb +1 -0
  7. data/lib/loofah/html/document_fragment.rb +4 -2
  8. data/lib/loofah/html5/libxml2_workarounds.rb +8 -7
  9. data/lib/loofah/html5/safelist.rb +1042 -0
  10. data/lib/loofah/html5/scrub.rb +150 -55
  11. data/lib/loofah/instance_methods.rb +14 -8
  12. data/lib/loofah/metahelpers.rb +2 -1
  13. data/lib/loofah/scrubber.rb +12 -7
  14. data/lib/loofah/scrubbers.rb +21 -19
  15. data/lib/loofah/version.rb +5 -0
  16. data/lib/loofah/xml/document.rb +1 -0
  17. data/lib/loofah/xml/document_fragment.rb +2 -1
  18. data/lib/loofah.rb +35 -18
  19. metadata +52 -138
  20. data/.gemtest +0 -0
  21. data/Gemfile +0 -22
  22. data/Manifest.txt +0 -40
  23. data/Rakefile +0 -79
  24. data/benchmark/benchmark.rb +0 -149
  25. data/benchmark/fragment.html +0 -96
  26. data/benchmark/helper.rb +0 -73
  27. data/benchmark/www.slashdot.com.html +0 -2560
  28. data/lib/loofah/html5/whitelist.rb +0 -186
  29. data/test/assets/msword.html +0 -63
  30. data/test/assets/testdata_sanitizer_tests1.dat +0 -502
  31. data/test/helper.rb +0 -18
  32. data/test/html5/test_sanitizer.rb +0 -382
  33. data/test/integration/test_ad_hoc.rb +0 -204
  34. data/test/integration/test_helpers.rb +0 -43
  35. data/test/integration/test_html.rb +0 -72
  36. data/test/integration/test_scrubbers.rb +0 -400
  37. data/test/integration/test_xml.rb +0 -55
  38. data/test/unit/test_api.rb +0 -142
  39. data/test/unit/test_encoding.rb +0 -20
  40. data/test/unit/test_helpers.rb +0 -62
  41. data/test/unit/test_scrubber.rb +0 -229
  42. data/test/unit/test_scrubbers.rb +0 -14
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: c22c1a749ff878b96f0c4a53e789834fa8072775c5abdccb68c388d6218b1bce
- data.tar.gz: e8d00e6ff5d623b3f3d03ce83ee780a88e92138fcb71efff28194f8a7d87e5fc
+ metadata.gz: bd3edb0acdf2359d82564aca0bc13710d9f6c49157963d18953ff55bd7c14413
+ data.tar.gz: 3a6e11b7deb9cfb469aaf6ec919062687bd4215ef11980bded72ca298807610c
  SHA512:
- metadata.gz: 0d5a0160010d61a51dad8e31bc644e03454311b99b1d71c6eaea5458cfaaa228671b82db52cf2369b42c48b636b912ca0d812191ac886a5c1499c44fc5221239
- data.tar.gz: ac479e283ef08b0df14938ec577a3aa4008d07ba3288232541928794cd0b9fe2512da88ac7fd2d123666dcad67d09c1a07307442610f61adbfd65f143ae339b5
+ metadata.gz: 4970a6aa72265f60556dd6fd254375c86d3f83be23f3bbcc8b04df00ce0e801e8ef9e67d0a77ca6a21915be89226131c16a7f3540f02538cc2b9a369950dfebf
+ data.tar.gz: 27e3a06cc391ec3d9e3c966efdb6b4ce58e98c397ec87490d418406c17757e5cb0193edabaced30a9f24320c729e6730308e346610859f9f7c6d5fcc6f72cd56
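As a rough illustration of what these digests are for: `checksums.yaml` records SHA256 and SHA512 digests of the `metadata.gz` and `data.tar.gz` members of the gem archive. Below is a minimal sketch of checking a file against such a digest with Ruby's standard `digest` library. The helper name `sha256_matches?` and the demo file are hypothetical, and the sketch assumes the member you want to check has already been extracted.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: true when the file's SHA256 hex digest equals the expected string.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Demo on a throwaway file standing in for an extracted data.tar.gz (assumption).
DEMO_FILE = Tempfile.new("checksum-demo")
DEMO_FILE.write("stand-in bytes")
DEMO_FILE.close

# In real use this value would come from the SHA256 section of checksums.yaml.
EXPECTED = Digest::SHA256.hexdigest("stand-in bytes")

puts sha256_matches?(DEMO_FILE.path, EXPECTED)  # digest recorded for these bytes
puts sha256_matches?(DEMO_FILE.path, "0" * 64)  # an unrelated digest
```

Comparing lowercase hex strings keeps the check independent of how the expected value was copied.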
data/CHANGELOG.md CHANGED
@@ -1,12 +1,202 @@
  # Changelog
 
+ ## 2.19.1 / 2022-12-13
+
+ ### Security
+
+ * Address CVE-2022-23514, inefficient regular expression complexity. See [GHSA-486f-hjj9-9vhh](https://github.com/flavorjones/loofah/security/advisories/GHSA-486f-hjj9-9vhh) for more information.
+ * Address CVE-2022-23515, improper neutralization of data URIs. See [GHSA-228g-948r-83gx](https://github.com/flavorjones/loofah/security/advisories/GHSA-228g-948r-83gx) for more information.
+ * Address CVE-2022-23516, uncontrolled recursion. See [GHSA-3x8r-x6xp-q4vm](https://github.com/flavorjones/loofah/security/advisories/GHSA-3x8r-x6xp-q4vm) for more information.
+
+
+ ## 2.19.0 / 2022-09-14
+
+ ### Features
+
+ * Allow SVG 1.0 color keyword names in CSS attributes. These colors are part of the [CSS Color Module Level 3](https://www.w3.org/TR/css-color-3/#svg-color) recommendation released 2022-01-18. [[#243](https://github.com/flavorjones/loofah/issues/243)]
+
+
+ ## 2.18.0 / 2022-05-11
+
+ ### Features
+
+ * Allow CSS property `aspect-ratio`. [[#236](https://github.com/flavorjones/loofah/issues/236)] (Thanks, [@louim](https://github.com/louim)!)
+
+
+ ## 2.17.0 / 2022-04-28
+
+ ### Features
+
+ * Allow ARIA attributes. [[#232](https://github.com/flavorjones/loofah/issues/232), [#233](https://github.com/flavorjones/loofah/issues/233)] (Thanks, [@nick-desteffen](https://github.com/nick-desteffen)!)
+
+
+ ## 2.16.0 / 2022-04-01
+
+ ### Features
+
+ * Allow MathML elements `menclose` and `ms`, and MathML attributes `dir`, `href`, `lquote`, `mathsize`, `notation`, and `rquote`. [[#231](https://github.com/flavorjones/loofah/issues/231)] (Thanks, [@nick-desteffen](https://github.com/nick-desteffen)!)
+
+
+ ## 2.15.0 / 2022-03-14
+
+ ### Features
+
+ * Expand set of allowed protocols to include `sms:`. [[#228](https://github.com/flavorjones/loofah/issues/228)] (Thanks, [@brendon](https://github.com/brendon)!)
+
+
+ ## 2.14.0 / 2022-02-11
+
+ ### Features
+
+ * The `#to_text` method on `Loofah::HTML::{Document,DocumentFragment}` replaces `<br>` line break elements with a newline. [[#225](https://github.com/flavorjones/loofah/issues/225)]
+
+
+ ## 2.13.0 / 2021-12-10
+
+ ### Bug fixes
+
+ * Loofah::HTML::DocumentFragment#text no longer serializes top-level comment children. [[#221](https://github.com/flavorjones/loofah/issues/221)]
+
+
+ ## 2.12.0 / 2021-08-11
+
+ ### Features
+
+ * Support empty HTML5 data attributes. [[#215](https://github.com/flavorjones/loofah/issues/215)]
+
+
+ ## 2.11.0 / 2021-07-31
+
+ ### Features
+
+ * Allow HTML5 element `wbr`.
+ * Allow all CSS property values for `border-collapse`. [[#201](https://github.com/flavorjones/loofah/issues/201)]
+
+
+ ### Changes
+
+ * Deprecating `Loofah::HTML5::SafeList::VOID_ELEMENTS` which is not a canonical list of void HTML4 or HTML5 elements.
+ * Removed some elements from `Loofah::HTML5::SafeList::VOID_ELEMENTS` that either are not acceptable elements or aren't considered "void" by libxml2.
+
+
+ ## 2.10.0 / 2021-06-06
+
+ ### Features
+
+ * Allow CSS properties `overflow-x` and `overflow-y`. [[#206](https://github.com/flavorjones/loofah/issues/206)] (Thanks, [@sampokuokkanen](https://github.com/sampokuokkanen)!)
+
+
+ ## 2.9.1 / 2021-04-07
+
+ ### Bug fixes
+
+ * Fix a regression in v2.9.0 which inappropriately removed CSS properties with quoted string values. [[#202](https://github.com/flavorjones/loofah/issues/202)]
+
+
+ ## 2.9.0 / 2021-01-14
+
+ ### Features
+
+ * Handle CSS functions in a CSS shorthand property (like `background`). [[#199](https://github.com/flavorjones/loofah/issues/199), [#200](https://github.com/flavorjones/loofah/issues/200)]
+
+
+ ## 2.8.0 / 2020-11-25
+
+ ### Features
+
+ * Allow CSS properties `order`, `flex-direction`, `flex-grow`, `flex-wrap`, `flex-shrink`, `flex-flow`, `flex-basis`, `flex`, `justify-content`, `align-self`, `align-items`, and `align-content`. [[#197](https://github.com/flavorjones/loofah/issues/197)] (Thanks, [@miguelperez](https://github.com/miguelperez)!)
+
+
+ ## 2.7.0 / 2020-08-26
+
+ ### Features
+
+ * Allow CSS properties `page-break-before`, `page-break-inside`, and `page-break-after`. [[#190](https://github.com/flavorjones/loofah/issues/190)] (Thanks, [@ahorek](https://github.com/ahorek)!)
+
+
+ ### Fixes
+
+ * Don't drop the `!important` rule from some CSS properties. [[#191](https://github.com/flavorjones/loofah/issues/191)] (Thanks, [@b7kich](https://github.com/b7kich)!)
+
+
+ ## 2.6.0 / 2020-06-16
+
+ ### Features
+
+ * Allow CSS `border-style` keywords. [[#188](https://github.com/flavorjones/loofah/issues/188)] (Thanks, [@tarcisiozf](https://github.com/tarcisiozf)!)
+
+
+ ## 2.5.0 / 2020-04-05
+
+ ### Features
+
+ * Allow more CSS length units: "ch", "vw", "vh", "Q", "lh", "vmin", "vmax". [[#178](https://github.com/flavorjones/loofah/issues/178)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas)!)
+
+
+ ### Fixes
+
+ * Remove comments from `Loofah::HTML::Document`s that exist outside the `html` element. [[#80](https://github.com/flavorjones/loofah/issues/80)]
+
+
+ ### Other changes
+
+ * Gem metadata being set [[#181](https://github.com/flavorjones/loofah/issues/181)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas)!)
+ * Test files removed from gem file [[#180](https://github.com/flavorjones/loofah/issues/180),[#166](https://github.com/flavorjones/loofah/issues/166),[#159](https://github.com/flavorjones/loofah/issues/159)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas) and [@greysteil](https://github.com/greysteil)!)
+
+
+ ## 2.4.0 / 2019-11-25
+
+ ### Features
+
+ * Allow CSS property `max-width` [[#175](https://github.com/flavorjones/loofah/issues/175)] (Thanks, [@bchaney](https://github.com/bchaney)!)
+ * Allow CSS sizes expressed in `rem` [[#176](https://github.com/flavorjones/loofah/issues/176), [#177](https://github.com/flavorjones/loofah/issues/177)]
+ * Add `frozen_string_literal: true` magic comment to all `lib` files. [[#118](https://github.com/flavorjones/loofah/issues/118)]
+
+
+ ## 2.3.1 / 2019-10-22
+
+ ### Security
+
+ Address CVE-2019-15587: Unsanitized JavaScript may occur in sanitized output when a crafted SVG element is republished.
+
+ This CVE's public notice is at [#171](https://github.com/flavorjones/loofah/issues/171)
+
+
+ ## 2.3.0 / 2019-09-28
+
+ ### Features
+
+ * Expand set of allowed protocols to include `tel:` and `line:`. [[#104](https://github.com/flavorjones/loofah/issues/104), [#147](https://github.com/flavorjones/loofah/issues/147)]
+ * Expand set of allowed CSS functions. [related to [#122](https://github.com/flavorjones/loofah/issues/122)]
+ * Allow greater precision in shorthand CSS values. [[#149](https://github.com/flavorjones/loofah/issues/149)] (Thanks, [@danfstucky](https://github.com/danfstucky)!)
+ * Allow CSS property `list-style` [[#162](https://github.com/flavorjones/loofah/issues/162)] (Thanks, [@jaredbeck](https://github.com/jaredbeck)!)
+ * Allow CSS keywords `thick` and `thin` [[#168](https://github.com/flavorjones/loofah/issues/168)] (Thanks, [@georgeclaghorn](https://github.com/georgeclaghorn)!)
+ * Allow HTML property `contenteditable` [[#167](https://github.com/flavorjones/loofah/issues/167)] (Thanks, [@andreynering](https://github.com/andreynering)!)
+
+
+ ### Bug fixes
+
+ * CSS hex values are no longer limited to lowercase hex. Previously uppercase hex were scrubbed. [[#165](https://github.com/flavorjones/loofah/issues/165)] (Thanks, [@asok](https://github.com/asok)!)
+
+
+ ### Deprecations / Name Changes
+
+ The following method and constants are hereby deprecated, and will be completely removed in a future release:
+
+ * Deprecate `Loofah::Helpers::ActionView.white_list_sanitizer`, please use `Loofah::Helpers::ActionView.safe_list_sanitizer` instead.
+ * Deprecate `Loofah::Helpers::ActionView::WhiteListSanitizer`, please use `Loofah::Helpers::ActionView::SafeListSanitizer` instead.
+ * Deprecate `Loofah::HTML5::WhiteList`, please use `Loofah::HTML5::SafeList` instead.
+
+ Thanks to [@JuanitoFatas](https://github.com/JuanitoFatas) for submitting these changes in [#164](https://github.com/flavorjones/loofah/issues/164) and for making the language used in Loofah more inclusive.
+
+
  ## 2.2.3 / 2018-10-30
 
  ### Security
 
  Address CVE-2018-16468: Unsanitized JavaScript may occur in sanitized output when a crafted SVG element is republished.
 
- This CVE's public notice is at https://github.com/flavorjones/loofah/issues/154
+ This CVE's public notice is at [#154](https://github.com/flavorjones/loofah/issues/154)
 
 
  ## Meta / 2018-10-27
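The 2.3.0 "Deprecations / Name Changes" entry in the hunk above keeps the old names available while steering callers to the new ones. One way to get that behaviour in plain Ruby is to alias the old constant to the new one and mark it with `Module#deprecate_constant`. The sketch below uses a hypothetical `Demo` module with a made-up `ALLOWED_PROTOCOLS` set; it is not taken from Loofah's source and only illustrates the general pattern.

```ruby
require "set"

module Demo
  # New, preferred name (hypothetical contents).
  module SafeList
    ALLOWED_PROTOCOLS = Set.new(%w[http https mailto tel sms])
  end

  # Old name kept as an alias so existing callers keep working.
  WhiteList = SafeList

  # Ruby reports a deprecation warning on access when deprecation warnings are enabled.
  deprecate_constant :WhiteList
end

puts Demo::SafeList::ALLOWED_PROTOCOLS.include?("sms")
puts Demo::WhiteList.equal?(Demo::SafeList)
```

Because both names point at the same module object, anything added to the new name is visible through the old one until the alias is removed.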
@@ -33,76 +223,76 @@ attribute scrubbers should they need to address CVE-2018-8048.
 
  Addresses CVE-2018-8048. Loofah allowed non-whitelisted attributes to be present in sanitized output when input with specially-crafted HTML fragments.
 
- This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
+ This CVE's public notice is at [#144](https://github.com/flavorjones/loofah/issues/144)
 
 
  ## 2.2.0 / 2018-02-11
 
  ### Features:
 
- * Support HTML5 `<main>` tag. #133 (Thanks, @MothOnMars!)
- * Recognize HTML5 block elements. #136 (Thanks, @MothOnMars!)
- * Support SVG `<symbol>` tag. #131 (Thanks, @baopham!)
- * Support for whitelisting CSS functions, initially just `calc` and `rgb`. #122/#123/#129 (Thanks, @NikoRoberts!)
- * Whitelist CSS property `list-style-type`. #68/#137/#142 (Thanks, @andela-ysanni and @NikoRoberts!)
+ * Support HTML5 `<main>` tag. [#133](https://github.com/flavorjones/loofah/issues/133) (Thanks, [@MothOnMars](https://github.com/MothOnMars)!)
+ * Recognize HTML5 block elements. [#136](https://github.com/flavorjones/loofah/issues/136) (Thanks, [@MothOnMars](https://github.com/MothOnMars)!)
+ * Support SVG `<symbol>` tag. [#131](https://github.com/flavorjones/loofah/issues/131) (Thanks, [@baopham](https://github.com/baopham)!)
+ * Support for whitelisting CSS functions, initially just `calc` and `rgb`. [#122](https://github.com/flavorjones/loofah/issues/122)/[#123](https://github.com/flavorjones/loofah/issues/123)/[#129](https://github.com/flavorjones/loofah/issues/129) (Thanks, [@NikoRoberts](https://github.com/NikoRoberts)!)
+ * Whitelist CSS property `list-style-type`. [#68](https://github.com/flavorjones/loofah/issues/68)/[#137](https://github.com/flavorjones/loofah/issues/137)/[#142](https://github.com/flavorjones/loofah/issues/142) (Thanks, [@andela-ysanni](https://github.com/andela-ysanni) and [@NikoRoberts](https://github.com/NikoRoberts)!)
 
  ### Bugfixes:
 
- * Properly handle nested `script` tags. #127.
+ * Properly handle nested `script` tags. [#127](https://github.com/flavorjones/loofah/issues/127).
 
 
  ## 2.1.1 / 2017-09-24
 
  ### Bugfixes:
 
- * Removed warning for unused variable. #124 (Thanks, @y-yagi!)
+ * Removed warning for unused variable. [#124](https://github.com/flavorjones/loofah/issues/124) (Thanks, [@y-yagi](https://github.com/y-yagi)!)
 
 
  ## 2.1.0 / 2017-09-24
 
  ### Notes:
 
- * Re-implemented CSS parsing and sanitization using the [crass](https://github.com/rgrove/crass) library. #91
+ * Re-implemented CSS parsing and sanitization using the [crass](https://github.com/rgrove/crass) library. [#91](https://github.com/flavorjones/loofah/issues/91)
 
 
  ### Features:
 
- * Added :noopener HTML scrubber (Thanks, @tastycode!)
- * Support `data` URIs with the following media types: text/plain, text/css, image/png, image/gif, image/jpeg, image/svg+xml. #101, #120. (Thanks, @mrpasquini!)
+ * Added :noopener HTML scrubber (Thanks, [@tastycode](https://github.com/tastycode)!)
+ * Support `data` URIs with the following media types: text/plain, text/css, image/png, image/gif, image/jpeg, image/svg+xml. [#101](https://github.com/flavorjones/loofah/issues/101), [#120](https://github.com/flavorjones/loofah/issues/120). (Thanks, [@mrpasquini](https://github.com/mrpasquini)!)
 
 
  ### Bugfixes:
 
- * The :unprintable scrubber now scrubs unprintable characters in CDATA nodes (like `<script>`). #124
- * Allow negative values in CSS properties. Restores functionality that was reverted in v2.0.3. #91
+ * The :unprintable scrubber now scrubs unprintable characters in CDATA nodes (like `<script>`). [#124](https://github.com/flavorjones/loofah/issues/124)
+ * Allow negative values in CSS properties. Restores functionality that was reverted in v2.0.3. [#91](https://github.com/flavorjones/loofah/issues/91)
 
 
  ## 2.0.3 / 2015-08-17
 
  ### Bug fixes:
 
- * Revert support for negative values in CSS properties due to slow performance. #90 (Related to #85.)
+ * Revert support for negative values in CSS properties due to slow performance. [#90](https://github.com/flavorjones/loofah/issues/90) (Related to [#85](https://github.com/flavorjones/loofah/issues/85).)
 
 
  ## 2.0.2 / 2015-05-05
 
  ### Bug fixes:
 
- * Fix error with `#to_text` when Loofah::Helpers hadn't been required. #75
- * Allow multi-word data attributes. #84 (Thanks, @jstorimer!)
- * Allow negative values in CSS properties. #85 (Thanks, @siddhartham!)
+ * Fix error with `#to_text` when Loofah::Helpers hadn't been required. [#75](https://github.com/flavorjones/loofah/issues/75)
+ * Allow multi-word data attributes. [#84](https://github.com/flavorjones/loofah/issues/84) (Thanks, [@jstorimer](https://github.com/jstorimer)!)
+ * Allow negative values in CSS properties. [#85](https://github.com/flavorjones/loofah/issues/85) (Thanks, [@siddhartham](https://github.com/siddhartham)!)
 
 
  ## 2.0.1 / 2014-08-21
 
  ### Bug fixes:
 
- * Load RR correctly when running test files directly. (Thanks, @ktdreyer!)
+ * Load RR correctly when running test files directly. (Thanks, [@ktdreyer](https://github.com/ktdreyer)!)
 
 
  ### Notes:
 
- * Extracted HTML5::Scrub#scrub_css_attribute to accommodate the Rails integration work. (Thanks, @kaspth!)
+ * Extracted HTML5::Scrub#scrub_css_attribute to accommodate the Rails integration work. (Thanks, [@kaspth](https://github.com/kaspth)!)
 
 
  ## 2.0.0 / 2014-05-09
@@ -118,19 +308,19 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
  * tags: `article`, `aside`, `bdi`, `bdo`, `canvas`, `command`, `datalist`, `details`, `figcaption`, `figure`, `footer`, `header`, `mark`, `meter`, `nav`, `output`, `section`, `summary`, `time`
  * attributes: `data-*` (Thanks, Rafael Franca!)
  * URI attributes: `poster` and `preload`
- * Addition of the `:unprintable` scrubber to remove unprintable characters from text nodes. #65 (Thanks, Matt Swanson!)
- * `Loofah.fragment` accepts an optional encoding argument, compatible with `Nokogiri::HTML::DocumentFragment.parse`. #62 (Thanks, Ben Atkins!)
+ * Addition of the `:unprintable` scrubber to remove unprintable characters from text nodes. [#65](https://github.com/flavorjones/loofah/issues/65) (Thanks, Matt Swanson!)
+ * `Loofah.fragment` accepts an optional encoding argument, compatible with `Nokogiri::HTML::DocumentFragment.parse`. [#62](https://github.com/flavorjones/loofah/issues/62) (Thanks, Ben Atkins!)
  * HTML5 sanitizers now remove attributes without values. (Thanks, Kasper Timm Hansen!)
 
  ### Bug fixes:
 
  * HTML5 sanitizers' CSS keyword check now actually works (broken in v2.0). Additional regression tests added. (Thanks, Kasper Timm Hansen!)
- * HTML5 sanitizers now allow negative arguments to CSS. #64 (Thanks, Jon Calhoun!)
+ * HTML5 sanitizers now allow negative arguments to CSS. [#64](https://github.com/flavorjones/loofah/issues/64) (Thanks, Jon Calhoun!)
 
 
  ## 1.2.1 (2012-04-14)
 
- * Declaring encoding in html5/scrub.rb. Without this, use of the ruby -KU option would cause havoc. (#32)
+ * Declaring encoding in html5/scrub.rb. Without this, use of the ruby -KU option would cause havoc. ([#32](https://github.com/flavorjones/loofah/issues/32))
 
 
  ## 1.2.0 (2011-08-08)
@@ -148,7 +338,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
  * Additional HTML5lib whitelist elements (from html5lib 1524:80b5efe26230).
  Up to date with HTML5lib ruby code as of 1723:7ee6a0331856.
  * Whitelists (which are not part of the public API) are now Sets (were previously Arrays).
- * Don't explode when encountering UTF-8 URIs. (#25, #29)
+ * Don't explode when encountering UTF-8 URIs. ([#25](https://github.com/flavorjones/loofah/issues/25), [#29](https://github.com/flavorjones/loofah/issues/29))
 
 
  ## 1.0.0 (2010-10-26)
@@ -166,7 +356,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
  * New methods Loofah::HTML::Document#to_text and
  Loofah::HTML::DocumentFragment#to_text do the right thing with
  whitespace. Note that these methods are significantly slower than
- #text. GH #12
+ #text. GH [#12](https://github.com/flavorjones/loofah/issues/12)
  * Loofah::Elements::BLOCK_LEVEL contains a canonical list of HTML4 block-level4 elements.
  * Loofah::HTML::Document#text and Loofah::HTML::DocumentFragment#text
  will return unescaped HTML entities by passing :encode_special_chars => false.
@@ -180,7 +370,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
 
  ### Bug fixes:
 
- * Loofah::XssFoliate was not properly escaping HTML entities when implicitly scrubbing a string attribute. GH #17
+ * Loofah::XssFoliate was not properly escaping HTML entities when implicitly scrubbing a string attribute. GH [#17](https://github.com/flavorjones/loofah/issues/17)
 
 
  ## 0.4.3 (2010-01-29)
@@ -208,7 +398,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
 
  ### Bug fixes:
 
- * Supporting Rails apps that aren't loading ActiveRecord. GH #10
+ * Supporting Rails apps that aren't loading ActiveRecord. GH [#10](https://github.com/flavorjones/loofah/issues/10)
 
  ### Miscellaneous:
 
@@ -269,13 +459,13 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
  ### Enhancements:
 
  * when loaded in a Rails app, automatically extend ActiveRecord::Base
- with html_fragment and html_document. GH #6 (Thanks Josh Nichols!)
+ with html_fragment and html_document. GH [#6](https://github.com/flavorjones/loofah/issues/6) (Thanks Josh Nichols!)
 
  ### Bugfixes:
 
  * ActiveRecord scrubbing should generate strings instead of Document or
- DocumentFragment objects. GH #5
- * init.rb fixed to support installation as a Rails plugin. GH #6
+ DocumentFragment objects. GH [#5](https://github.com/flavorjones/loofah/issues/5)
+ * init.rb fixed to support installation as a Rails plugin. GH [#6](https://github.com/flavorjones/loofah/issues/6)
  (Thanks Josh Nichols!)
 
 
data/README.md CHANGED
@@ -1,36 +1,27 @@
  # Loofah
 
  * https://github.com/flavorjones/loofah
- * Docs: http://rubydoc.info/github/flavorjones/loofah/master/frames
+ * Docs: http://rubydoc.info/github/flavorjones/loofah/main/frames
  * Mailing list: [loofah-talk@googlegroups.com](https://groups.google.com/forum/#!forum/loofah-talk)
 
  ## Status
 
- |System|Status|
- |--|--|
- | Concourse | [![Concourse CI](https://ci.nokogiri.org/api/v1/teams/nokogiri-core/pipelines/loofah/jobs/ruby-2.5/badge)](https://ci.nokogiri.org/teams/nokogiri-core/pipelines/loofah?groups=master) |
- | Code Climate | [![Code Climate](https://codeclimate.com/github/flavorjones/loofah.svg)](https://codeclimate.com/github/flavorjones/loofah) |
- | Version Eye | [![Version Eye](https://www.versioneye.com/ruby/loofah/badge.png)](https://www.versioneye.com/ruby/loofah) |
+ [![ci](https://github.com/flavorjones/loofah/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/flavorjones/loofah/actions/workflows/ci.yml)
+ [![Tidelift dependencies](https://tidelift.com/badges/package/rubygems/loofah)](https://tidelift.com/subscription/pkg/rubygems-loofah?utm_source=rubygems-loofah&utm_medium=referral&utm_campaign=readme)
 
 
  ## Description
 
- Loofah is a general library for manipulating and transforming HTML/XML
- documents and fragments. It's built on top of Nokogiri and libxml2, so
- it's fast and has a nice API.
+ Loofah is a general library for manipulating and transforming HTML/XML documents and fragments, built on top of Nokogiri.
 
- Loofah excels at HTML sanitization (XSS prevention). It includes some
- nice HTML sanitizers, which are based on HTML5lib's whitelist, so it
- most likely won't make your codes less secure. (These statements have
- not been evaluated by Netexperts.)
+ Loofah excels at HTML sanitization (XSS prevention). It includes some nice HTML sanitizers, which are based on HTML5lib's safelist, so it most likely won't make your codes less secure. (These statements have not been evaluated by Netexperts.)
 
- ActiveRecord extensions for sanitization are available in the
- [`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).
+ ActiveRecord extensions for sanitization are available in the [`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).
 
 
  ## Features
 
- * Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's whitelists).
+ * Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's safelists).
  * Common HTML sanitizing tasks are built-in:
  * _Strip_ unsafe tags, leaving behind only the inner text.
  * _Prune_ unsafe tags and their subtrees, removing all traces that they ever existed.
@@ -142,13 +133,12 @@ and `text` to return plain text:
  doc.text # => "ohai! div is safe "
  ```
 
- Also, `to_text` is available, which does the right thing with
- whitespace around block-level elements.
+ Also, `to_text` is available, which does the right thing with whitespace around block-level and line break elements.
 
  ``` ruby
- doc = Loofah.fragment("<h1>Title</h1><div>Content</div>")
- doc.text # => "TitleContent" # probably not what you want
- doc.to_text # => "\nTitle\n\nContent\n" # better
+ doc = Loofah.fragment("<h1>Title</h1><div>Content<br>Next line</div>")
+ doc.text # => "TitleContentNext line" # probably not what you want
+ doc.to_text # => "\nTitle\n\nContent\nNext line\n" # better
  ```
 
  ### Loofah::XML::Document and Loofah::XML::DocumentFragment
@@ -219,10 +209,10 @@ end
  Loofah.xml_document(File.read('plague.xml')).scrub!(bring_out_your_dead)
  ```
 
- === Built-In HTML Scrubbers
+ ### Built-In HTML Scrubbers
 
  Loofah comes with a set of sanitizing scrubbers that use HTML5lib's
- whitelist algorithm:
+ safelist algorithm:
 
  ``` ruby
  doc.scrub!(:strip) # replaces unknown/unsafe tags with their inner text
@@ -308,6 +298,10 @@ And the mailing list is on Google Groups:
 
  And the IRC channel is \#loofah on freenode.
 
+ Consider subscribing to [Tidelift][tidelift] which provides license assurances and timely security notifications for your open source dependencies, including Loofah. [Tidelift][tidelift] subscriptions also help the Loofah maintainers fund our [automated testing](https://ci.nokogiri.org) which in turn allows us to ship releases, bugfixes, and security updates more often.
+
+ [tidelift]: https://tidelift.com/subscription/pkg/rubygems-loofah?utm_source=undefined&utm_medium=referral&utm_campaign=enterprise
+
 
  ## Security
 
@@ -354,7 +348,7 @@ And a big shout-out to Corey Innis for the name, and feedback on the API.
 
  ## Thank You
 
- The following people have generously donated via the [Pledgie](http://pledgie.com) badge on the [Loofah github page](https://github.com/flavorjones/loofah):
+ The following people have generously funded Loofah:
 
  * Bill Harding
 
data/lib/loofah/elements.rb CHANGED
@@ -1,91 +1,95 @@
- require 'set'
+ # frozen_string_literal: true
+ require "set"
 
  module Loofah
  module Elements
  STRICT_BLOCK_LEVEL_HTML4 = Set.new %w[
- address
- blockquote
- center
- dir
- div
- dl
- fieldset
- form
- h1
- h2
- h3
- h4
- h5
- h6
- hr
- isindex
- menu
- noframes
- noscript
- ol
- p
- pre
- table
- ul
- ]
+ address
+ blockquote
+ center
+ dir
+ div
+ dl
+ fieldset
+ form
+ h1
+ h2
+ h3
+ h4
+ h5
+ h6
+ hr
+ isindex
+ menu
+ noframes
+ noscript
+ ol
+ p
+ pre
+ table
+ ul
+ ]
 
  # https://developer.mozilla.org/en-US/docs/Web/HTML/Block-level_elements
  STRICT_BLOCK_LEVEL_HTML5 = Set.new %w[
- address
- article
- aside
- blockquote
- canvas
- dd
- div
- dl
- dt
- fieldset
- figcaption
- figure
- footer
- form
- h1
- h2
- h3
- h4
- h5
- h6
- header
- hgroup
- hr
- li
- main
- nav
- noscript
- ol
- output
- p
- pre
- section
- table
- tfoot
- ul
- video
- ]
-
- STRICT_BLOCK_LEVEL = STRICT_BLOCK_LEVEL_HTML4 + STRICT_BLOCK_LEVEL_HTML5
+ address
+ article
+ aside
+ blockquote
+ canvas
+ dd
+ div
+ dl
+ dt
+ fieldset
+ figcaption
+ figure
+ footer
+ form
+ h1
+ h2
+ h3
+ h4
+ h5
+ h6
+ header
+ hgroup
+ hr
+ li
+ main
+ nav
+ noscript
+ ol
+ output
+ p
+ pre
+ section
+ table
+ tfoot
+ ul
+ video
+ ]
 
  # The following elements may also be considered block-level
  # elements since they may contain block-level elements
  LOOSE_BLOCK_LEVEL = Set.new %w[dd
- dt
- frameset
- li
- tbody
- td
- tfoot
- th
- thead
- tr
- ]
+ dt
+ frameset
+ li
+ tbody
+ td
+ tfoot
+ th
+ thead
+ tr
+ ]
 
+ # Elements that aren't block but should generate a newline in #to_text
+ INLINE_LINE_BREAK = Set.new(["br"])
+
+ STRICT_BLOCK_LEVEL = STRICT_BLOCK_LEVEL_HTML4 + STRICT_BLOCK_LEVEL_HTML5
  BLOCK_LEVEL = STRICT_BLOCK_LEVEL + LOOSE_BLOCK_LEVEL
+ LINEBREAKERS = BLOCK_LEVEL + INLINE_LINE_BREAK
  end
 
  ::Loofah::MetaHelpers.add_downcased_set_members_to_all_set_constants ::Loofah::Elements
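To make the new constants concrete: `Set#+` returns the union of two sets, so `LINEBREAKERS` contains every block-level element name plus `br`, while `BLOCK_LEVEL` itself is unchanged. The sketch below rebuilds that composition with deliberately tiny stand-in sets. The `Sketch` module and the `newline_for?` helper are hypothetical and not part of Loofah; they only suggest how a `#to_text`-style routine could decide where to emit a newline.

```ruby
require "set"

module Sketch
  # Tiny stand-ins for the much longer lists in the diff above (assumption).
  STRICT_BLOCK_LEVEL = Set.new(%w[div h1 p])
  LOOSE_BLOCK_LEVEL = Set.new(%w[li td])
  INLINE_LINE_BREAK = Set.new(["br"])

  # Set#+ is set union, mirroring how the diff composes its constants.
  BLOCK_LEVEL = STRICT_BLOCK_LEVEL + LOOSE_BLOCK_LEVEL
  LINEBREAKERS = BLOCK_LEVEL + INLINE_LINE_BREAK

  # Hypothetical helper: should this element name produce a newline in text output?
  def self.newline_for?(element_name)
    LINEBREAKERS.include?(element_name.downcase)
  end
end

puts Sketch.newline_for?("br")
puts Sketch.newline_for?("DIV")
puts Sketch.newline_for?("span")
puts Sketch::BLOCK_LEVEL.include?("br")
```

Keeping `br` in its own set means code that relies on `BLOCK_LEVEL` for block semantics is unaffected, and only consumers of `LINEBREAKERS` see the extra element.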