loofah 2.3.1 → 2.8.0

Files changed (42)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +91 -40
  3. data/README.md +7 -4
  4. data/lib/loofah.rb +33 -16
  5. data/lib/loofah/elements.rb +74 -73
  6. data/lib/loofah/helpers.rb +5 -4
  7. data/lib/loofah/html/document.rb +1 -0
  8. data/lib/loofah/html/document_fragment.rb +4 -2
  9. data/lib/loofah/html5/libxml2_workarounds.rb +8 -7
  10. data/lib/loofah/html5/safelist.rb +23 -0
  11. data/lib/loofah/html5/scrub.rb +21 -21
  12. data/lib/loofah/instance_methods.rb +5 -3
  13. data/lib/loofah/metahelpers.rb +2 -1
  14. data/lib/loofah/scrubber.rb +8 -7
  15. data/lib/loofah/scrubbers.rb +11 -10
  16. data/lib/loofah/version.rb +5 -0
  17. data/lib/loofah/xml/document.rb +1 -0
  18. data/lib/loofah/xml/document_fragment.rb +2 -1
  19. metadata +27 -93
  20. data/.gemtest +0 -0
  21. data/Gemfile +0 -22
  22. data/Manifest.txt +0 -41
  23. data/Rakefile +0 -81
  24. data/benchmark/benchmark.rb +0 -149
  25. data/benchmark/fragment.html +0 -96
  26. data/benchmark/helper.rb +0 -73
  27. data/benchmark/www.slashdot.com.html +0 -2560
  28. data/test/assets/msword.html +0 -63
  29. data/test/assets/testdata_sanitizer_tests1.dat +0 -502
  30. data/test/helper.rb +0 -18
  31. data/test/html5/test_sanitizer.rb +0 -401
  32. data/test/html5/test_scrub.rb +0 -10
  33. data/test/integration/test_ad_hoc.rb +0 -220
  34. data/test/integration/test_helpers.rb +0 -43
  35. data/test/integration/test_html.rb +0 -72
  36. data/test/integration/test_scrubbers.rb +0 -400
  37. data/test/integration/test_xml.rb +0 -55
  38. data/test/unit/test_api.rb +0 -142
  39. data/test/unit/test_encoding.rb +0 -20
  40. data/test/unit/test_helpers.rb +0 -62
  41. data/test/unit/test_scrubber.rb +0 -229
  42. data/test/unit/test_scrubbers.rb +0 -14
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 1196afab25d29644d1961e4516ac317a2c38dee3295f35354c468e6a9318fa55
4
- data.tar.gz: 2e07ff641edb37d2b0dce2933288da4667d4b680a586912af9c171db7dfb0a63
3
+ metadata.gz: af120a1d5829e0f0a9676dddd8b9b112a432c6f05c65b2522d8d1aafe8bde311
4
+ data.tar.gz: a19cfbdb4c3751332d471478718ae384dde5ef970ac482f5f0fb2a027561c0d6
5
5
  SHA512:
6
- metadata.gz: 37ac2cdb0d136da417cff62e3845c5b71769f044d8150c636a549dc9ca4cf98bcef4c6d2b6e653eff56922b95d812ed39310a406c49366c14791456ca905e8fe
7
- data.tar.gz: 0fa3cdd75a3d2950801a1cfe7f8d4cad6bb73bbec67d24ba25980c09a565f6c95c5d664c1789ccd62486d1917c685a5b0f762cc073a054bbb0f02fb0222688f0
6
+ metadata.gz: 93bbb41db6d1edd130d6c83fba87e70c145ec01f57120b406096ae56e7993f56803e04d40ee50faaf2f48fb3a2f6d704e5659923d8e5c04f62f6989591e37fa4
7
+ data.tar.gz: 8d7fd16c9ba849ae552c22bc37795efc1b1382d8ef83816ad2f66a868d7e9628562e7581af67a5ab7c5ab50ff7da26ebbb13b7b38099da72859583cd5ef1aa3b
data/CHANGELOG.md CHANGED
@@ -1,29 +1,80 @@
1
1
  # Changelog
2
2
 
3
+ ### 2.8.0 / 2020-11-25
4
+
5
+ * Allow CSS properties `order`, `flex-direction`, `flex-grow`, `flex-wrap`, `flex-shrink`, `flex-flow`, `flex-basis`, `flex`, `justify-content`, `align-self`, `align-items`, and `align-content`. [[#197](https://github.com/flavorjones/loofah/issues/197)] (Thanks, [@miguelperez](https://github.com/miguelperez)!)
6
+
7
+
8
+ ## 2.7.0 / 2020-08-26
9
+
10
+ ### Features
11
+
12
+ * Allow CSS properties `page-break-before`, `page-break-inside`, and `page-break-after`. [[#190](https://github.com/flavorjones/loofah/issues/190)] (Thanks, [@ahorek](https://github.com/ahorek)!)
13
+
14
+
15
+ ### Fixes
16
+
17
+ * Don't drop the `!important` rule from some CSS properties. [[#191](https://github.com/flavorjones/loofah/issues/191)] (Thanks, [@b7kich](https://github.com/b7kich)!)
18
+
19
+
20
+ ## 2.6.0 / 2020-06-16
21
+
22
+ ### Features
23
+
24
+ * Allow CSS `border-style` keywords. [[#188](https://github.com/flavorjones/loofah/issues/188)] (Thanks, [@tarcisiozf](https://github.com/tarcisiozf)!)
25
+
26
+
27
+ ## 2.5.0 / 2020-04-05
28
+
29
+ ### Features
30
+
31
+ * Allow more CSS length units: "ch", "vw", "vh", "Q", "lh", "vmin", "vmax". [[#178](https://github.com/flavorjones/loofah/issues/178)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas)!)
32
+
33
+
34
+ ### Fixes
35
+
36
+ * Remove comments from `Loofah::HTML::Document`s that exist outside the `html` element. [[#80](https://github.com/flavorjones/loofah/issues/80)]
37
+
38
+
39
+ ### Other changes
40
+
41
+ * Gem metadata being set [[#181](https://github.com/flavorjones/loofah/issues/181)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas)!)
42
+ * Test files removed from gem file [[#180](https://github.com/flavorjones/loofah/issues/180),[#166](https://github.com/flavorjones/loofah/issues/166),[#159](https://github.com/flavorjones/loofah/issues/159)] (Thanks, [@JuanitoFatas](https://github.com/JuanitoFatas) and [@greysteil](https://github.com/greysteil)!)
43
+
44
+
45
+ ## 2.4.0 / 2019-11-25
46
+
47
+ ### Features
48
+
49
+ * Allow CSS property `max-width` [[#175](https://github.com/flavorjones/loofah/issues/175)] (Thanks, [@bchaney](https://github.com/bchaney)!)
50
+ * Allow CSS sizes expressed in `rem` [[#176](https://github.com/flavorjones/loofah/issues/176), [#177](https://github.com/flavorjones/loofah/issues/177)]
51
+ * Add `frozen_string_literal: true` magic comment to all `lib` files. [[#118](https://github.com/flavorjones/loofah/issues/118)]
52
+
53
+
3
54
  ## 2.3.1 / 2019-10-22
4
55
 
5
56
  ### Security
6
57
 
7
58
  Address CVE-2019-15587: Unsanitized JavaScript may occur in sanitized output when a crafted SVG element is republished.
8
59
 
9
- This CVE's public notice is at https://github.com/flavorjones/loofah/issues/171
60
+ This CVE's public notice is at [#171](https://github.com/flavorjones/loofah/issues/171)
10
61
 
11
62
 
12
63
  ## 2.3.0 / 2019-09-28
13
64
 
14
65
  ### Features
15
66
 
16
- * Expand set of allowed protocols to include `tel:` and `line:`. [#104, #147]
17
- * Expand set of allowed CSS functions. [related to #122]
18
- * Allow greater precision in shorthand CSS values. [#149] (Thanks, @danfstucky!)
19
- * Allow CSS property `list-style` [#162] (Thanks, @jaredbeck!)
20
- * Allow CSS keywords `thick` and `thin` [#168] (Thanks, @georgeclaghorn!)
21
- * Allow HTML property `contenteditable` [#167] (Thanks, @andreynering!)
67
+ * Expand set of allowed protocols to include `tel:` and `line:`. [[#104](https://github.com/flavorjones/loofah/issues/104), [#147](https://github.com/flavorjones/loofah/issues/147)]
68
+ * Expand set of allowed CSS functions. [related to [#122](https://github.com/flavorjones/loofah/issues/122)]
69
+ * Allow greater precision in shorthand CSS values. [[#149](https://github.com/flavorjones/loofah/issues/149)] (Thanks, [@danfstucky](https://github.com/danfstucky)!)
70
+ * Allow CSS property `list-style` [[#162](https://github.com/flavorjones/loofah/issues/162)] (Thanks, [@jaredbeck](https://github.com/jaredbeck)!)
71
+ * Allow CSS keywords `thick` and `thin` [[#168](https://github.com/flavorjones/loofah/issues/168)] (Thanks, [@georgeclaghorn](https://github.com/georgeclaghorn)!)
72
+ * Allow HTML property `contenteditable` [[#167](https://github.com/flavorjones/loofah/issues/167)] (Thanks, [@andreynering](https://github.com/andreynering)!)
22
73
 
23
74
 
24
75
  ### Bug fixes
25
76
 
26
- * CSS hex values are no longer limited to lowercase hex. Previously uppercase hex were scrubbed. [#165] (Thanks, @asok!)
77
+ * CSS hex values are no longer limited to lowercase hex. Previously uppercase hex were scrubbed. [[#165](https://github.com/flavorjones/loofah/issues/165)] (Thanks, [@asok](https://github.com/asok)!)
27
78
 
28
79
 
29
80
  ### Deprecations / Name Changes
@@ -34,7 +85,7 @@ The following method and constants are hereby deprecated, and will be completely
34
85
  * Deprecate `Loofah::Helpers::ActionView::WhiteListSanitizer`, please use `Loofah::Helpers::ActionView::SafeListSanitizer` instead.
35
86
  * Deprecate `Loofah::HTML5::WhiteList`, please use `Loofah::HTML5::SafeList` instead.
36
87
 
37
- Thanks to @JuanitoFatas for submitting these changes in #164 and for making the language used in Loofah more inclusive.
88
+ Thanks to [@JuanitoFatas](https://github.com/JuanitoFatas) for submitting these changes in [#164](https://github.com/flavorjones/loofah/issues/164) and for making the language used in Loofah more inclusive.
38
89
 
39
90
 
40
91
  ## 2.2.3 / 2018-10-30
@@ -43,7 +94,7 @@ Thanks to @JuanitoFatas for submitting these changes in #164 and for making the
43
94
 
44
95
  Address CVE-2018-16468: Unsanitized JavaScript may occur in sanitized output when a crafted SVG element is republished.
45
96
 
46
- This CVE's public notice is at https://github.com/flavorjones/loofah/issues/154
97
+ This CVE's public notice is at [#154](https://github.com/flavorjones/loofah/issues/154)
47
98
 
48
99
 
49
100
  ## Meta / 2018-10-27
@@ -70,76 +121,76 @@ attribute scrubbers should they need to address CVE-2018-8048.
70
121
 
71
122
  Addresses CVE-2018-8048. Loofah allowed non-whitelisted attributes to be present in sanitized output when input with specially-crafted HTML fragments.
72
123
 
73
- This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
124
+ This CVE's public notice is at [#144](https://github.com/flavorjones/loofah/issues/144)
74
125
 
75
126
 
76
127
  ## 2.2.0 / 2018-02-11
77
128
 
78
129
  ### Features:
79
130
 
80
- * Support HTML5 `<main>` tag. #133 (Thanks, @MothOnMars!)
81
- * Recognize HTML5 block elements. #136 (Thanks, @MothOnMars!)
82
- * Support SVG `<symbol>` tag. #131 (Thanks, @baopham!)
83
- * Support for whitelisting CSS functions, initially just `calc` and `rgb`. #122/#123/#129 (Thanks, @NikoRoberts!)
84
- * Whitelist CSS property `list-style-type`. #68/#137/#142 (Thanks, @andela-ysanni and @NikoRoberts!)
131
+ * Support HTML5 `<main>` tag. [#133](https://github.com/flavorjones/loofah/issues/133) (Thanks, [@MothOnMars](https://github.com/MothOnMars)!)
132
+ * Recognize HTML5 block elements. [#136](https://github.com/flavorjones/loofah/issues/136) (Thanks, [@MothOnMars](https://github.com/MothOnMars)!)
133
+ * Support SVG `<symbol>` tag. [#131](https://github.com/flavorjones/loofah/issues/131) (Thanks, [@baopham](https://github.com/baopham)!)
134
+ * Support for whitelisting CSS functions, initially just `calc` and `rgb`. [#122](https://github.com/flavorjones/loofah/issues/122)/[#123](https://github.com/flavorjones/loofah/issues/123)/[#129](https://github.com/flavorjones/loofah/issues/129) (Thanks, [@NikoRoberts](https://github.com/NikoRoberts)!)
135
+ * Whitelist CSS property `list-style-type`. [#68](https://github.com/flavorjones/loofah/issues/68)/[#137](https://github.com/flavorjones/loofah/issues/137)/[#142](https://github.com/flavorjones/loofah/issues/142) (Thanks, [@andela-ysanni](https://github.com/andela-ysanni) and [@NikoRoberts](https://github.com/NikoRoberts)!)
85
136
 
86
137
  ### Bugfixes:
87
138
 
88
- * Properly handle nested `script` tags. #127.
139
+ * Properly handle nested `script` tags. [#127](https://github.com/flavorjones/loofah/issues/127).
89
140
 
90
141
 
91
142
  ## 2.1.1 / 2017-09-24
92
143
 
93
144
  ### Bugfixes:
94
145
 
95
- * Removed warning for unused variable. #124 (Thanks, @y-yagi!)
146
+ * Removed warning for unused variable. [#124](https://github.com/flavorjones/loofah/issues/124) (Thanks, [@y-yagi](https://github.com/y-yagi)!)
96
147
 
97
148
 
98
149
  ## 2.1.0 / 2017-09-24
99
150
 
100
151
  ### Notes:
101
152
 
102
- * Re-implemented CSS parsing and sanitization using the [crass](https://github.com/rgrove/crass) library. #91
153
+ * Re-implemented CSS parsing and sanitization using the [crass](https://github.com/rgrove/crass) library. [#91](https://github.com/flavorjones/loofah/issues/91)
103
154
 
104
155
 
105
156
  ### Features:
106
157
 
107
- * Added :noopener HTML scrubber (Thanks, @tastycode!)
108
- * Support `data` URIs with the following media types: text/plain, text/css, image/png, image/gif, image/jpeg, image/svg+xml. #101, #120. (Thanks, @mrpasquini!)
158
+ * Added :noopener HTML scrubber (Thanks, [@tastycode](https://github.com/tastycode)!)
159
+ * Support `data` URIs with the following media types: text/plain, text/css, image/png, image/gif, image/jpeg, image/svg+xml. [#101](https://github.com/flavorjones/loofah/issues/101), [#120](https://github.com/flavorjones/loofah/issues/120). (Thanks, [@mrpasquini](https://github.com/mrpasquini)!)
109
160
 
110
161
 
111
162
  ### Bugfixes:
112
163
 
113
- * The :unprintable scrubber now scrubs unprintable characters in CDATA nodes (like `<script>`). #124
114
- * Allow negative values in CSS properties. Restores functionality that was reverted in v2.0.3. #91
164
+ * The :unprintable scrubber now scrubs unprintable characters in CDATA nodes (like `<script>`). [#124](https://github.com/flavorjones/loofah/issues/124)
165
+ * Allow negative values in CSS properties. Restores functionality that was reverted in v2.0.3. [#91](https://github.com/flavorjones/loofah/issues/91)
115
166
 
116
167
 
117
168
  ## 2.0.3 / 2015-08-17
118
169
 
119
170
  ### Bug fixes:
120
171
 
121
- * Revert support for negative values in CSS properties due to slow performance. #90 (Related to #85.)
172
+ * Revert support for negative values in CSS properties due to slow performance. [#90](https://github.com/flavorjones/loofah/issues/90) (Related to [#85](https://github.com/flavorjones/loofah/issues/85).)
122
173
 
123
174
 
124
175
  ## 2.0.2 / 2015-05-05
125
176
 
126
177
  ### Bug fixes:
127
178
 
128
- * Fix error with `#to_text` when Loofah::Helpers hadn't been required. #75
129
- * Allow multi-word data attributes. #84 (Thanks, @jstorimer!)
130
- * Allow negative values in CSS properties. #85 (Thanks, @siddhartham!)
179
+ * Fix error with `#to_text` when Loofah::Helpers hadn't been required. [#75](https://github.com/flavorjones/loofah/issues/75)
180
+ * Allow multi-word data attributes. [#84](https://github.com/flavorjones/loofah/issues/84) (Thanks, [@jstorimer](https://github.com/jstorimer)!)
181
+ * Allow negative values in CSS properties. [#85](https://github.com/flavorjones/loofah/issues/85) (Thanks, [@siddhartham](https://github.com/siddhartham)!)
131
182
 
132
183
 
133
184
  ## 2.0.1 / 2014-08-21
134
185
 
135
186
  ### Bug fixes:
136
187
 
137
- * Load RR correctly when running test files directly. (Thanks, @ktdreyer!)
188
+ * Load RR correctly when running test files directly. (Thanks, [@ktdreyer](https://github.com/ktdreyer)!)
138
189
 
139
190
 
140
191
  ### Notes:
141
192
 
142
- * Extracted HTML5::Scrub#scrub_css_attribute to accommodate the Rails integration work. (Thanks, @kaspth!)
193
+ * Extracted HTML5::Scrub#scrub_css_attribute to accommodate the Rails integration work. (Thanks, [@kaspth](https://github.com/kaspth)!)
143
194
 
144
195
 
145
196
  ## 2.0.0 / 2014-05-09
@@ -155,19 +206,19 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
155
206
  * tags: `article`, `aside`, `bdi`, `bdo`, `canvas`, `command`, `datalist`, `details`, `figcaption`, `figure`, `footer`, `header`, `mark`, `meter`, `nav`, `output`, `section`, `summary`, `time`
156
207
  * attributes: `data-*` (Thanks, Rafael Franca!)
157
208
  * URI attributes: `poster` and `preload`
158
- * Addition of the `:unprintable` scrubber to remove unprintable characters from text nodes. #65 (Thanks, Matt Swanson!)
159
- * `Loofah.fragment` accepts an optional encoding argument, compatible with `Nokogiri::HTML::DocumentFragment.parse`. #62 (Thanks, Ben Atkins!)
209
+ * Addition of the `:unprintable` scrubber to remove unprintable characters from text nodes. [#65](https://github.com/flavorjones/loofah/issues/65) (Thanks, Matt Swanson!)
210
+ * `Loofah.fragment` accepts an optional encoding argument, compatible with `Nokogiri::HTML::DocumentFragment.parse`. [#62](https://github.com/flavorjones/loofah/issues/62) (Thanks, Ben Atkins!)
160
211
  * HTML5 sanitizers now remove attributes without values. (Thanks, Kasper Timm Hansen!)
161
212
 
162
213
  ### Bug fixes:
163
214
 
164
215
  * HTML5 sanitizers' CSS keyword check now actually works (broken in v2.0). Additional regression tests added. (Thanks, Kasper Timm Hansen!)
165
- * HTML5 sanitizers now allow negative arguments to CSS. #64 (Thanks, Jon Calhoun!)
216
+ * HTML5 sanitizers now allow negative arguments to CSS. [#64](https://github.com/flavorjones/loofah/issues/64) (Thanks, Jon Calhoun!)
166
217
 
167
218
 
168
219
  ## 1.2.1 (2012-04-14)
169
220
 
170
- * Declaring encoding in html5/scrub.rb. Without this, use of the ruby -KU option would cause havoc. (#32)
221
+ * Declaring encoding in html5/scrub.rb. Without this, use of the ruby -KU option would cause havoc. ([#32](https://github.com/flavorjones/loofah/issues/32))
171
222
 
172
223
 
173
224
  ## 1.2.0 (2011-08-08)
@@ -185,7 +236,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
185
236
  * Additional HTML5lib whitelist elements (from html5lib 1524:80b5efe26230).
186
237
  Up to date with HTML5lib ruby code as of 1723:7ee6a0331856.
187
238
  * Whitelists (which are not part of the public API) are now Sets (were previously Arrays).
188
- * Don't explode when encountering UTF-8 URIs. (#25, #29)
239
+ * Don't explode when encountering UTF-8 URIs. ([#25](https://github.com/flavorjones/loofah/issues/25), [#29](https://github.com/flavorjones/loofah/issues/29))
189
240
 
190
241
 
191
242
  ## 1.0.0 (2010-10-26)
@@ -203,7 +254,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
203
254
  * New methods Loofah::HTML::Document#to_text and
204
255
  Loofah::HTML::DocumentFragment#to_text do the right thing with
205
256
  whitespace. Note that these methods are significantly slower than
206
- #text. GH #12
257
+ #text. GH [#12](https://github.com/flavorjones/loofah/issues/12)
207
258
  * Loofah::Elements::BLOCK_LEVEL contains a canonical list of HTML4 block-level4 elements.
208
259
  * Loofah::HTML::Document#text and Loofah::HTML::DocumentFragment#text
209
260
  will return unescaped HTML entities by passing :encode_special_chars => false.
@@ -217,7 +268,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
217
268
 
218
269
  ### Bug fixes:
219
270
 
220
- * Loofah::XssFoliate was not properly escaping HTML entities when implicitly scrubbing a string attribute. GH #17
271
+ * Loofah::XssFoliate was not properly escaping HTML entities when implicitly scrubbing a string attribute. GH [#17](https://github.com/flavorjones/loofah/issues/17)
221
272
 
222
273
 
223
274
  ## 0.4.3 (2010-01-29)
@@ -245,7 +296,7 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
245
296
 
246
297
  ### Bug fixes:
247
298
 
248
- * Supporting Rails apps that aren't loading ActiveRecord. GH #10
299
+ * Supporting Rails apps that aren't loading ActiveRecord. GH [#10](https://github.com/flavorjones/loofah/issues/10)
249
300
 
250
301
  ### Miscellaneous:
251
302
 
@@ -306,13 +357,13 @@ This CVE's public notice is at https://github.com/flavorjones/loofah/issues/144
306
357
  ### Enhancements:
307
358
 
308
359
  * when loaded in a Rails app, automatically extend ActiveRecord::Base
309
- with html_fragment and html_document. GH #6 (Thanks Josh Nichols!)
360
+ with html_fragment and html_document. GH [#6](https://github.com/flavorjones/loofah/issues/6) (Thanks Josh Nichols!)
310
361
 
311
362
  ### Bugfixes:
312
363
 
313
364
  * ActiveRecord scrubbing should generate strings instead of Document or
314
- DocumentFragment objects. GH #5
315
- * init.rb fixed to support installation as a Rails plugin. GH #6
365
+ DocumentFragment objects. GH [#5](https://github.com/flavorjones/loofah/issues/5)
366
+ * init.rb fixed to support installation as a Rails plugin. GH [#6](https://github.com/flavorjones/loofah/issues/6)
316
367
  (Thanks Josh Nichols!)
317
368
 
318
369
 
data/README.md CHANGED
@@ -6,10 +6,9 @@
6
6
 
7
7
  ## Status
8
8
 
9
- |System|Status|
10
- |--|--|
11
- | Concourse CI | [![Concourse CI](https://ci.nokogiri.org/api/v1/teams/nokogiri-core/pipelines/loofah/jobs/ruby-2.5/badge)](https://ci.nokogiri.org/teams/nokogiri-core/pipelines/loofah?groups=master) |
12
- | Code Climate | [![Code Climate](https://codeclimate.com/github/flavorjones/loofah.svg)](https://codeclimate.com/github/flavorjones/loofah) |
9
+ [![Concourse CI](https://ci.nokogiri.org/api/v1/teams/nokogiri-core/pipelines/loofah/jobs/ruby-2.5/badge)](https://ci.nokogiri.org/teams/nokogiri-core/pipelines/loofah?groups=master)
10
+ [![Code Climate](https://codeclimate.com/github/flavorjones/loofah.svg)](https://codeclimate.com/github/flavorjones/loofah)
11
+ [![Tidelift dependencies](https://tidelift.com/badges/package/rubygems/loofah)](https://tidelift.com/subscription/pkg/rubygems-loofah?utm_source=rubygems-loofah&utm_medium=referral&utm_campaign=readme)
13
12
 
14
13
 
15
14
  ## Description
@@ -301,6 +300,10 @@ And the mailing list is on Google Groups:
301
300
 
302
301
  And the IRC channel is \#loofah on freenode.
303
302
 
303
+ Consider subscribing to [Tidelift][tidelift] which provides license assurances and timely security notifications for your open source dependencies, including Loofah. [Tidelift][tidelift] subscriptions also help the Loofah maintainers fund our [automated testing](https://ci.nokogiri.org) which in turn allows us to ship releases, bugfixes, and security updates more often.
304
+
305
+ [tidelift]: https://tidelift.com/subscription/pkg/rubygems-loofah?utm_source=undefined&utm_medium=referral&utm_campaign=enterprise
306
+
304
307
 
305
308
  ## Security
306
309
 
data/lib/loofah.rb CHANGED
@@ -1,22 +1,24 @@
1
+ # frozen_string_literal: true
1
2
  $LOAD_PATH.unshift(File.expand_path(File.dirname(__FILE__))) unless $LOAD_PATH.include?(File.expand_path(File.dirname(__FILE__)))
2
3
 
3
4
  require "nokogiri"
4
5
 
5
- require "loofah/metahelpers"
6
- require "loofah/elements"
6
+ require_relative "loofah/version"
7
+ require_relative "loofah/metahelpers"
8
+ require_relative "loofah/elements"
7
9
 
8
- require "loofah/html5/safelist"
9
- require "loofah/html5/libxml2_workarounds"
10
- require "loofah/html5/scrub"
10
+ require_relative "loofah/html5/safelist"
11
+ require_relative "loofah/html5/libxml2_workarounds"
12
+ require_relative "loofah/html5/scrub"
11
13
 
12
- require "loofah/scrubber"
13
- require "loofah/scrubbers"
14
+ require_relative "loofah/scrubber"
15
+ require_relative "loofah/scrubbers"
14
16
 
15
- require "loofah/instance_methods"
16
- require "loofah/xml/document"
17
- require "loofah/xml/document_fragment"
18
- require "loofah/html/document"
19
- require "loofah/html/document_fragment"
17
+ require_relative "loofah/instance_methods"
18
+ require_relative "loofah/xml/document"
19
+ require_relative "loofah/xml/document_fragment"
20
+ require_relative "loofah/html/document"
21
+ require_relative "loofah/html/document_fragment"
20
22
 
21
23
  # == Strings and IO Objects as Input
22
24
  #
@@ -27,14 +29,11 @@ require "loofah/html/document_fragment"
27
29
  # quantities of docs.
28
30
  #
29
31
  module Loofah
30
- # The version of Loofah you are using
31
- VERSION = "2.3.1"
32
-
33
32
  class << self
34
33
  # Shortcut for Loofah::HTML::Document.parse
35
34
  # This method accepts the same parameters as Nokogiri::HTML::Document.parse
36
35
  def document(*args, &block)
37
- Loofah::HTML::Document.parse(*args, &block)
36
+ remove_comments_before_html_element Loofah::HTML::Document.parse(*args, &block)
38
37
  end
39
38
 
40
39
  # Shortcut for Loofah::HTML::DocumentFragment.parse
@@ -79,5 +78,23 @@ module Loofah
79
78
  def remove_extraneous_whitespace(string)
80
79
  string.gsub(/\n\s*\n\s*\n/, "\n\n")
81
80
  end
81
+
82
+ private
83
+
84
+ # remove comments that exist outside of the HTML element.
85
+ #
86
+ # these comments are allowed by the HTML spec:
87
+ #
88
+ # https://www.w3.org/TR/html401/struct/global.html#h-7.1
89
+ #
90
+ # but are not scrubbed by Loofah because these nodes don't meet
91
+ # the contract that scrubbers expect of a node (e.g., it can be
92
+ # replaced, sibling and children nodes can be created).
93
+ def remove_comments_before_html_element(doc)
94
+ doc.children.each do |child|
95
+ child.unlink if child.comment?
96
+ end
97
+ doc
98
+ end
82
99
  end
83
100
  end
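The `remove_comments_before_html_element` helper added above unlinks comment nodes that are direct children of the document. Below is a minimal sketch of that pattern, not Loofah's actual code path. `SketchNode` and `SketchDocument` are hypothetical stand-ins that model only the three methods the helper calls (`children`, `comment?`, `unlink`), so the example runs without Nokogiri.

```ruby
# Hypothetical stand-in for a parsed node; only `comment?` and `unlink`
# are modeled, mirroring the calls made by the helper in this diff.
class SketchNode
  attr_reader :kind, :text
  attr_accessor :parent

  def initialize(kind, text = "")
    @kind = kind
    @text = text
  end

  def comment?
    kind == :comment
  end

  # Detach this node from its parent document.
  def unlink
    parent&.remove_child(self)
    self
  end
end

# Hypothetical stand-in for a document holding top-level nodes.
class SketchDocument
  def initialize(nodes)
    @nodes = nodes.dup
    @nodes.each { |node| node.parent = self }
  end

  # Return a copy so callers can unlink nodes while iterating.
  def children
    @nodes.dup
  end

  def remove_child(node)
    @nodes.delete(node)
    node.parent = nil
  end
end

# Same shape as the method added in this diff.
def remove_comments_before_html_element(doc)
  doc.children.each do |child|
    child.unlink if child.comment?
  end
  doc
end

doc = SketchDocument.new([
  SketchNode.new(:comment, "before html"),
  SketchNode.new(:element, "html"),
  SketchNode.new(:comment, "after html"),
])

cleaned = remove_comments_before_html_element(doc)
remaining_kinds = cleaned.children.map(&:kind)
```

As written, the loop inspects every direct child of the document, so any top-level comment is unlinked regardless of where it sits relative to the `html` element. The helper returns the document itself, which is why `Loofah.document` can wrap the `parse` call with it.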
data/lib/loofah/elements.rb CHANGED
@@ -1,89 +1,90 @@
1
- require 'set'
1
+ # frozen_string_literal: true
2
+ require "set"
2
3
 
3
4
  module Loofah
4
5
  module Elements
5
6
  STRICT_BLOCK_LEVEL_HTML4 = Set.new %w[
6
- address
7
- blockquote
8
- center
9
- dir
10
- div
11
- dl
12
- fieldset
13
- form
14
- h1
15
- h2
16
- h3
17
- h4
18
- h5
19
- h6
20
- hr
21
- isindex
22
- menu
23
- noframes
24
- noscript
25
- ol
26
- p
27
- pre
28
- table
29
- ul
30
- ]
7
+ address
8
+ blockquote
9
+ center
10
+ dir
11
+ div
12
+ dl
13
+ fieldset
14
+ form
15
+ h1
16
+ h2
17
+ h3
18
+ h4
19
+ h5
20
+ h6
21
+ hr
22
+ isindex
23
+ menu
24
+ noframes
25
+ noscript
26
+ ol
27
+ p
28
+ pre
29
+ table
30
+ ul
31
+ ]
31
32
 
32
33
  # https://developer.mozilla.org/en-US/docs/Web/HTML/Block-level_elements
33
34
  STRICT_BLOCK_LEVEL_HTML5 = Set.new %w[
34
- address
35
- article
36
- aside
37
- blockquote
38
- canvas
39
- dd
40
- div
41
- dl
42
- dt
43
- fieldset
44
- figcaption
45
- figure
46
- footer
47
- form
48
- h1
49
- h2
50
- h3
51
- h4
52
- h5
53
- h6
54
- header
55
- hgroup
56
- hr
57
- li
58
- main
59
- nav
60
- noscript
61
- ol
62
- output
63
- p
64
- pre
65
- section
66
- table
67
- tfoot
68
- ul
69
- video
70
- ]
35
+ address
36
+ article
37
+ aside
38
+ blockquote
39
+ canvas
40
+ dd
41
+ div
42
+ dl
43
+ dt
44
+ fieldset
45
+ figcaption
46
+ figure
47
+ footer
48
+ form
49
+ h1
50
+ h2
51
+ h3
52
+ h4
53
+ h5
54
+ h6
55
+ header
56
+ hgroup
57
+ hr
58
+ li
59
+ main
60
+ nav
61
+ noscript
62
+ ol
63
+ output
64
+ p
65
+ pre
66
+ section
67
+ table
68
+ tfoot
69
+ ul
70
+ video
71
+ ]
71
72
 
72
73
  STRICT_BLOCK_LEVEL = STRICT_BLOCK_LEVEL_HTML4 + STRICT_BLOCK_LEVEL_HTML5
73
74
 
74
75
  # The following elements may also be considered block-level
75
76
  # elements since they may contain block-level elements
76
77
  LOOSE_BLOCK_LEVEL = Set.new %w[dd
77
- dt
78
- frameset
79
- li
80
- tbody
81
- td
82
- tfoot
83
- th
84
- thead
85
- tr
86
- ]
78
+ dt
79
+ frameset
80
+ li
81
+ tbody
82
+ td
83
+ tfoot
84
+ th
85
+ thead
86
+ tr
87
+ ]
87
88
 
88
89
  BLOCK_LEVEL = STRICT_BLOCK_LEVEL + LOOSE_BLOCK_LEVEL
89
90
  end
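The constants in this file are combined with `Set#+`, which is set union, so tag names that appear in both the HTML4 and HTML5 lists are stored once. A minimal sketch of that behaviour using shortened, illustrative lists (the variable names and the trimmed contents are assumptions for the example, not the gem's constants):

```ruby
require "set"

# Shortened illustrative lists; the real constants contain many more tags.
html4_blocks = Set.new %w[address center div p]
html5_blocks = Set.new %w[address article div main p]

# Set#+ is an alias for Set#| (union), so shared names are not duplicated.
strict_blocks = html4_blocks + html5_blocks

# Membership checks are what the combined set is built for.
is_main_block = strict_blocks.include?("main")
is_span_block = strict_blocks.include?("span")
```

Here `strict_blocks` holds six names: `address`, `div`, and `p` appear in both inputs but only once in the result.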