sanitize 4.6.4 → 6.0.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: ac9e4e6beb6025350578007019bf48458a64b82ef26cd11e4547aee35b72625c
- data.tar.gz: 000a0cd02b3a2690589f042f3d27fb0dd8fe34d150cc1ba073793dcfb2eb0a92
+ metadata.gz: 93adca1e155370d138ccb7c500b618e2ed218297d21593ec8937638f4d99731b
+ data.tar.gz: 740b6d84113a0945928601b6cad03e36b4ee76f7c3098c72ddb1a1b01ec5d0ec
  SHA512:
- metadata.gz: 127b1656fa575c9ba793db8d6026a37b240a884172b18b05bb2776c80b2cda6fc2982938e3a98272b46d5741b05c8c05f3238ebb61e439a9f7ded36615854c4d
- data.tar.gz: adb7c8b6bf29118b82094dd2cb7109dfaa6286ca57b73e98ec457b8eb433e91c76efd987fe2c6d3b6dac1ed554331f7d067240c4a96e7522d7122da5c5b925b1
+ metadata.gz: 4d3e9852ec92ac961c2e35d4a04e7d967dd2eac364e656837b93daf95c1b653da53d4ef7f104af83887e46d08237ddca1efa945facde3efbfcfce0164c0fe334
+ data.tar.gz: 05a56334e5cdbbee7b165b19245b90a8acdd82bcd72bbc9f84e2780d914f8b040d19d9ff71934b0c1bd71df4b55f407f460c76dffdbd275b183ecaffb2fa6c38
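The new checksums can be used to confirm that a downloaded archive matches the published 6.0.2 release. The following is a minimal sketch, assuming the `data.tar.gz` member has already been extracted from the `.gem` file into the current directory. The helper name `checksum_matches?` and the local path are hypothetical; only Ruby's standard `digest` library is used.

```ruby
require "digest"

# Hypothetical helper: true when `bytes` hash to the expected SHA256 hex digest.
def checksum_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes) == expected_hex.downcase
end

# SHA256 listed for data.tar.gz in the 6.0.2 checksums.yaml.
expected = "740b6d84113a0945928601b6cad03e36b4ee76f7c3098c72ddb1a1b01ec5d0ec"

# Assumed local path; the check only runs when the file is actually present.
path = "data.tar.gz"
puts checksum_matches?(File.binread(path), expected) if File.exist?(path)
```

Comparing against the SHA512 entry works the same way with `Digest::SHA512`.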
data/HISTORY.md CHANGED
@@ -1,5 +1,228 @@
  # Sanitize History
 
+ ## 6.0.2 (2023-07-06)
+
+ ### Bug Fixes
+
+ * CVE-2023-36823: Fixed an HTML+CSS sanitization bypass that could allow XSS
+ (cross-site scripting). This issue affects Sanitize versions 3.0.0 through
+ 6.0.1.
+
+ When using Sanitize's relaxed config or a custom config that allows `<style>`
+ elements and one or more CSS at-rules, carefully crafted input could be used
+ to sneak arbitrary HTML through Sanitize.
+
+ See the following security advisory for additional details:
+ [GHSA-f5ww-cq3m-q3g7](https://github.com/rgrove/sanitize/security/advisories/GHSA-f5ww-cq3m-q3g7)
+
+ Thanks to @cure53 for finding this issue.
+
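The affected range stated in the 6.0.2 entry (3.0.0 through 6.0.1) can be checked against an installed version string. This is a minimal sketch using RubyGems' own version classes; the helper name `affected_by_cve_2023_36823?` is hypothetical and the range is taken directly from the entry above.

```ruby
require "rubygems"

# Hypothetical helper: true when the given Sanitize version falls inside the
# range the 6.0.2 entry lists as affected (3.0.0 through 6.0.1).
def affected_by_cve_2023_36823?(version_string)
  affected = Gem::Requirement.new(">= 3.0.0", "<= 6.0.1")
  affected.satisfied_by?(Gem::Version.new(version_string))
end

puts affected_by_cve_2023_36823?("4.6.4") # true: the old side of this diff
puts affected_by_cve_2023_36823?("6.0.2") # false: the new side of this diff
```

Whether an affected version is actually exploitable still depends on the config in use, as the entry explains.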
+ ## 6.0.1 (2023-01-27)
+
+ ### Bug Fixes
+
+ * Sanitize now always removes `<noscript>` elements and their contents, even
+ when `noscript` is in the allowlist.
+
+ This fixes a sanitization bypass that could occur when `noscript` was allowed
+ by a custom allowlist. In this scenario, carefully crafted input could sneak
+ arbitrary HTML through Sanitize, potentially enabling an XSS (cross-site
+ scripting) attack.
+
+ Sanitize's default configs don't allow `<noscript>` elements and are not
+ vulnerable. This issue only affects users who are using a custom config that
+ adds `noscript` to the element allowlist.
+
+ The root cause of this issue is that HTML parsing rules treat the contents of
+ a `<noscript>` element differently depending on whether scripting is enabled
+ in the user agent. Nokogiri doesn't support scripting so it follows the
+ "scripting disabled" rules, but a web browser with scripting enabled will
+ follow the "scripting enabled" rules. This means that Sanitize can't reliably
+ make the contents of a `<noscript>` element safe for scripting enabled
+ browsers, so the safest thing to do is to remove the element and its contents
+ entirely.
+
+ See the following security advisory for additional details:
+ [GHSA-fw3g-2h3j-qmm7](https://github.com/rgrove/sanitize/security/advisories/GHSA-fw3g-2h3j-qmm7)
+
+ Thanks to David Klein from [TU Braunschweig](https://www.tu-braunschweig.de/en/ias)
+ (@leeN) for reporting this issue.
+
+ * Fixed an edge case in which the contents of an "unescaped text" element (such
+ as `<noembed>` or `<xmp>`) were not properly escaped if that element was
+ allowlisted and was also inside an allowlisted `<math>` or `<svg>` element.
+
+ The only way to encounter this situation was to ignore multiple warnings in
+ the readme and create a custom config that allowlisted all the elements
+ involved, including `<math>` or `<svg>`. If you're using a default config or
+ if you heeded the warnings about MathML and SVG not being supported, you're
+ not affected by this issue.
+
+ Please let this be a reminder that Sanitize cannot safely sanitize MathML or
+ SVG content and does not support this use case. The default configs don't
+ allow MathML or SVG elements, and allowlisting MathML or SVG elements in a
+ custom config may create a security vulnerability in your application.
+
+ Documentation has been updated to add more warnings and to make the existing
+ warnings about this more prominent.
+
+ Thanks to David Klein from [TU Braunschweig](https://www.tu-braunschweig.de/en/ias)
+ (@leeN) for reporting this issue.
+
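Both 6.0.1 fixes only matter for custom configs that allowlist `noscript`, `math`, or `svg`. The sketch below shows one way to audit such a config. It assumes the config is a plain Hash with an `:elements` array, which is the shape Sanitize configs use; the helper name `risky_allowlist_entries` is hypothetical and the code does not call Sanitize itself.

```ruby
# Hypothetical audit helper: returns the allowlisted element names that the
# 6.0.1 notes single out (`noscript` is now always removed, and MathML/SVG
# content cannot be sanitized safely).
def risky_allowlist_entries(config)
  risky = %w[math noscript svg]
  Array(config[:elements]).map { |name| name.to_s.downcase } & risky
end

# A made-up custom config used only for illustration.
custom = { elements: %w[a b noscript p svg] }
p risky_allowlist_entries(custom) # ["noscript", "svg"]
```

An empty result does not prove a config is safe; it only rules out the elements named in this entry.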
+ ## 6.0.0 (2021-08-03)
+
+ ### Potentially Breaking Changes
+
+ * Ruby 2.5.0 is now the oldest officially supported Ruby version.
+
+ * Sanitize now requires Nokogiri 1.12.0 or higher, which includes Nokogumbo.
+ The separate dependency on Nokogumbo has been removed. [@lis2 - #211][211]
+
+ [211]:https://github.com/rgrove/sanitize/pull/211
+
+ ## 5.2.3 (2021-01-11)
+
+ ### Bug Fixes
+
+ * Ensure protocol sanitization is applied to data attributes.
+ [@ccutrer - #207][207]
+
+ [207]:https://github.com/rgrove/sanitize/pull/207
+
+ ## 5.2.2 (2021-01-06)
+
+ ### Bug Fixes
+
+ * Fixed a deprecation warning in Ruby 2.7+ when using keyword arguments in a
+ custom transformer. [@mscrivo - #206][206]
+
+ [206]:https://github.com/rgrove/sanitize/pull/206
+
+ ## 5.2.1 (2020-06-16)
+
+ ### Bug Fixes
+
+ * Fixed an HTML sanitization bypass that could allow XSS. This issue affects
+ Sanitize versions 3.0.0 through 5.2.0.
+
+ When HTML was sanitized using the "relaxed" config or a custom config that
+ allows certain elements, some content in a `<math>` or `<svg>` element may not
+ have been sanitized correctly even if `math` and `svg` were not in the
+ allowlist. This could allow carefully crafted input to sneak arbitrary HTML
+ through Sanitize, potentially enabling an XSS (cross-site scripting) attack.
+
+ You are likely to be vulnerable to this issue if you use Sanitize's relaxed
+ config or a custom config that allows one or more of the following HTML
+ elements:
+
+ - `iframe`
+ - `math`
+ - `noembed`
+ - `noframes`
+ - `noscript`
+ - `plaintext`
+ - `script`
+ - `style`
+ - `svg`
+ - `xmp`
+
+ See the security advisory for more details, including a workaround if you're
+ not able to upgrade: [GHSA-p4x4-rw2p-8j8m]
+
+ Many thanks to Michał Bentkowski of Securitum for reporting this issue and
+ helping to verify the fix.
+
+ [GHSA-p4x4-rw2p-8j8m]:https://github.com/rgrove/sanitize/security/advisories/GHSA-p4x4-rw2p-8j8m
+
+ ## 5.2.0 (2020-06-06)
+
+ ### Changes
+
+ * The term "whitelist" has been replaced with "allowlist" throughout Sanitize's
+ source and documentation.
+
+ While the etymology of "whitelist" may not be explicitly racist in origin or
+ intent, there are inherent racial connotations in the implication that white
+ is good and black (as in "blacklist") is not.
+
+ This is a change I should have made long ago, and I apologize for not making
+ it sooner.
+
+ * In transformer input, the `:is_whitelisted` and `:node_whitelist` keys are now
+ deprecated. New `:is_allowlisted` and `:node_allowlist` keys have been added.
+ The old keys will continue to work in order to avoid breaking existing code,
+ but they are no longer documented and may be removed in a future semver major
+ release.
+
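A custom transformer that has to run on both sides of the 5.2.0 rename can prefer the new key and fall back to the deprecated one. This is a minimal sketch: the helper name `node_allowlisted?` is hypothetical, and plain hashes stand in for the `env` hash that Sanitize passes to a transformer.

```ruby
# Hypothetical helper for custom transformers: read `:is_allowlisted` when it
# is present and fall back to the deprecated `:is_whitelisted` otherwise.
def node_allowlisted?(env)
  env.fetch(:is_allowlisted) { env[:is_whitelisted] } ? true : false
end

# Plain hashes stand in for the env that Sanitize passes to a transformer.
puts node_allowlisted?({ is_allowlisted: true }) # true
puts node_allowlisted?({ is_whitelisted: true }) # true
puts node_allowlisted?({})                       # false
```

The same pattern applies to `:node_allowlist` and `:node_whitelist`.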
+ ## 5.1.0 (2019-09-07)
+
+ ### Features
+
+ * Added a `:parser_options` config hash, which makes it possible to pass custom
+ parsing options to Nokogumbo. [@austin-wang - #194][194]
+
+ ### Bug Fixes
+
+ * Non-characters and non-whitespace control characters are now stripped from
+ HTML input before parsing to comply with the HTML Standard's [preprocessing
+ guidelines][html-preprocessing]. Prior to this Sanitize had adhered to [older
+ W3C guidelines][unicode-xml] that have since been withdrawn. [#179][179]
+
+ [179]:https://github.com/rgrove/sanitize/issues/179
+ [194]:https://github.com/rgrove/sanitize/pull/194
+ [html-preprocessing]:https://html.spec.whatwg.org/multipage/parsing.html#preprocessing-the-input-stream
+ [unicode-xml]:https://www.w3.org/TR/unicode-xml/
+
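The preprocessing step described in the 5.1.0 bug fix can be illustrated as a filter over code points. This is a sketch of the idea only, not Sanitize's actual implementation: the helper names are hypothetical, and the choice to leave U+0000 for the parser is an assumption based on a reading of the HTML Standard's preprocessing section.

```ruby
# Noncharacters: U+FDD0..U+FDEF plus the last two code points of every plane.
def noncharacter?(cp)
  (0xFDD0..0xFDEF).cover?(cp) || (cp & 0xFFFE) == 0xFFFE
end

# Control characters other than NULL and ASCII whitespace (tab, LF, FF, CR).
# Assumption: NULL is left in place for the parser to handle.
def disallowed_control?(cp)
  (0x01..0x08).cover?(cp) || cp == 0x0B ||
    (0x0E..0x1F).cover?(cp) || (0x7F..0x9F).cover?(cp)
end

# Hypothetical helper: drop those code points before handing HTML to a parser.
def strip_disallowed_code_points(html)
  html.each_char.reject do |ch|
    cp = ch.ord
    noncharacter?(cp) || disallowed_control?(cp)
  end.join
end

p strip_disallowed_code_points("a\u0001b\uFDD0c\n") # "abc\n"
```

A production version would more likely use a single precompiled regular expression; the per-character loop is used here to keep the ranges readable.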
+ ## 5.0.0 (2018-10-14)
+
+ For most users, upgrading from 4.x shouldn't require any changes. However, the
+ minimum required Ruby version has changed, and Sanitize 5.x's HTML output may
+ differ in some small ways from 4.x's output. If this matters to you, please
+ review the changes below carefully.
+
+ ### Potentially Breaking Changes
+
+ * Ruby 2.3.0 is now the oldest officially supported Ruby version. Sanitize may
+ work in older 2.x Rubies, but they aren't actively tested. Sanitize definitely
+ no longer works in Ruby 1.9.x.
+
+ * Upgraded to Nokogumbo 2.x, which fixes various bugs and adds
+ standard-compliant HTML serialization. [@stevecheckoway - #189][189]
+
+ * Children of the following elements are now removed by default when these
+ elements are removed, rather than being preserved and escaped:
+
+ - `iframe`
+ - `noembed`
+ - `noframes`
+ - `noscript`
+ - `script`
+ - `style`
+
+ * Children of allowlisted `iframe` elements are now always removed. In modern
+ HTML, `iframe` elements should never have children. In HTML 4 and earlier
+ `iframe` elements were allowed to contain fallback content for legacy
+ browsers, but it's been almost two decades since that was useful.
+
+ * Fixed a bug that caused `:remove_contents` to behave as if it were set to
+ `true` when it was actually an Array.
+
+ [189]:https://github.com/rgrove/sanitize/pull/189
+
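The `:remove_contents` fix in 5.0.0 concerns the difference between `true` and an Array of element names. The intended behavior, as documented in the 1.2.1 entry of this changelog, can be sketched as follows. The helper name `remove_contents_of?` is hypothetical and this is not Sanitize's internal code.

```ruby
# Hypothetical helper mirroring the documented `:remove_contents` semantics:
# `true` applies to every filtered element, an Array only to the named ones.
def remove_contents_of?(setting, element_name)
  case setting
  when true then true
  when Array then setting.map(&:to_s).include?(element_name.to_s)
  else false
  end
end

puts remove_contents_of?(true, "div")                # true
puts remove_contents_of?(%w[script style], "script") # true
puts remove_contents_of?(%w[script style], "div")    # false
```

The bug described above corresponds to the third call returning `true`.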
+ ## 4.6.6 (2018-07-23)
+
+ * Improved performance and memory usage by optimizing `Sanitize#transform_node!`
+ [@stanhu - #183][183]
+
+ [183]:https://github.com/rgrove/sanitize/pull/183
+
+ ## 4.6.5 (2018-05-16)
+
+ * Improved performance slightly by tweaking the order of built-in transformers.
+ [@rafbm - #180][180]
+
+ [180]:https://github.com/rgrove/sanitize/pull/180
+
  ## 4.6.4 (2018-03-20)
 
  * Fixed: A change introduced in 4.6.2 broke certain transformers that relied on
@@ -15,7 +238,7 @@
 
  When Sanitize <= 4.6.2 is used in combination with libxml2 >= 2.9.2, a
  specially crafted HTML fragment can cause libxml2 to generate improperly
- escaped output, allowing non-whitelisted attributes to be used on whitelisted
+ escaped output, allowing non-allowlisted attributes to be used on allowlisted
  elements.
 
  Sanitize now performs additional escaping on affected attributes to prevent
@@ -59,7 +282,7 @@
 
  ## 4.4.0 (2016-09-29)
 
- * Added `srcset` to the attribute whitelist for `img` elements in the relaxed
+ * Added `srcset` to the attribute allowlist for `img` elements in the relaxed
  config. [@ejtttje - #156][156]
 
  [156]:https://github.com/rgrove/sanitize/pull/156
@@ -180,7 +403,7 @@
  ## 3.0.4 (2014-12-12)
 
  * Fixed: Harmless whitespace preceding a URL protocol (such as " http://")
- caused the URL to be removed even when the protocol was whitelisted.
+ caused the URL to be removed even when the protocol was allowlisted.
  [@benubois - #126][126]
 
  [126]:https://github.com/rgrove/sanitize/pull/126
@@ -189,7 +412,7 @@
  ## 3.0.3 (2014-10-29)
 
  * Fixed: Some CSS selectors weren't parsed correctly inside the body of a
- `@media` block, causing them to be removed even when whitelist rules should
+ `@media` block, causing them to be removed even when allowlist rules should
  have allowed them to remain. [#121][121]
 
  [121]:https://github.com/rgrove/sanitize/issues/121
@@ -254,7 +477,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
  * The `clean_node!` method was renamed to `node!`.
 
  * The `document` method now raises a `Sanitize::Error` if the `<html>` element
- isn't whitelisted, rather than a `RuntimeError`. This error is also now raised
+ isn't allowlisted, rather than a `RuntimeError`. This error is also now raised
  regardless of the `:remove_contents` config setting.
 
  * The `:output` config has been removed. Output is now always HTML, not XHTML.
@@ -265,7 +488,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
 
  * Added advanced CSS sanitization support using [Crass][crass], which is fully
  compliant with the CSS Syntax Module Level 3 parsing spec. The contents of
- whitelisted `<style>` elements and `style` attributes in HTML will be
+ allowlisted `<style>` elements and `style` attributes in HTML will be
  sanitized as CSS, or you can use the `Sanitize::CSS` class to manually
  sanitize CSS stylesheets or properties.
 
@@ -310,9 +533,29 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
  [n1008]:https://github.com/sparklemotion/nokogiri/issues/1008
 
 
+ ## 2.1.1 (2018-09-30)
+
+ * [CVE-2018-3740][176]: Fixed an HTML injection vulnerability that could allow
+ XSS (backported from Sanitize 4.6.3). [@dometto - #188][188]
+
+ When Sanitize <= 2.1.0 is used in combination with libxml2 >= 2.9.2, a
+ specially crafted HTML fragment can cause libxml2 to generate improperly
+ escaped output, allowing non-allowlisted attributes to be used on allowlisted
+ elements.
+
+ Sanitize now performs additional escaping on affected attributes to prevent
+ this.
+
+ Many thanks to the Shopify Application Security Team for responsibly reporting
+ this issue.
+
+ [176]:https://github.com/rgrove/sanitize/issues/176
+ [188]:https://github.com/rgrove/sanitize/pull/188
+
+
  ## 2.1.0 (2014-01-13)
 
- * Added support for whitelisting arbitrary HTML5 `data-*` attributes. Use the
+ * Added support for allowlisting arbitrary HTML5 `data-*` attributes. Use the
  symbol `:data` instead of an attribute name in the `:attributes` config to
  indicate that arbitrary data attributes should be allowed on an element.
 
@@ -393,12 +636,12 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
  the default depth-first mode.
 
  * Added the `abbr`, `dfn`, `kbd`, `mark`, `s`, `samp`, `time`, and `var`
- elements to the whitelists for the basic and relaxed configs.
+ elements to the allowlists for the basic and relaxed configs.
 
  * Added the `bdo`, `del`, `figcaption`, `figure`, `hgroup`, `ins`, `rp`, `rt`,
- `ruby`, and `wbr` elements to the whitelist for the relaxed config.
+ `ruby`, and `wbr` elements to the allowlist for the relaxed config.
 
- * The `dir`, `lang`, and `title` attributes are now whitelisted for all
+ * The `dir`, `lang`, and `title` attributes are now allowlisted for all
  elements in the relaxed config.
 
  * Bumped minimum Nokogiri version to 1.4.4 to avoid a bug in 1.4.2+
@@ -409,7 +652,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
  ## 1.2.1 (2010-04-20)
 
  * Added a `:remove_contents` config setting. If set to `true`, Sanitize will
- remove the contents of all non-whitelisted elements in addition to the
+ remove the contents of all non-allowlisted elements in addition to the
  elements themselves. If set to an array of element names, Sanitize will
  remove the contents of only those elements (when filtered), and leave the
  contents of other filtered elements. [Thanks to Rafael Souza for the array
@@ -437,7 +680,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
  * Added `Sanitize.clean_node!`, which sanitizes a `Nokogiri::XML::Node` and
  all its children.
 
- * Added elements `<h1>` through `<h6>` to the Relaxed whitelist. [Suggested by
+ * Added elements `<h1>` through `<h6>` to the Relaxed allowlist. [Suggested by
  David Reese]
 
 
@@ -457,7 +700,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
 
  * Added a workaround for an Hpricot bug that prevents attribute names from
  being downcased in recent versions of Hpricot. This was exploitable to
- prevent non-whitelisted protocols from being cleaned. [Reported by Ben
+ prevent non-allowlisted protocols from being cleaned. [Reported by Ben
  Wanicur]
 
 
@@ -487,7 +730,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
 
  ## 1.0.5 (2009-02-05)
 
- * Fixed a bug introduced in version 1.0.3 that prevented non-whitelisted
+ * Fixed a bug introduced in version 1.0.3 that prevented non-allowlisted
  protocols from being cleaned when relative URLs were allowed. [Reported by
  Dev Purkayastha]
 
@@ -497,7 +740,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
 
  ## 1.0.4 (2009-01-16)
 
- * Fixed a bug that made it possible to sneak a non-whitelisted element through
+ * Fixed a bug that made it possible to sneak a non-allowlisted element through
  by repeating it several times in a row. All versions of Sanitize prior to
  1.0.4 are vulnerable. [Reported by Cristobal]
 
@@ -505,7 +748,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
  ## 1.0.3 (2009-01-15)
 
  * Fixed a bug whereby incomplete Unicode or hex entities could be used to
- prevent non-whitelisted protocols from being cleaned. Since IE6 and Opera
+ prevent non-allowlisted protocols from being cleaned. Since IE6 and Opera
  still decode the incomplete entities, users of those browsers may be
  vulnerable to malicious script injection on websites using versions of
  Sanitize prior to 1.0.3.
data/LICENSE CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2015 Ryan Grove <ryan@wonko.com>
+ Copyright (c) 2021 Ryan Grove <ryan@wonko.com>
 
  Permission is hereby granted, free of charge, to any person obtaining a copy of
  this software and associated documentation files (the 'Software'), to deal in