rails-html-sanitizer 1.4.3 → 1.6.0


checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 2f00d9f256478eb753c8d211c3b25efa4204bbdbc9c5abf0415413c811a2e404
- data.tar.gz: 65d3871aa798dfbbfb1138b666d475b590e347cdb66614d6d39b72ad3531c742
+ metadata.gz: 365db7c11fc174c5da0a4a670fec92033cf277b71e7bb089534b2ad1bd48b314
+ data.tar.gz: b33e592de2e0081f1493d9fc29e8db1a26b2f727c20aa7d111332438bfbf2f1d
  SHA512:
- metadata.gz: e6e31eaa72b1a2e8356aae50600ac784f85a80828cbc49ce8061384ecd3f21a1d8eaee69845dc08537c5102728c3cc41a72cb3ed8b9789c4921038398afa61e2
- data.tar.gz: 6b14a49842eaf4c3e0fbae5acd28fdf32a5deb6cd42f769aada848226847180c4d3a67a9dcbc439e1a4855699b0ea694cb4c7b6ee173391ac841bd334ae44b6f
+ metadata.gz: bafc9210e52f68f6ea033c1deb70d2d227a85a661f9c4fe988da876a73e29b7c86e0910d9705616ed536978d4c6cdf9e5a23b211e720c1f4c86d7b5ce04c03bf
+ data.tar.gz: acb3ed50bf5ebd95824bffc8efb4be8745c32e3d5bd5d157edc14648f4f00e07f308ce5ecb2889ae417d7cd999871f4860ac79ecb0864a25220683ae2edd5473
data/CHANGELOG.md CHANGED
@@ -1,3 +1,110 @@
1
+ ## 1.6.0 / 2023-05-26
2
+
3
+ * Dependencies have been updated:
4
+
5
+ - Loofah `~>2.21` and Nokogiri `~>1.14` for HTML5 parser support
6
+ - As a result, required Ruby version is now `>= 2.7.0`
7
+
8
+ Security updates will continue to be made on the `1.5.x` release branch as long as Rails 6.1
9
+ (which supports Ruby 2.5) is still in security support.
10
+
11
+ *Mike Dalessio*
12
+
13
+ * HTML5 standards-compliant sanitizers are now available on platforms supported by
14
+ Nokogiri::HTML5. These are available as:
15
+
16
+ - `Rails::HTML5::FullSanitizer`
17
+ - `Rails::HTML5::LinkSanitizer`
18
+ - `Rails::HTML5::SafeListSanitizer`
19
+
20
+ And a new "vendor" is provided at `Rails::HTML5::Sanitizer` that can be used in a future version
21
+ of Rails.
22
+
23
+ Note that for symmetry `Rails::HTML4::Sanitizer` is also added, though its behavior is identical
24
+ to the vendor class methods on `Rails::HTML::Sanitizer`.
25
+
26
+ Users may call `Rails::HTML::Sanitizer.best_supported_vendor` to get back the HTML5 vendor if it's
27
+ supported, else the legacy HTML4 vendor.
28
+
29
+ *Mike Dalessio*
30
+
31
+ * Module namespaces have changed, but backwards compatibility is provided by aliases.
32
+
33
+ The library defines three additional modules:
34
+
35
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
36
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
37
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5
38
+
39
+ The following aliases are maintained for backwards compatibility:
40
+
41
+ - `Rails::Html` points to `Rails::HTML`
42
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
43
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
44
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
45
+
46
+ *Mike Dalessio*
47
+
48
+ * `LinkSanitizer` always returns UTF-8 encoded strings. `SafeListSanitizer` and `FullSanitizer`
49
+ already ensured this encoding.
50
+
51
+ *Mike Dalessio*
52
+
53
+ * `SafeListSanitizer` allows `time` tag and `lang` attribute by default.
54
+
55
+ *Mike Dalessio*
56
+
57
+ * The constant `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the
58
+ existing sanitizers, and should have been a private constant all along anyway.
59
+
60
+ *Mike Dalessio*
61
+
62
+
63
+ ## 1.5.0 / 2023-01-20
64
+
65
+ * `SafeListSanitizer`, `PermitScrubber`, and `TargetScrubber` now all support pruning of unsafe tags.
66
+
67
+ By default, unsafe tags are still stripped, but this behavior can be changed to prune the element
68
+ and its children from the document by passing `prune: true` to any of these classes' constructors.
69
+
70
+ *seyerian*
71
+
72
+
73
+ ## 1.4.4 / 2022-12-13
74
+
75
+ * Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
76
+
77
+ Fixes CVE-2022-23517. See
78
+ [GHSA-5x79-w82f-gw8w](https://github.com/rails/rails-html-sanitizer/security/advisories/GHSA-5x79-w82f-gw8w)
79
+ for more information.
80
+
81
+ *Mike Dalessio*
82
+
83
+ * Address improper sanitization of data URIs.
84
+
85
+ Fixes CVE-2022-23518 and #135. See
86
+ [GHSA-mcvf-2q2m-x72m](https://github.com/rails/rails-html-sanitizer/security/advisories/GHSA-mcvf-2q2m-x72m)
87
+ for more information.
88
+
89
+ *Mike Dalessio*
90
+
91
+ * Address possible XSS vulnerability with certain configurations of Rails::Html::Sanitizer.
92
+
93
+ Fixes CVE-2022-23520. See
94
+ [GHSA-rrfc-7g8p-99q8](https://github.com/rails/rails-html-sanitizer/security/advisories/GHSA-rrfc-7g8p-99q8)
95
+ for more information.
96
+
97
+ *Mike Dalessio*
98
+
99
+ * Address possible XSS vulnerability with certain configurations of Rails::Html::Sanitizer.
100
+
101
+ Fixes CVE-2022-23519. See
102
+ [GHSA-9h9g-93gc-623h](https://github.com/rails/rails-html-sanitizer/security/advisories/GHSA-9h9g-93gc-623h)
103
+ for more information.
104
+
105
+ *Mike Dalessio*
106
+
107
+
1
108
  ## 1.4.3 / 2022-06-09
2
109
 
3
110
  * Address a possible XSS vulnerability with certain configurations of Rails::Html::Sanitizer.
@@ -17,6 +124,7 @@
17
124
 
18
125
  *Mike Dalessio*
19
126
 
127
+
20
128
  ## 1.4.1 / 2021-08-18
21
129
 
22
130
  * Fix regression in v1.4.0 that did not pass comment nodes to the scrubber.
@@ -29,6 +137,7 @@
29
137
 
30
138
  *Mike Dalessio*
31
139
 
140
+
32
141
  ## 1.4.0 / 2021-08-18
33
142
 
34
143
  * Processing Instructions are no longer allowed by Rails::Html::PermitScrubber
@@ -41,12 +150,14 @@
41
150
 
42
151
  *Mike Dalessio*
43
152
 
153
+
44
154
  ## 1.3.0
45
155
 
46
156
  * Address deprecations in Loofah 2.3.0.
47
157
 
48
158
  *Josh Goodall*
49
159
 
160
+
50
161
  ## 1.2.0
51
162
 
52
163
  * Remove needless `white_list_sanitizer` deprecation.
@@ -61,6 +172,7 @@
61
172
 
62
173
  *Kasper Timm Hansen*
63
174
 
175
+
64
176
  ## 1.1.0
65
177
 
66
178
  * Add `safe_list_sanitizer` and deprecate `white_list_sanitizer` to be removed
@@ -78,10 +190,12 @@
78
190
 
79
191
  *Kasper Timm Hansen*
80
192
 
193
+
81
194
  ## 1.0.1
82
195
 
83
196
  * Added support for Rails 4.2.0.beta2 and above
84
197
 
198
+
85
199
  ## 1.0.0
86
200
 
87
201
  * First release.
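
The 1.6.0 changelog entry raises the required Ruby version to `>= 2.7.0` and states that security updates continue on the `1.5.x` branch. As a minimal sketch of how an application might pick a version constraint from that information, the helper below maps a Ruby version to a constraint string using RubyGems' `Gem::Version`. The name `sanitizer_requirement` is hypothetical and is not part of the gem or of Bundler.

```ruby
require "rubygems"

# Hypothetical helper for illustration; not part of rails-html-sanitizer or Bundler.
# Maps a Ruby version string to a version constraint for the gem, following the
# changelog: 1.6.x needs Ruby >= 2.7.0, while 1.5.x keeps receiving security fixes.
def sanitizer_requirement(ruby_version = RUBY_VERSION)
  if Gem::Version.new(ruby_version) >= Gem::Version.new("2.7.0")
    "~> 1.6"    # any 1.x release from 1.6.0 onward
  else
    "~> 1.5.0"  # stay on the 1.5.x branch
  end
end

# In a Gemfile this could be used as:
#   gem "rails-html-sanitizer", sanitizer_requirement
puts sanitizer_requirement("2.6.10") # prints "~> 1.5.0"
puts sanitizer_requirement("3.2.2")  # prints "~> 1.6"
```

Comparing with `Gem::Version` rather than plain strings avoids ordering mistakes such as `"2.10.0" < "2.7.0"`.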
data/MIT-LICENSE CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2013-2015 Rafael Mendonça França, Kasper Timm Hansen
+ Copyright (c) 2013-2023 Rafael Mendonça França, Kasper Timm Hansen, Mike Dalessio
 
  MIT License
 
data/README.md CHANGED
@@ -1,61 +1,76 @@
1
- # Rails Html Sanitizers
1
+ # Rails HTML Sanitizers
2
2
 
3
- In Rails 4.2 and above this gem will be responsible for sanitizing HTML fragments in Rails
4
- applications, i.e. in the `sanitize`, `sanitize_css`, `strip_tags` and `strip_links` methods.
3
+ This gem is responsible for sanitizing HTML fragments in Rails applications. Specifically, this is the set of sanitizers used to implement the Action View `SanitizerHelper` methods `sanitize`, `sanitize_css`, `strip_tags` and `strip_links`.
5
4
 
6
- Rails Html Sanitizer is only intended to be used with Rails applications. If you need similar functionality in non Rails apps consider using [Loofah](https://github.com/flavorjones/loofah) directly (that's what handles sanitization under the hood).
5
+ Rails HTML Sanitizer is only intended to be used with Rails applications. If you need similar functionality but aren't using Rails, consider using the underlying sanitization library [Loofah](https://github.com/flavorjones/loofah) directly.
7
6
 
8
- ## Installation
9
-
10
- Add this line to your application's Gemfile:
11
7
 
12
- gem 'rails-html-sanitizer'
13
-
14
- And then execute:
8
+ ## Usage
15
9
 
16
- $ bundle
10
+ ### Sanitizers
17
11
 
18
- Or install it yourself as:
12
+ All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
19
13
 
20
- $ gem install rails-html-sanitizer
14
+ NOTE: The HTML5 sanitizers are not supported on JRuby. Users may programmatically check for support by calling `Rails::HTML::Sanitizer.html5_support?`.
21
15
 
22
- ## Usage
23
16
 
24
- ### Sanitizers
17
+ #### FullSanitizer
25
18
 
26
- All sanitizers respond to `sanitize`.
19
+ ```ruby
20
+ full_sanitizer = Rails::HTML5::FullSanitizer.new
21
+ full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
22
+ # => Bold no more! See more here...
23
+ ```
27
24
 
28
- #### FullSanitizer
25
+ or, if you insist on parsing the content as HTML4:
29
26
 
30
27
  ```ruby
31
- full_sanitizer = Rails::Html::FullSanitizer.new
28
+ full_sanitizer = Rails::HTML4::FullSanitizer.new
32
29
  full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
33
30
  # => Bold no more! See more here...
34
31
  ```
35
32
 
33
+ HTML5 version:
34
+
35
+
36
+
36
37
  #### LinkSanitizer
37
38
 
38
39
  ```ruby
39
- link_sanitizer = Rails::Html::LinkSanitizer.new
40
+ link_sanitizer = Rails::HTML5::LinkSanitizer.new
41
+ link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
42
+ # => Only the link text will be kept.
43
+ ```
44
+
45
+ or, if you insist on parsing the content as HTML4:
46
+
47
+ ```ruby
48
+ link_sanitizer = Rails::HTML4::LinkSanitizer.new
40
49
  link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
41
50
  # => Only the link text will be kept.
42
51
  ```
43
52
 
53
+
44
54
  #### SafeListSanitizer
45
55
 
56
+ This sanitizer is also available as an HTML4 variant, but for simplicity we'll document only the HTML5 variant below.
57
+
46
58
  ```ruby
47
- safe_list_sanitizer = Rails::Html::SafeListSanitizer.new
59
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new
48
60
 
49
61
  # sanitize via an extensive safe list of allowed elements
50
62
  safe_list_sanitizer.sanitize(@article.body)
51
63
 
52
- # safe list only the supplied tags and attributes
64
+ # sanitize, allowing only the supplied tags and attributes
53
65
  safe_list_sanitizer.sanitize(@article.body, tags: %w(table tr td), attributes: %w(id class style))
54
66
 
55
- # safe list via a custom scrubber
67
+ # sanitize via a custom scrubber
56
68
  safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)
57
69
 
58
- # safe list sanitizer can also sanitize css
70
+ # prune nodes from the tree instead of stripping tags and leaving inner content
71
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new(prune: true)
72
+
73
+ # the sanitizer can also sanitize css
59
74
  safe_list_sanitizer.sanitize_css('background-color: #000;')
60
75
  ```
61
76
 
@@ -63,14 +78,14 @@ safe_list_sanitizer.sanitize_css('background-color: #000;')
63
78
 
64
79
  Scrubbers are objects responsible for removing nodes or attributes you don't want in your HTML document.
65
80
 
66
- This gem includes two scrubbers `Rails::Html::PermitScrubber` and `Rails::Html::TargetScrubber`.
81
+ This gem includes two scrubbers `Rails::HTML::PermitScrubber` and `Rails::HTML::TargetScrubber`.
67
82
 
68
- #### `Rails::Html::PermitScrubber`
83
+ #### `Rails::HTML::PermitScrubber`
69
84
 
70
85
  This scrubber allows you to permit only the tags and attributes you want.
71
86
 
72
87
  ```ruby
73
- scrubber = Rails::Html::PermitScrubber.new
88
+ scrubber = Rails::HTML::PermitScrubber.new
74
89
  scrubber.tags = ['a']
75
90
 
76
91
  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -78,16 +93,34 @@ html_fragment.scrub!(scrubber)
78
93
  html_fragment.to_s # => "<a></a>"
79
94
  ```
80
95
 
81
- #### `Rails::Html::TargetScrubber`
96
+ By default, inner content is left, but it can be removed as well.
97
+
98
+ ```ruby
99
+ scrubber = Rails::HTML::PermitScrubber.new
100
+ scrubber.tags = ['a']
101
+
102
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
103
+ html_fragment.scrub!(scrubber)
104
+ html_fragment.to_s # => "<a>text</a>"
105
+
106
+ scrubber = Rails::HTML::PermitScrubber.new(prune: true)
107
+ scrubber.tags = ['a']
108
+
109
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
110
+ html_fragment.scrub!(scrubber)
111
+ html_fragment.to_s # => "<a></a>"
112
+ ```
113
+
114
+ #### `Rails::HTML::TargetScrubber`
82
115
 
83
116
  Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
84
- `Rails::Html::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
117
+ `Rails::HTML::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
85
118
 
86
119
  **Note:** by default, it will scrub anything that is not part of the permitted tags from
87
120
  loofah `HTML5::Scrub.allowed_element?`.
88
121
 
89
122
  ```ruby
90
- scrubber = Rails::Html::TargetScrubber.new
123
+ scrubber = Rails::HTML::TargetScrubber.new
91
124
  scrubber.tags = ['img']
92
125
 
93
126
  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -95,12 +128,30 @@ html_fragment.scrub!(scrubber)
95
128
  html_fragment.to_s # => "<a></a>"
96
129
  ```
97
130
 
131
+ Similarly to `PermitScrubber`, nodes can be fully pruned.
132
+
133
+ ```ruby
134
+ scrubber = Rails::HTML::TargetScrubber.new
135
+ scrubber.tags = ['span']
136
+
137
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
138
+ html_fragment.scrub!(scrubber)
139
+ html_fragment.to_s # => "<a>text</a>"
140
+
141
+ scrubber = Rails::HTML::TargetScrubber.new(prune: true)
142
+ scrubber.tags = ['span']
143
+
144
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
145
+ html_fragment.scrub!(scrubber)
146
+ html_fragment.to_s # => "<a></a>"
147
+ ```
148
+
98
149
  #### Custom Scrubbers
99
150
 
100
151
  You can also create custom scrubbers in your application if you want to.
101
152
 
102
153
  ```ruby
103
- class CommentScrubber < Rails::Html::PermitScrubber
154
+ class CommentScrubber < Rails::HTML::PermitScrubber
104
155
  def initialize
105
156
  super
106
157
  self.tags = %w( form script comment blockquote )
@@ -113,7 +164,7 @@ class CommentScrubber < Rails::Html::PermitScrubber
113
164
  end
114
165
  ```
115
166
 
116
- See `Rails::Html::PermitScrubber` documentation to learn more about which methods can be overridden.
167
+ See `Rails::HTML::PermitScrubber` documentation to learn more about which methods can be overridden.
117
168
 
118
169
  #### Custom Scrubber in a Rails app
119
170
 
@@ -123,20 +174,98 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
123
174
  <%= sanitize @comment, scrubber: CommentScrubber.new %>
124
175
  ```
125
176
 
177
+ ### A note on HTML entities
178
+
179
+ __Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
180
+
181
+ Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
182
+
183
+ This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
184
+
185
+
186
+ #### A concrete example showing the problem that can arise
187
+
188
+ Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
189
+
190
+ If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
191
+
192
+ When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.`, which will render as "JPMorgan Chase &amp; Co.".
193
+
194
+ Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
195
+
196
+
197
+ #### Suggested alternatives
198
+
199
+ You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
200
+
201
+ That raw string, if rendered in a non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
202
+
203
+ If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
204
+
205
+
206
+ ### A note on module names
207
+
208
+ In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
209
+
210
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
211
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
212
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
213
+
214
+ The following aliases are maintained for backwards compatibility:
215
+
216
+ - `Rails::Html` points to `Rails::HTML`
217
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
218
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
219
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
220
+
221
+
222
+ ## Installation
223
+
224
+ Add this line to your application's Gemfile:
225
+
226
+ gem 'rails-html-sanitizer'
227
+
228
+ And then execute:
229
+
230
+ $ bundle
231
+
232
+ Or install it yourself as:
233
+
234
+ $ gem install rails-html-sanitizer
235
+
236
+
237
+ ## Support matrix
238
+
239
+ | branch | ruby support | actively maintained | security support |
240
+ |--------|--------------|---------------------|----------------------------------------|
241
+ | 1.6.x | >= 2.7 | yes | yes |
242
+ | 1.5.x | >= 2.5 | no | while Rails 6.1 is in security support |
243
+ | 1.4.x | >= 1.8.7 | no | no |
244
+
245
+
126
246
  ## Read more
127
247
 
128
248
  Loofah is what underlies the sanitizers and scrubbers of rails-html-sanitizer.
249
+
129
250
  - [Loofah and Loofah Scrubbers](https://github.com/flavorjones/loofah)
130
251
 
131
252
  The `node` argument passed to some methods in a custom scrubber is an instance of `Nokogiri::XML::Node`.
253
+
132
254
  - [`Nokogiri::XML::Node`](https://nokogiri.org/rdoc/Nokogiri/XML/Node.html)
133
255
  - [Nokogiri](http://nokogiri.org)
134
256
 
135
- ## Contributing to Rails Html Sanitizers
136
257
 
137
- Rails Html Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
258
+ ## Contributing to Rails HTML Sanitizers
259
+
260
+ Rails HTML Sanitizers is the work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
138
261
 
139
262
  See [CONTRIBUTING](CONTRIBUTING.md).
140
263
 
264
+ ### Security reports
265
+
266
+ Trying to report a possible security vulnerability in this project? Please check out the [Rails project's security policy](https://rubyonrails.org/security) for instructions.
267
+
268
+
141
269
  ## License
142
- Rails Html Sanitizers is released under the [MIT License](MIT-LICENSE).
270
+
271
+ Rails HTML Sanitizers is released under the [MIT License](MIT-LICENSE).
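
The README's new note on HTML entities warns that a string sanitized before storage and again at render time ends up with doubly encoded entities. A rough way to see the effect without the gem is sketched below. It uses the standard library's `ERB::Util.html_escape` as a stand-in for the entity-encoding step only; this is an assumption made for illustration, and it is not the sanitizer this gem provides.

```ruby
require "erb"

# ERB::Util.html_escape stands in for the entity-encoding step of sanitization.
# It is NOT one of the Rails HTML sanitizers; it only reproduces the escaping.
raw = "JPMorgan Chase & Co."

escaped_once  = ERB::Util.html_escape(raw)          # e.g. encoded before persisting
escaped_twice = ERB::Util.html_escape(escaped_once) # encoded again at render time

puts escaped_once   # prints: JPMorgan Chase &amp; Co.
puts escaped_twice  # prints: JPMorgan Chase &amp;amp; Co.
```

A browser decodes one level of entities, so the first string displays as the original text while the second displays a literal `&amp;` in the middle of the company name. Storing the raw input and encoding once at render time avoids this.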
@@ -1,7 +1,9 @@
+ # frozen_string_literal: true
+
  module Rails
-   module Html
+   module HTML
      class Sanitizer
-       VERSION = "1.4.3"
+       VERSION = "1.6.0"
      end
    end
  end
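
This hunk renames `module Html` to `module HTML`, and the changelog says `Rails::Html` now points to `Rails::HTML`. The diff shown here does not include the file that defines that alias, so the following is only a sketch of how such an alias can be written in plain Ruby. It uses a hypothetical `Demo` namespace and makes no claim about the gem's actual source.

```ruby
# Hypothetical namespace for illustration; this is not the gem's source code.
module Demo
  module HTML
    class Sanitizer
      VERSION = "1.6.0"
    end
  end

  # Assigning the module to a second constant makes the old name an alias:
  # both constants refer to the same module object.
  Html = HTML
end

puts Demo::Html.equal?(Demo::HTML)   # prints: true
puts Demo::Html::Sanitizer::VERSION  # prints: 1.6.0
puts Demo::Html.name                 # prints: Demo::HTML
```

Because the alias is the same object, code written against the old constant keeps working, although `name` and `inspect` report the new spelling.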