rails-html-sanitizer 1.4.3 → 1.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 2f00d9f256478eb753c8d211c3b25efa4204bbdbc9c5abf0415413c811a2e404
4
- data.tar.gz: 65d3871aa798dfbbfb1138b666d475b590e347cdb66614d6d39b72ad3531c742
3
+ metadata.gz: 365db7c11fc174c5da0a4a670fec92033cf277b71e7bb089534b2ad1bd48b314
4
+ data.tar.gz: b33e592de2e0081f1493d9fc29e8db1a26b2f727c20aa7d111332438bfbf2f1d
5
5
  SHA512:
6
- metadata.gz: e6e31eaa72b1a2e8356aae50600ac784f85a80828cbc49ce8061384ecd3f21a1d8eaee69845dc08537c5102728c3cc41a72cb3ed8b9789c4921038398afa61e2
7
- data.tar.gz: 6b14a49842eaf4c3e0fbae5acd28fdf32a5deb6cd42f769aada848226847180c4d3a67a9dcbc439e1a4855699b0ea694cb4c7b6ee173391ac841bd334ae44b6f
6
+ metadata.gz: bafc9210e52f68f6ea033c1deb70d2d227a85a661f9c4fe988da876a73e29b7c86e0910d9705616ed536978d4c6cdf9e5a23b211e720c1f4c86d7b5ce04c03bf
7
+ data.tar.gz: acb3ed50bf5ebd95824bffc8efb4be8745c32e3d5bd5d157edc14648f4f00e07f308ce5ecb2889ae417d7cd999871f4860ac79ecb0864a25220683ae2edd5473
data/CHANGELOG.md CHANGED
@@ -1,3 +1,110 @@
1
+ ## 1.6.0 / 2023-05-26
2
+
3
+ * Dependencies have been updated:
4
+
5
+ - Loofah `~>2.21` and Nokogiri `~>1.14` for HTML5 parser support
6
+ - As a result, required Ruby version is now `>= 2.7.0`
7
+
8
+ Security updates will continue to be made on the `1.5.x` release branch as long as Rails 6.1
9
+ (which supports Ruby 2.5) is still in security support.
10
+
11
+ *Mike Dalessio*
12
+
13
+ * HTML5 standards-compliant sanitizers are now available on platforms supported by
14
+ Nokogiri::HTML5. These are available as:
15
+
16
+ - `Rails::HTML5::FullSanitizer`
17
+ - `Rails::HTML5::LinkSanitizer`
18
+ - `Rails::HTML5::SafeListSanitizer`
19
+
20
+ And a new "vendor" is provided at `Rails::HTML5::Sanitizer` that can be used in a future version
21
+ of Rails.
22
+
23
+ Note that for symmetry `Rails::HTML4::Sanitizer` is also added, though its behavior is identical
24
+ to the vendor class methods on `Rails::HTML::Sanitizer`.
25
+
26
+ Users may call `Rails::HTML::Sanitizer.best_supported_vendor` to get back the HTML5 vendor if it's
27
+ supported, else the legacy HTML4 vendor.
28
+
29
+ *Mike Dalessio*
30
+
31
+ * Module namespaces have changed, but backwards compatibility is provided by aliases.
32
+
33
+ The library defines three additional modules:
34
+
35
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
36
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
37
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5
38
+
39
+ The following aliases are maintained for backwards compatibility:
40
+
41
+ - `Rails::Html` points to `Rails::HTML`
42
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
43
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
44
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
45
+
46
+ *Mike Dalessio*
47
+
48
+ * `LinkSanitizer` always returns UTF-8 encoded strings. `SafeListSanitizer` and `FullSanitizer`
49
+ already ensured this encoding.
50
+
51
+ *Mike Dalessio*
52
+
53
+ * `SafeListSanitizer` allows `time` tag and `lang` attribute by default.
54
+
55
+ *Mike Dalessio*
56
+
57
+ * The constant `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the
58
+ existing sanitizers, and should have been a private constant all along anyway.
59
+
60
+ *Mike Dalessio*
61
+
62
+
63
+ ## 1.5.0 / 2023-01-20
64
+
65
+ * `SafeListSanitizer`, `PermitScrubber`, and `TargetScrubber` now all support pruning of unsafe tags.
66
+
67
+ By default, unsafe tags are still stripped, but this behavior can be changed to prune the element
68
+ and its children from the document by passing `prune: true` to any of these classes' constructors.
69
+
70
+ *seyerian*
71
+
72
+
73
+ ## 1.4.4 / 2022-12-13
74
+
75
+ * Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
76
+
77
+ Fixes CVE-2022-23517. See
78
+ [GHSA-5x79-w82f-gw8w](https://github.com/rails/rails-html-sanitizer/security/advisories/GHSA-5x79-w82f-gw8w)
79
+ for more information.
80
+
81
+ *Mike Dalessio*
82
+
83
+ * Address improper sanitization of data URIs.
84
+
85
+ Fixes CVE-2022-23518 and #135. See
86
+ [GHSA-mcvf-2q2m-x72m](https://github.com/rails/rails-html-sanitizer/security/advisories/GHSA-mcvf-2q2m-x72m)
87
+ for more information.
88
+
89
+ *Mike Dalessio*
90
+
91
+ * Address possible XSS vulnerability with certain configurations of Rails::Html::Sanitizer.
92
+
93
+ Fixes CVE-2022-23520. See
94
+ [GHSA-rrfc-7g8p-99q8](https://github.com/rails/rails-html-sanitizer/security/advisories/GHSA-rrfc-7g8p-99q8)
95
+ for more information.
96
+
97
+ *Mike Dalessio*
98
+
99
+ * Address possible XSS vulnerability with certain configurations of Rails::Html::Sanitizer.
100
+
101
+ Fixes CVE-2022-23519. See
102
+ [GHSA-9h9g-93gc-623h](https://github.com/rails/rails-html-sanitizer/security/advisories/GHSA-9h9g-93gc-623h)
103
+ for more information.
104
+
105
+ *Mike Dalessio*
106
+
107
+
1
108
  ## 1.4.3 / 2022-06-09
2
109
 
3
110
  * Address a possible XSS vulnerability with certain configurations of Rails::Html::Sanitizer.
@@ -17,6 +124,7 @@
17
124
 
18
125
  *Mike Dalessio*
19
126
 
127
+
20
128
  ## 1.4.1 / 2021-08-18
21
129
 
22
130
  * Fix regression in v1.4.0 that did not pass comment nodes to the scrubber.
@@ -29,6 +137,7 @@
29
137
 
30
138
  *Mike Dalessio*
31
139
 
140
+
32
141
  ## 1.4.0 / 2021-08-18
33
142
 
34
143
  * Processing Instructions are no longer allowed by Rails::Html::PermitScrubber
@@ -41,12 +150,14 @@
41
150
 
42
151
  *Mike Dalessio*
43
152
 
153
+
44
154
  ## 1.3.0
45
155
 
46
156
  * Address deprecations in Loofah 2.3.0.
47
157
 
48
158
  *Josh Goodall*
49
159
 
160
+
50
161
  ## 1.2.0
51
162
 
52
163
  * Remove needless `white_list_sanitizer` deprecation.
@@ -61,6 +172,7 @@
61
172
 
62
173
  *Kasper Timm Hansen*
63
174
 
175
+
64
176
  ## 1.1.0
65
177
 
66
178
  * Add `safe_list_sanitizer` and deprecate `white_list_sanitizer` to be removed
@@ -78,10 +190,12 @@
78
190
 
79
191
  *Kasper Timm Hansen*
80
192
 
193
+
81
194
  ## 1.0.1
82
195
 
83
196
  * Added support for Rails 4.2.0.beta2 and above
84
197
 
198
+
85
199
  ## 1.0.0
86
200
 
87
201
  * First release.
data/MIT-LICENSE CHANGED
@@ -1,4 +1,4 @@
1
- Copyright (c) 2013-2015 Rafael Mendonça França, Kasper Timm Hansen
1
+ Copyright (c) 2013-2023 Rafael Mendonça França, Kasper Timm Hansen, Mike Dalessio
2
2
 
3
3
  MIT License
4
4
 
data/README.md CHANGED
@@ -1,61 +1,76 @@
1
- # Rails Html Sanitizers
1
+ # Rails HTML Sanitizers
2
2
 
3
- In Rails 4.2 and above this gem will be responsible for sanitizing HTML fragments in Rails
4
- applications, i.e. in the `sanitize`, `sanitize_css`, `strip_tags` and `strip_links` methods.
3
+ This gem is responsible for sanitizing HTML fragments in Rails applications. Specifically, this is the set of sanitizers used to implement the Action View `SanitizerHelper` methods `sanitize`, `sanitize_css`, `strip_tags` and `strip_links`.
5
4
 
6
- Rails Html Sanitizer is only intended to be used with Rails applications. If you need similar functionality in non Rails apps consider using [Loofah](https://github.com/flavorjones/loofah) directly (that's what handles sanitization under the hood).
5
+ Rails HTML Sanitizer is only intended to be used with Rails applications. If you need similar functionality but aren't using Rails, consider using the underlying sanitization library [Loofah](https://github.com/flavorjones/loofah) directly.
7
6
 
8
- ## Installation
9
-
10
- Add this line to your application's Gemfile:
11
7
 
12
- gem 'rails-html-sanitizer'
13
-
14
- And then execute:
8
+ ## Usage
15
9
 
16
- $ bundle
10
+ ### Sanitizers
17
11
 
18
- Or install it yourself as:
12
+ All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
19
13
 
20
- $ gem install rails-html-sanitizer
14
+ NOTE: The HTML5 sanitizers are not supported on JRuby. Users may programmatically check for support by calling `Rails::HTML::Sanitizer.html5_support?`.
21
15
 
22
- ## Usage
23
16
 
24
- ### Sanitizers
17
+ #### FullSanitizer
25
18
 
26
- All sanitizers respond to `sanitize`.
19
+ ```ruby
20
+ full_sanitizer = Rails::HTML5::FullSanitizer.new
21
+ full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
22
+ # => Bold no more! See more here...
23
+ ```
27
24
 
28
- #### FullSanitizer
25
+ or, if you insist on parsing the content as HTML4:
29
26
 
30
27
  ```ruby
31
- full_sanitizer = Rails::Html::FullSanitizer.new
28
+ full_sanitizer = Rails::HTML4::FullSanitizer.new
32
29
  full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
33
30
  # => Bold no more! See more here...
34
31
  ```
35
32
 
33
+ HTML5 version:
34
+
35
+
36
+
36
37
  #### LinkSanitizer
37
38
 
38
39
  ```ruby
39
- link_sanitizer = Rails::Html::LinkSanitizer.new
40
+ link_sanitizer = Rails::HTML5::LinkSanitizer.new
41
+ link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
42
+ # => Only the link text will be kept.
43
+ ```
44
+
45
+ or, if you insist on parsing the content as HTML4:
46
+
47
+ ```ruby
48
+ link_sanitizer = Rails::HTML4::LinkSanitizer.new
40
49
  link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
41
50
  # => Only the link text will be kept.
42
51
  ```
43
52
 
53
+
44
54
  #### SafeListSanitizer
45
55
 
56
+ This sanitizer is also available as an HTML4 variant, but for simplicity we'll document only the HTML5 variant below.
57
+
46
58
  ```ruby
47
- safe_list_sanitizer = Rails::Html::SafeListSanitizer.new
59
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new
48
60
 
49
61
  # sanitize via an extensive safe list of allowed elements
50
62
  safe_list_sanitizer.sanitize(@article.body)
51
63
 
52
- # safe list only the supplied tags and attributes
64
+ # sanitize only the supplied tags and attributes
53
65
  safe_list_sanitizer.sanitize(@article.body, tags: %w(table tr td), attributes: %w(id class style))
54
66
 
55
- # safe list via a custom scrubber
67
+ # sanitize via a custom scrubber
56
68
  safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)
57
69
 
58
- # safe list sanitizer can also sanitize css
70
+ # prune nodes from the tree instead of stripping tags and leaving inner content
71
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new(prune: true)
72
+
73
+ # the sanitizer can also sanitize css
59
74
  safe_list_sanitizer.sanitize_css('background-color: #000;')
60
75
  ```
61
76
 
@@ -63,14 +78,14 @@ safe_list_sanitizer.sanitize_css('background-color: #000;')
63
78
 
64
79
  Scrubbers are objects responsible for removing nodes or attributes you don't want in your HTML document.
65
80
 
66
- This gem includes two scrubbers `Rails::Html::PermitScrubber` and `Rails::Html::TargetScrubber`.
81
+ This gem includes two scrubbers `Rails::HTML::PermitScrubber` and `Rails::HTML::TargetScrubber`.
67
82
 
68
- #### `Rails::Html::PermitScrubber`
83
+ #### `Rails::HTML::PermitScrubber`
69
84
 
70
85
  This scrubber allows you to permit only the tags and attributes you want.
71
86
 
72
87
  ```ruby
73
- scrubber = Rails::Html::PermitScrubber.new
88
+ scrubber = Rails::HTML::PermitScrubber.new
74
89
  scrubber.tags = ['a']
75
90
 
76
91
  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -78,16 +93,34 @@ html_fragment.scrub!(scrubber)
78
93
  html_fragment.to_s # => "<a></a>"
79
94
  ```
80
95
 
81
- #### `Rails::Html::TargetScrubber`
96
+ By default, inner content is left, but it can be removed as well.
97
+
98
+ ```ruby
99
+ scrubber = Rails::HTML::PermitScrubber.new
100
+ scrubber.tags = ['a']
101
+
102
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
103
+ html_fragment.scrub!(scrubber)
104
+ html_fragment.to_s # => "<a>text</a>"
105
+
106
+ scrubber = Rails::HTML::PermitScrubber.new(prune: true)
107
+ scrubber.tags = ['a']
108
+
109
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
110
+ html_fragment.scrub!(scrubber)
111
+ html_fragment.to_s # => "<a></a>"
112
+ ```
113
+
114
+ #### `Rails::HTML::TargetScrubber`
82
115
 
83
116
  Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
84
- `Rails::Html::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
117
+ `Rails::HTML::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
85
118
 
86
119
  **Note:** by default, it will scrub anything that is not part of the permitted tags from
87
120
  loofah `HTML5::Scrub.allowed_element?`.
88
121
 
89
122
  ```ruby
90
- scrubber = Rails::Html::TargetScrubber.new
123
+ scrubber = Rails::HTML::TargetScrubber.new
91
124
  scrubber.tags = ['img']
92
125
 
93
126
  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -95,12 +128,30 @@ html_fragment.scrub!(scrubber)
95
128
  html_fragment.to_s # => "<a></a>"
96
129
  ```
97
130
 
131
+ Similarly to `PermitScrubber`, nodes can be fully pruned.
132
+
133
+ ```ruby
134
+ scrubber = Rails::HTML::TargetScrubber.new
135
+ scrubber.tags = ['span']
136
+
137
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
138
+ html_fragment.scrub!(scrubber)
139
+ html_fragment.to_s # => "<a>text</a>"
140
+
141
+ scrubber = Rails::HTML::TargetScrubber.new(prune: true)
142
+ scrubber.tags = ['span']
143
+
144
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
145
+ html_fragment.scrub!(scrubber)
146
+ html_fragment.to_s # => "<a></a>"
147
+ ```
148
+
98
149
  #### Custom Scrubbers
99
150
 
100
151
  You can also create custom scrubbers in your application if you want to.
101
152
 
102
153
  ```ruby
103
- class CommentScrubber < Rails::Html::PermitScrubber
154
+ class CommentScrubber < Rails::HTML::PermitScrubber
104
155
  def initialize
105
156
  super
106
157
  self.tags = %w( form script comment blockquote )
@@ -113,7 +164,7 @@ class CommentScrubber < Rails::Html::PermitScrubber
113
164
  end
114
165
  ```
115
166
 
116
- See `Rails::Html::PermitScrubber` documentation to learn more about which methods can be overridden.
167
+ See `Rails::HTML::PermitScrubber` documentation to learn more about which methods can be overridden.
117
168
 
118
169
  #### Custom Scrubber in a Rails app
119
170
 
@@ -123,20 +174,98 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
123
174
  <%= sanitize @comment, scrubber: CommentScrubber.new %>
124
175
  ```
125
176
 
177
+ ### A note on HTML entities
178
+
179
+ __Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
180
+
181
+ Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
182
+
183
+ This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
184
+
185
+
186
+ #### A concrete example showing the problem that can arise
187
+
188
+ Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
189
+
190
+ If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
191
+
192
+ When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.` which will render as "JPMorgan Chase &amp;amp; Co.".
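The double-escaping described above can be reproduced with a minimal sketch. It assumes Ruby's standard-library `CGI.escapeHTML` as a stand-in for the entity escaping that a sanitizer performs; it does not call this gem, and the variable names are illustrative only.

```ruby
require "cgi"

# Raw, untrusted input as the user typed it.
raw = "JPMorgan Chase & Co."

# First escape: what would be stored if the input were escaped before persisting.
stored = CGI.escapeHTML(raw)

# Second escape: what the view layer would emit if it escaped the stored value again.
rendered = CGI.escapeHTML(stored)

puts stored   # the ampersand has become an entity
puts rendered # the entity's own ampersand has been escaped a second time
```

Escaping only once, at render time, avoids the second transformation.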
193
+
194
+ Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
195
+
196
+
197
+ #### Suggested alternatives
198
+
199
+ You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
200
+
201
+ That raw string, if rendered in a non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
202
+
203
+ If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
204
+
205
+
206
+ ### A note on module names
207
+
208
+ In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
209
+
210
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
211
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
212
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
213
+
214
+ The following aliases are maintained for backwards compatibility:
215
+
216
+ - `Rails::Html` points to `Rails::HTML`
217
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
218
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
219
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
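As a rough illustration of how constant aliases like these keep old names working, the sketch below uses a hypothetical `Demo` namespace. It is not the gem's actual source; it only shows the general Ruby mechanism of assigning an existing module or class to a second constant.

```ruby
module Demo
  # New, canonical module name.
  module HTML
    class Sanitizer; end
  end

  # Backwards-compatible alias: the old constant refers to the same module object.
  Html = HTML
end

# Both names resolve to the same module, so code written against the old
# name keeps working without modification.
puts Demo::Html.equal?(Demo::HTML)
puts Demo::Html::Sanitizer.name
```

Because the alias is the same object rather than a copy, classes reached through either name are identical.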
220
+
221
+
222
+ ## Installation
223
+
224
+ Add this line to your application's Gemfile:
225
+
226
+ gem 'rails-html-sanitizer'
227
+
228
+ And then execute:
229
+
230
+ $ bundle
231
+
232
+ Or install it yourself as:
233
+
234
+ $ gem install rails-html-sanitizer
235
+
236
+
237
+ ## Support matrix
238
+
239
+ | branch | ruby support | actively maintained | security support |
240
+ |--------|--------------|---------------------|----------------------------------------|
241
+ | 1.6.x | >= 2.7 | yes | yes |
242
+ | 1.5.x | >= 2.5 | no | while Rails 6.1 is in security support |
243
+ | 1.4.x | >= 1.8.7 | no | no |
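The Ruby constraints in the table can be checked programmatically. The following is a minimal sketch using RubyGems' `Gem::Requirement` and `Gem::Version`; the `SUPPORT_MATRIX` constant and the `branches_supporting` helper are hypothetical names introduced for illustration and are not part of this gem.

```ruby
require "rubygems"

# Ruby requirements per release branch, copied from the support matrix.
SUPPORT_MATRIX = {
  "1.6.x" => Gem::Requirement.new(">= 2.7"),
  "1.5.x" => Gem::Requirement.new(">= 2.5"),
  "1.4.x" => Gem::Requirement.new(">= 1.8.7"),
}.freeze

# Returns the branches whose Ruby requirement is met by the given version string.
def branches_supporting(ruby_version)
  version = Gem::Version.new(ruby_version)
  SUPPORT_MATRIX.select { |_branch, requirement| requirement.satisfied_by?(version) }.keys
end

puts branches_supporting(RUBY_VERSION).inspect
```

Note that the table's maintenance and security columns still apply: a branch that runs on a given Ruby may no longer receive fixes.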
244
+
245
+
126
246
  ## Read more
127
247
 
128
248
  Loofah is what underlies the sanitizers and scrubbers of rails-html-sanitizer.
249
+
129
250
  - [Loofah and Loofah Scrubbers](https://github.com/flavorjones/loofah)
130
251
 
131
252
  The `node` argument passed to some methods in a custom scrubber is an instance of `Nokogiri::XML::Node`.
253
+
132
254
  - [`Nokogiri::XML::Node`](https://nokogiri.org/rdoc/Nokogiri/XML/Node.html)
133
255
  - [Nokogiri](http://nokogiri.org)
134
256
 
135
- ## Contributing to Rails Html Sanitizers
136
257
 
137
- Rails Html Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
258
+ ## Contributing to Rails HTML Sanitizers
259
+
260
+ Rails HTML Sanitizers is the work of many contributors. You're encouraged to submit pull requests, propose features, and discuss issues.
138
261
 
139
262
  See [CONTRIBUTING](CONTRIBUTING.md).
140
263
 
264
+ ### Security reports
265
+
266
+ Trying to report a possible security vulnerability in this project? Please check out the [Rails project's security policy](https://rubyonrails.org/security) for instructions.
267
+
268
+
141
269
  ## License
142
- Rails Html Sanitizers is released under the [MIT License](MIT-LICENSE).
270
+
271
+ Rails HTML Sanitizers is released under the [MIT License](MIT-LICENSE).
@@ -1,7 +1,9 @@
1
+ # frozen_string_literal: true
2
+
1
3
  module Rails
2
- module Html
4
+ module HTML
3
5
  class Sanitizer
4
- VERSION = "1.4.3"
6
+ VERSION = "1.6.0"
5
7
  end
6
8
  end
7
9
  end