rails-html-sanitizer 1.4.4 → 1.6.0.rc1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: a74021096590326ee357971bec71d2c4507a95cdaf05c8e21d383ce18fee18d3
4
- data.tar.gz: faad0d5f268dad601b633b03912e353fcc2d760fceb253d9cde2064b010b997a
3
+ metadata.gz: 369872075a1b555eb1dbcdf744e8d9f01aa4ba4c8f29449ba61668da5c4063ff
4
+ data.tar.gz: 1ae0e8e36e37c51687c965c33d55c1a1eaaab9d4e71d089378ee62fc340e0cd1
5
5
  SHA512:
6
- metadata.gz: e7f01438708076a283326c78b052ba954a42de4134d8d1d7e7c336c82ecd04c661f75dad3a0f9b1ffebe278f76ef229c98a3f2568801f82d94c94a50f399a2ef
7
- data.tar.gz: 4f44c0e92eb9e565611772ba28d426025621c0517c4217004c3409192991a17498dd38165a6c55561a5347d2fcdf34f51b24101ad6de525604e35785e89efbc0
6
+ metadata.gz: f8c948ee3f76bb85018a3491d97f89b2957247f2cae35b650ee8d1682d482377e76e2150bbf8a81a9a1aaea4384af321c36c9a621c0c1a71a5dd079cb482a144
7
+ data.tar.gz: 070f318bcdfb024310b59fc8ceec848c937e0d7e5c4824c40cbb80a9b783e96d98b3f8f67a19630f6fe26aaee35769df84e24aefb198b58a0b06f825a18259a4
data/CHANGELOG.md CHANGED
@@ -1,3 +1,62 @@
1
+ ## 1.6.0.rc1 / 2023-05-24
2
+
3
+ * Sanitizers that use an HTML5 parser are now available on platforms supported by
4
+ Nokogiri::HTML5. These are available as:
5
+
6
+ - `Rails::HTML5::FullSanitizer`
7
+ - `Rails::HTML5::LinkSanitizer`
8
+ - `Rails::HTML5::SafeListSanitizer`
9
+
10
+ And a new "vendor" is provided at `Rails::HTML5::Sanitizer` that can be used in a future version
11
+ of Rails.
12
+
13
+ Note that for symmetry `Rails::HTML4::Sanitizer` is also added, though its behavior is identical
14
+ to the vendor class methods on `Rails::HTML::Sanitizer`.
15
+
16
+ *Mike Dalessio*
17
+
18
+ * Module namespaces have changed, but backwards compatibility is provided by aliases.
19
+
20
+ The library defines three additional modules:
21
+
22
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
23
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
24
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5
25
+
26
+ The following aliases are maintained for backwards compatibility:
27
+
28
+ - `Rails::Html` points to `Rails::HTML`
29
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
30
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
31
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
32
+
33
+ *Mike Dalessio*
34
+
35
+ * `LinkSanitizer` always returns UTF-8 encoded strings. `SafeListSanitizer` and `FullSanitizer`
36
+ already ensured this encoding.
37
+
38
+ *Mike Dalessio*
39
+
40
+ * `SafeListSanitizer` allows `time` tag and `lang` attribute by default.
41
+
42
+ *Mike Dalessio*
43
+
44
+ * The constant `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the
45
+ existing sanitizers, and should have been a private constant all along anyway.
46
+
47
+ *Mike Dalessio*
48
+
49
+
50
+ ## 1.5.0 / 2023-01-20
51
+
52
+ * `SafeListSanitizer`, `PermitScrubber`, and `TargetScrubber` now all support pruning of unsafe tags.
53
+
54
+ By default, unsafe tags are still stripped, but this behavior can be changed to prune the element
55
+ and its children from the document by passing `prune: true` to any of these classes' constructors.
56
+
57
+ *seyerian*
58
+
59
+
1
60
  ## 1.4.4 / 2022-12-13
2
61
 
3
62
  * Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
@@ -52,6 +111,7 @@
52
111
 
53
112
  *Mike Dalessio*
54
113
 
114
+
55
115
  ## 1.4.1 / 2021-08-18
56
116
 
57
117
  * Fix regression in v1.4.0 that did not pass comment nodes to the scrubber.
@@ -64,6 +124,7 @@
64
124
 
65
125
  *Mike Dalessio*
66
126
 
127
+
67
128
  ## 1.4.0 / 2021-08-18
68
129
 
69
130
  * Processing Instructions are no longer allowed by Rails::Html::PermitScrubber
@@ -76,12 +137,14 @@
76
137
 
77
138
  *Mike Dalessio*
78
139
 
140
+
79
141
  ## 1.3.0
80
142
 
81
143
  * Address deprecations in Loofah 2.3.0.
82
144
 
83
145
  *Josh Goodall*
84
146
 
147
+
85
148
  ## 1.2.0
86
149
 
87
150
  * Remove needless `white_list_sanitizer` deprecation.
@@ -96,6 +159,7 @@
96
159
 
97
160
  *Kasper Timm Hansen*
98
161
 
162
+
99
163
  ## 1.1.0
100
164
 
101
165
  * Add `safe_list_sanitizer` and deprecate `white_list_sanitizer` to be removed
@@ -113,10 +177,12 @@
113
177
 
114
178
  *Kasper Timm Hansen*
115
179
 
180
+
116
181
  ## 1.0.1
117
182
 
118
183
  * Added support for Rails 4.2.0.beta2 and above
119
184
 
185
+
120
186
  ## 1.0.0
121
187
 
122
188
  * First release.
data/MIT-LICENSE CHANGED
@@ -1,4 +1,4 @@
1
- Copyright (c) 2013-2015 Rafael Mendonça França, Kasper Timm Hansen
1
+ Copyright (c) 2013-2023 Rafael Mendonça França, Kasper Timm Hansen, Mike Dalessio
2
2
 
3
3
  MIT License
4
4
 
data/README.md CHANGED
@@ -1,61 +1,121 @@
1
- # Rails Html Sanitizers
1
+ # Rails HTML Sanitizers
2
2
 
3
- In Rails 4.2 and above this gem will be responsible for sanitizing HTML fragments in Rails
4
- applications, i.e. in the `sanitize`, `sanitize_css`, `strip_tags` and `strip_links` methods.
3
+ This gem is responsible for sanitizing HTML fragments in Rails applications. Specifically, this is the set of sanitizers used to implement the Action View `SanitizerHelper` methods `sanitize`, `sanitize_css`, `strip_tags` and `strip_links`.
5
4
 
6
- Rails Html Sanitizer is only intended to be used with Rails applications. If you need similar functionality in non Rails apps consider using [Loofah](https://github.com/flavorjones/loofah) directly (that's what handles sanitization under the hood).
5
+ Rails HTML Sanitizer is only intended to be used with Rails applications. If you need similar functionality but aren't using Rails, consider using the underlying sanitization library [Loofah](https://github.com/flavorjones/loofah) directly.
7
6
 
8
- ## Installation
9
7
 
10
- Add this line to your application's Gemfile:
8
+ ## Usage
11
9
 
12
- gem 'rails-html-sanitizer'
10
+ ### A note on HTML entities
13
11
 
14
- And then execute:
12
+ __Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
15
13
 
16
- $ bundle
14
+ Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
17
15
 
18
- Or install it yourself as:
16
+ This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
19
17
 
20
- $ gem install rails-html-sanitizer
21
18
 
22
- ## Usage
19
+ #### A concrete example showing the problem that can arise
20
+
21
+ Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
22
+
23
+ If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
24
+
25
+ When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.` which will render as "JPMorgan Chase &amp;amp; Co.".
26
+
27
+ Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
28
+
29
+
30
+ #### Suggested alternatives
31
+
32
+ You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
33
+
34
+ That raw string, if rendered in an non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
35
+
36
+ If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
37
+
38
+
39
+ ### A note on module names
40
+
41
+ In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
42
+
43
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
44
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
45
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
46
+
47
+ The following aliases are maintained for backwards compatibility:
48
+
49
+ - `Rails::Html` points to `Rails::HTML`
50
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
51
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
52
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
53
+
23
54
 
24
55
  ### Sanitizers
25
56
 
26
- All sanitizers respond to `sanitize`.
57
+ All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
58
+
59
+ NOTE: The HTML5 sanitizers are not supported on JRuby. Users may programmatically check for support by calling `Rails::HTML::Sanitizer.html5_support?`.
60
+
27
61
 
28
62
  #### FullSanitizer
29
63
 
30
64
  ```ruby
31
- full_sanitizer = Rails::Html::FullSanitizer.new
65
+ full_sanitizer = Rails::HTML5::FullSanitizer.new
32
66
  full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
33
67
  # => Bold no more! See more here...
34
68
  ```
35
69
 
70
+ or, if you insist on parsing the content as HTML4:
71
+
72
+ ```ruby
73
+ full_sanitizer = Rails::HTML4::FullSanitizer.new
74
+ full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
75
+ # => Bold no more! See more here...
76
+ ```
77
+
78
+ HTML5 version:
79
+
80
+
81
+
36
82
  #### LinkSanitizer
37
83
 
38
84
  ```ruby
39
- link_sanitizer = Rails::Html::LinkSanitizer.new
85
+ link_sanitizer = Rails::HTML5::LinkSanitizer.new
40
86
  link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
41
87
  # => Only the link text will be kept.
42
88
  ```
43
89
 
90
+ or, if you insist on parsing the content as HTML4:
91
+
92
+ ```ruby
93
+ link_sanitizer = Rails::HTML4::LinkSanitizer.new
94
+ link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
95
+ # => Only the link text will be kept.
96
+ ```
97
+
98
+
44
99
  #### SafeListSanitizer
45
100
 
101
+ This sanitizer is also available as an HTML4 variant, but for simplicity we'll document only the HTML5 variant below.
102
+
46
103
  ```ruby
47
- safe_list_sanitizer = Rails::Html::SafeListSanitizer.new
104
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new
48
105
 
49
106
  # sanitize via an extensive safe list of allowed elements
50
107
  safe_list_sanitizer.sanitize(@article.body)
51
108
 
52
- # safe list only the supplied tags and attributes
109
+ # sanitize only the supplied tags and attributes
53
110
  safe_list_sanitizer.sanitize(@article.body, tags: %w(table tr td), attributes: %w(id class style))
54
111
 
55
- # safe list via a custom scrubber
112
+ # sanitize via a custom scrubber
56
113
  safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)
57
114
 
58
- # safe list sanitizer can also sanitize css
115
+ # prune nodes from the tree instead of stripping tags and leaving inner content
116
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new(prune: true)
117
+
118
+ # the sanitizer can also sanitize css
59
119
  safe_list_sanitizer.sanitize_css('background-color: #000;')
60
120
  ```
61
121
 
@@ -63,14 +123,14 @@ safe_list_sanitizer.sanitize_css('background-color: #000;')
63
123
 
64
124
  Scrubbers are objects responsible for removing nodes or attributes you don't want in your HTML document.
65
125
 
66
- This gem includes two scrubbers `Rails::Html::PermitScrubber` and `Rails::Html::TargetScrubber`.
126
+ This gem includes two scrubbers `Rails::HTML::PermitScrubber` and `Rails::HTML::TargetScrubber`.
67
127
 
68
- #### `Rails::Html::PermitScrubber`
128
+ #### `Rails::HTML::PermitScrubber`
69
129
 
70
130
  This scrubber allows you to permit only the tags and attributes you want.
71
131
 
72
132
  ```ruby
73
- scrubber = Rails::Html::PermitScrubber.new
133
+ scrubber = Rails::HTML::PermitScrubber.new
74
134
  scrubber.tags = ['a']
75
135
 
76
136
  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -78,16 +138,34 @@ html_fragment.scrub!(scrubber)
78
138
  html_fragment.to_s # => "<a></a>"
79
139
  ```
80
140
 
81
- #### `Rails::Html::TargetScrubber`
141
+ By default, inner content is left, but it can be removed as well.
142
+
143
+ ```ruby
144
+ scrubber = Rails::HTML::PermitScrubber.new
145
+ scrubber.tags = ['a']
146
+
147
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
148
+ html_fragment.scrub!(scrubber)
149
+ html_fragment.to_s # => "<a>text</a>"
150
+
151
+ scrubber = Rails::HTML::PermitScrubber.new(prune: true)
152
+ scrubber.tags = ['a']
153
+
154
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
155
+ html_fragment.scrub!(scrubber)
156
+ html_fragment.to_s # => "<a></a>"
157
+ ```
158
+
159
+ #### `Rails::HTML::TargetScrubber`
82
160
 
83
161
  Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
84
- `Rails::Html::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
162
+ `Rails::HTML::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
85
163
 
86
164
  **Note:** by default, it will scrub anything that is not part of the permitted tags from
87
165
  loofah `HTML5::Scrub.allowed_element?`.
88
166
 
89
167
  ```ruby
90
- scrubber = Rails::Html::TargetScrubber.new
168
+ scrubber = Rails::HTML::TargetScrubber.new
91
169
  scrubber.tags = ['img']
92
170
 
93
171
  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -95,12 +173,30 @@ html_fragment.scrub!(scrubber)
95
173
  html_fragment.to_s # => "<a></a>"
96
174
  ```
97
175
 
176
+ Similarly to `PermitScrubber`, nodes can be fully pruned.
177
+
178
+ ```ruby
179
+ scrubber = Rails::HTML::TargetScrubber.new
180
+ scrubber.tags = ['span']
181
+
182
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
183
+ html_fragment.scrub!(scrubber)
184
+ html_fragment.to_s # => "<a>text</a>"
185
+
186
+ scrubber = Rails::HTML::TargetScrubber.new(prune: true)
187
+ scrubber.tags = ['span']
188
+
189
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
190
+ html_fragment.scrub!(scrubber)
191
+ html_fragment.to_s # => "<a></a>"
192
+ ```
193
+
98
194
  #### Custom Scrubbers
99
195
 
100
196
  You can also create custom scrubbers in your application if you want to.
101
197
 
102
198
  ```ruby
103
- class CommentScrubber < Rails::Html::PermitScrubber
199
+ class CommentScrubber < Rails::HTML::PermitScrubber
104
200
  def initialize
105
201
  super
106
202
  self.tags = %w( form script comment blockquote )
@@ -113,7 +209,7 @@ class CommentScrubber < Rails::Html::PermitScrubber
113
209
  end
114
210
  ```
115
211
 
116
- See `Rails::Html::PermitScrubber` documentation to learn more about which methods can be overridden.
212
+ See `Rails::HTML::PermitScrubber` documentation to learn more about which methods can be overridden.
117
213
 
118
214
  #### Custom Scrubber in a Rails app
119
215
 
@@ -123,20 +219,44 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
123
219
  <%= sanitize @comment, scrubber: CommentScrubber.new %>
124
220
  ```
125
221
 
222
+ ## Installation
223
+
224
+ Add this line to your application's Gemfile:
225
+
226
+ gem 'rails-html-sanitizer'
227
+
228
+ And then execute:
229
+
230
+ $ bundle
231
+
232
+ Or install it yourself as:
233
+
234
+ $ gem install rails-html-sanitizer
235
+
236
+
126
237
  ## Read more
127
238
 
128
239
  Loofah is what underlies the sanitizers and scrubbers of rails-html-sanitizer.
240
+
129
241
  - [Loofah and Loofah Scrubbers](https://github.com/flavorjones/loofah)
130
242
 
131
243
  The `node` argument passed to some methods in a custom scrubber is an instance of `Nokogiri::XML::Node`.
244
+
132
245
  - [`Nokogiri::XML::Node`](https://nokogiri.org/rdoc/Nokogiri/XML/Node.html)
133
246
  - [Nokogiri](http://nokogiri.org)
134
247
 
135
- ## Contributing to Rails Html Sanitizers
136
248
 
137
- Rails Html Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
249
+ ## Contributing to Rails HTML Sanitizers
250
+
251
+ Rails HTML Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
138
252
 
139
253
  See [CONTRIBUTING](CONTRIBUTING.md).
140
254
 
255
+ ### Security reports
256
+
257
+ Trying to report a possible security vulnerability in this project? Please check out the [Rails project's security policy](https://rubyonrails.org/security) for instructions.
258
+
259
+
141
260
  ## License
142
- Rails Html Sanitizers is released under the [MIT License](MIT-LICENSE).
261
+
262
+ Rails HTML Sanitizers is released under the [MIT License](MIT-LICENSE).
@@ -1,7 +1,9 @@
1
+ # frozen_string_literal: true
2
+
1
3
  module Rails
2
- module Html
4
+ module HTML
3
5
  class Sanitizer
4
- VERSION = "1.4.4"
6
+ VERSION = "1.6.0.rc1"
5
7
  end
6
8
  end
7
9
  end
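
The double-sanitization problem described in the README's note on HTML entities can be sketched without the gem itself. This is a minimal illustration, assuming that entity escaping behaves like the standard library's `ERB::Util.html_escape`; the helper name `escape_entities` is hypothetical and is not part of rails-html-sanitizer.

```ruby
require "erb"

# Hypothetical stand-in for the entity escaping a sanitizer performs.
# This is NOT the rails-html-sanitizer API; it only mimics the escaping step.
def escape_entities(text)
  ERB::Util.html_escape(text)
end

raw_input = "JPMorgan Chase & Co."

# Escaped once, for example before persisting the value to the database.
stored = escape_entities(raw_input)

# Escaped a second time at page-render time by the view layer.
rendered = escape_entities(stored)

puts stored    # JPMorgan Chase &amp; Co.
puts rendered  # JPMorgan Chase &amp;amp; Co.
```

Persisting `raw_input` as-is and escaping only at render time avoids the second pass, which is the alternative the README suggests.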