rails-html-sanitizer 1.5.0 → 1.6.0.rc1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 59897b4a0d7f69a21932ec1cb44e24ea0ba4c2cf79ef7101ef32e90f40ad766b
- data.tar.gz: '063919ea5426a6938040672fefeabcaf82087a5ccd12ffc7457e34eb6984042b'
+ metadata.gz: 369872075a1b555eb1dbcdf744e8d9f01aa4ba4c8f29449ba61668da5c4063ff
+ data.tar.gz: 1ae0e8e36e37c51687c965c33d55c1a1eaaab9d4e71d089378ee62fc340e0cd1
  SHA512:
- metadata.gz: 2b0c23a07bc8acb3c1a039266cf053ad9044670a96620365b3ed722eb9a602def1bebe8de40697a2b12deba61cf224461ae8a4dc93749fa8c9675cda4cd216dd
- data.tar.gz: 5dce2af04dd887e08773a975cc67d93987b15330d023cc68a6cc51322ed73b60681309b25feab1a2c54b0e062696af1dd78c83cdb609e8d179b6ef95419573b3
+ metadata.gz: f8c948ee3f76bb85018a3491d97f89b2957247f2cae35b650ee8d1682d482377e76e2150bbf8a81a9a1aaea4384af321c36c9a621c0c1a71a5dd079cb482a144
+ data.tar.gz: 070f318bcdfb024310b59fc8ceec848c937e0d7e5c4824c40cbb80a9b783e96d98b3f8f67a19630f6fe26aaee35769df84e24aefb198b58a0b06f825a18259a4
data/CHANGELOG.md CHANGED
@@ -1,3 +1,52 @@
+ ## 1.6.0.rc1 / 2023-05-24
+
+ * Sanitizers that use an HTML5 parser are now available on platforms supported by
+ Nokogiri::HTML5. These are available as:
+
+ - `Rails::HTML5::FullSanitizer`
+ - `Rails::HTML5::LinkSanitizer`
+ - `Rails::HTML5::SafeListSanitizer`
+
+ And a new "vendor" is provided at `Rails::HTML5::Sanitizer` that can be used in a future version
+ of Rails.
+
+ Note that for symmetry `Rails::HTML4::Sanitizer` is also added, though its behavior is identical
+ to the vendor class methods on `Rails::HTML::Sanitizer`.
+
+ *Mike Dalessio*
+
+ * Module namespaces have changed, but backwards compatibility is provided by aliases.
+
+ The library defines three additional modules:
+
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5
+
+ The following aliases are maintained for backwards compatibility:
+
+ - `Rails::Html` points to `Rails::HTML`
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
+
+ *Mike Dalessio*
+
+ * `LinkSanitizer` always returns UTF-8 encoded strings. `SafeListSanitizer` and `FullSanitizer`
+ already ensured this encoding.
+
+ *Mike Dalessio*
+
+ * `SafeListSanitizer` allows `time` tag and `lang` attribute by default.
+
+ *Mike Dalessio*
+
+ * The constant `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the
+ existing sanitizers, and should have been a private constant all along anyway.
+
+ *Mike Dalessio*
+
+
  ## 1.5.0 / 2023-01-20

  * `SafeListSanitizer`, `PermitScrubber`, and `TargetScrubber` now all support pruning of unsafe tags.
@@ -7,6 +56,7 @@

  *seyerian*

+
  ## 1.4.4 / 2022-12-13

  * Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
@@ -52,6 +102,7 @@

  *Mike Dalessio*

+
  ## 1.4.2 / 2021-08-23

  * Slightly improve performance.
@@ -60,6 +111,7 @@

  *Mike Dalessio*

+
  ## 1.4.1 / 2021-08-18

  * Fix regression in v1.4.0 that did not pass comment nodes to the scrubber.
@@ -72,6 +124,7 @@

  *Mike Dalessio*

+
  ## 1.4.0 / 2021-08-18

  * Processing Instructions are no longer allowed by Rails::Html::PermitScrubber
@@ -84,12 +137,14 @@

  *Mike Dalessio*

+
  ## 1.3.0

  * Address deprecations in Loofah 2.3.0.

  *Josh Goodall*

+
  ## 1.2.0

  * Remove needless `white_list_sanitizer` deprecation.
@@ -104,6 +159,7 @@

  *Kasper Timm Hansen*

+
  ## 1.1.0

  * Add `safe_list_sanitizer` and deprecate `white_list_sanitizer` to be removed
@@ -121,10 +177,12 @@

  *Kasper Timm Hansen*

+
  ## 1.0.1

  * Added support for Rails 4.2.0.beta2 and above

+
  ## 1.0.0

  * First release.
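
The 1.6.0.rc1 changelog entry above says the old names keep working through aliases. As a rough illustration of how constant aliases of that kind behave in Ruby, here is a minimal sketch. The `Demo` namespace, its crude tag-stripping `sanitize`, and the `alias_demo` helper are all hypothetical and are not the gem's source; the sketch only shows that an aliased constant refers to the very same module or class object.

```ruby
# Hypothetical namespaces for illustration only; this is not the gem's code.
module Demo
  module HTML4
    class FullSanitizer
      def sanitize(html)
        # Crude regex tag stripping for the sketch; the real sanitizers parse the markup.
        html.gsub(/<[^>]*>/, "")
      end
    end
  end

  module HTML
    # Alias: the generic name points at the HTML4 implementation.
    FullSanitizer = HTML4::FullSanitizer
  end

  # Alias: the legacy module name points at the new module.
  Html = HTML
end

# Collects a few facts about the aliases so they can be inspected or printed.
def alias_demo
  {
    same_module: Demo::Html.equal?(Demo::HTML),
    same_class: Demo::Html::FullSanitizer.equal?(Demo::HTML4::FullSanitizer),
    class_name: Demo::Html::FullSanitizer.name,
    output: Demo::Html::FullSanitizer.new.sanitize("<b>Bold</b> no more!")
  }
end

p alias_demo
```

Because an alias is just a second constant bound to the same object, code written against the legacy name picks up the same class, and the class still reports the name under which it was first defined.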
data/MIT-LICENSE CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2013-2015 Rafael Mendonça França, Kasper Timm Hansen
+ Copyright (c) 2013-2023 Rafael Mendonça França, Kasper Timm Hansen, Mike Dalessio

  MIT License

data/README.md CHANGED
@@ -1,31 +1,17 @@
- # Rails Html Sanitizers
+ # Rails HTML Sanitizers

- In Rails 4.2 and above this gem will be responsible for sanitizing HTML fragments in Rails
- applications, i.e. in the `sanitize`, `sanitize_css`, `strip_tags` and `strip_links` methods.
+ This gem is responsible for sanitizing HTML fragments in Rails applications. Specifically, this is the set of sanitizers used to implement the Action View `SanitizerHelper` methods `sanitize`, `sanitize_css`, `strip_tags` and `strip_links`.

- Rails Html Sanitizer is only intended to be used with Rails applications. If you need similar functionality in non Rails apps consider using [Loofah](https://github.com/flavorjones/loofah) directly (that's what handles sanitization under the hood).
+ Rails HTML Sanitizer is only intended to be used with Rails applications. If you need similar functionality but aren't using Rails, consider using the underlying sanitization library [Loofah](https://github.com/flavorjones/loofah) directly.

- ## Installation
-
- Add this line to your application's Gemfile:
-
-     gem 'rails-html-sanitizer'
-
- And then execute:
-
-     $ bundle
-
- Or install it yourself as:
-
-     $ gem install rails-html-sanitizer

  ## Usage

  ### A note on HTML entities

- __Rails::HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will sanitized *again* at page-render time.__
+ __Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__

- Proper HTML sanitization will replace some characters with HTML entities. For example, `<` will be replaced with `&lt;` to ensure that the markup is well-formed.
+ Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.

  This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__

@@ -47,62 +33,104 @@ You might simply choose to persist the untrusted string as-is (the raw input), a

  That raw string, if rendered in an non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.

- If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails::HTML sanitizers.
+ If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
+
+
+ ### A note on module names
+
+ In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
+
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
+
+ The following aliases are maintained for backwards compatibility:
+
+ - `Rails::Html` points to `Rails::HTML`
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`


  ### Sanitizers

- All sanitizers respond to `sanitize`.
+ All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
+
+ NOTE: The HTML5 sanitizers are not supported on JRuby. Users may programmatically check for support by calling `Rails::HTML::Sanitizer.html5_support?`.
+

  #### FullSanitizer

  ```ruby
- full_sanitizer = Rails::Html::FullSanitizer.new
+ full_sanitizer = Rails::HTML5::FullSanitizer.new
  full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
  # => Bold no more! See more here...
  ```

+ or, if you insist on parsing the content as HTML4:
+
+ ```ruby
+ full_sanitizer = Rails::HTML4::FullSanitizer.new
+ full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
+ # => Bold no more! See more here...
+ ```
+
+ HTML5 version:
+
+
+
  #### LinkSanitizer

  ```ruby
- link_sanitizer = Rails::Html::LinkSanitizer.new
+ link_sanitizer = Rails::HTML5::LinkSanitizer.new
  link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
  # => Only the link text will be kept.
  ```

+ or, if you insist on parsing the content as HTML4:
+
+ ```ruby
+ link_sanitizer = Rails::HTML4::LinkSanitizer.new
+ link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
+ # => Only the link text will be kept.
+ ```
+
+
  #### SafeListSanitizer

+ This sanitizer is also available as an HTML4 variant, but for simplicity we'll document only the HTML5 variant below.
+
  ```ruby
- safe_list_sanitizer = Rails::Html::SafeListSanitizer.new
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new

  # sanitize via an extensive safe list of allowed elements
  safe_list_sanitizer.sanitize(@article.body)

- # safe list only the supplied tags and attributes
+ # sanitize only the supplied tags and attributes
  safe_list_sanitizer.sanitize(@article.body, tags: %w(table tr td), attributes: %w(id class style))

- # safe list via a custom scrubber
+ # sanitize via a custom scrubber
  safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)

- # safe list sanitizer can also sanitize css
- safe_list_sanitizer.sanitize_css('background-color: #000;')
+ # prune nodes from the tree instead of stripping tags and leaving inner content
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new(prune: true)

- # fully prune nodes from the tree instead of stripping tags and leaving inner content
- safe_list_sanitizer = Rails::Html::SafeListSanitizer.new(prune: true)
+ # the sanitizer can also sanitize css
+ safe_list_sanitizer.sanitize_css('background-color: #000;')
  ```

  ### Scrubbers

  Scrubbers are objects responsible for removing nodes or attributes you don't want in your HTML document.

- This gem includes two scrubbers `Rails::Html::PermitScrubber` and `Rails::Html::TargetScrubber`.
+ This gem includes two scrubbers `Rails::HTML::PermitScrubber` and `Rails::HTML::TargetScrubber`.

- #### `Rails::Html::PermitScrubber`
+ #### `Rails::HTML::PermitScrubber`

  This scrubber allows you to permit only the tags and attributes you want.

  ```ruby
- scrubber = Rails::Html::PermitScrubber.new
+ scrubber = Rails::HTML::PermitScrubber.new
  scrubber.tags = ['a']

  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -113,14 +141,14 @@ html_fragment.to_s # => "<a></a>"
  By default, inner content is left, but it can be removed as well.

  ```ruby
- scrubber = Rails::Html::PermitScrubber.new
+ scrubber = Rails::HTML::PermitScrubber.new
  scrubber.tags = ['a']

  html_fragment = Loofah.fragment('<a><span>text</span></a>')
  html_fragment.scrub!(scrubber)
  html_fragment.to_s # => "<a>text</a>"

- scrubber = Rails::Html::PermitScrubber.new(prune: true)
+ scrubber = Rails::HTML::PermitScrubber.new(prune: true)
  scrubber.tags = ['a']

  html_fragment = Loofah.fragment('<a><span>text</span></a>')
@@ -128,16 +156,16 @@ html_fragment.scrub!(scrubber)
  html_fragment.to_s # => "<a></a>"
  ```

- #### `Rails::Html::TargetScrubber`
+ #### `Rails::HTML::TargetScrubber`

  Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
- `Rails::Html::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
+ `Rails::HTML::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.

  **Note:** by default, it will scrub anything that is not part of the permitted tags from
  loofah `HTML5::Scrub.allowed_element?`.

  ```ruby
- scrubber = Rails::Html::TargetScrubber.new
+ scrubber = Rails::HTML::TargetScrubber.new
  scrubber.tags = ['img']

  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -148,26 +176,27 @@ html_fragment.to_s # => "<a></a>"
  Similarly to `PermitScrubber`, nodes can be fully pruned.

  ```ruby
- scrubber = Rails::Html::TargetScrubber.new
+ scrubber = Rails::HTML::TargetScrubber.new
  scrubber.tags = ['span']

  html_fragment = Loofah.fragment('<a><span>text</span></a>')
  html_fragment.scrub!(scrubber)
  html_fragment.to_s # => "<a>text</a>"

- scrubber = Rails::Html::TargetScrubber.new(prune: true)
+ scrubber = Rails::HTML::TargetScrubber.new(prune: true)
  scrubber.tags = ['span']

  html_fragment = Loofah.fragment('<a><span>text</span></a>')
  html_fragment.scrub!(scrubber)
  html_fragment.to_s # => "<a></a>"
  ```
+
  #### Custom Scrubbers

  You can also create custom scrubbers in your application if you want to.

  ```ruby
- class CommentScrubber < Rails::Html::PermitScrubber
+ class CommentScrubber < Rails::HTML::PermitScrubber
    def initialize
      super
      self.tags = %w( form script comment blockquote )
@@ -180,7 +209,7 @@ class CommentScrubber < Rails::Html::PermitScrubber
  end
  ```

- See `Rails::Html::PermitScrubber` documentation to learn more about which methods can be overridden.
+ See `Rails::HTML::PermitScrubber` documentation to learn more about which methods can be overridden.

  #### Custom Scrubber in a Rails app

@@ -190,26 +219,44 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
  <%= sanitize @comment, scrubber: CommentScrubber.new %>
  ```

+ ## Installation
+
+ Add this line to your application's Gemfile:
+
+     gem 'rails-html-sanitizer'
+
+ And then execute:
+
+     $ bundle
+
+ Or install it yourself as:
+
+     $ gem install rails-html-sanitizer
+
+
  ## Read more

  Loofah is what underlies the sanitizers and scrubbers of rails-html-sanitizer.
+
  - [Loofah and Loofah Scrubbers](https://github.com/flavorjones/loofah)

  The `node` argument passed to some methods in a custom scrubber is an instance of `Nokogiri::XML::Node`.
+
  - [`Nokogiri::XML::Node`](https://nokogiri.org/rdoc/Nokogiri/XML/Node.html)
  - [Nokogiri](http://nokogiri.org)

- ## Contributing to Rails Html Sanitizers

- Rails Html Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
+ ## Contributing to Rails HTML Sanitizers
+
+ Rails HTML Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.

  See [CONTRIBUTING](CONTRIBUTING.md).

  ### Security reports

- Trying to report a possible security vulnerability in this project? Please
- check out our [security policy](https://rubyonrails.org/security) for
- guidelines about how to proceed.
+ Trying to report a possible security vulnerability in this project? Please check out the [Rails project's security policy](https://rubyonrails.org/security) for instructions.
+

  ## License
- Rails Html Sanitizers is released under the [MIT License](MIT-LICENSE).
+
+ Rails HTML Sanitizers is released under the [MIT License](MIT-LICENSE).
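
The README's note on HTML entities warns that entities render improperly when a string is sanitized twice. The general double-escaping hazard can be sketched with the standard library alone. This uses `ERB::Util.html_escape` as a stand-in that only escapes entities; it is not the gem's sanitizers, and the helper names `escape_once` and `escape_twice` are hypothetical.

```ruby
require "erb"

# Stand-in for a render-time escaping step (assumption: plain entity escaping,
# not the gem's sanitizers).
def escape_once(text)
  ERB::Util.html_escape(text)
end

# Simulates escaping at persistence time and then again at render time.
def escape_twice(text)
  escape_once(escape_once(text))
end

puts escape_once("1 < 2")   # prints "1 &lt; 2"
puts escape_twice("1 < 2")  # prints "1 &amp;lt; 2"
```

A browser displays the first result as `1 < 2`, but the second as the literal text `1 &lt; 2`, which is the kind of improper rendering the README describes.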
@@ -1,7 +1,9 @@
+ # frozen_string_literal: true
+
  module Rails
-   module Html
+   module HTML
      class Sanitizer
-       VERSION = "1.5.0"
+       VERSION = "1.6.0.rc1"
      end
    end
  end
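
Since the new `VERSION` string is a release candidate, it may help to see how RubyGems orders it relative to the previous release. The following is a small sketch using `Gem::Version` from RubyGems; the `describe_bump` helper is hypothetical and exists only for this illustration.

```ruby
require "rubygems"

# Hypothetical helper: summarizes how RubyGems orders two version strings.
def describe_bump(from, to)
  from_v = Gem::Version.new(from)
  to_v = Gem::Version.new(to)
  {
    newer: to_v > from_v,                # is the target newer than the source?
    prerelease: to_v.prerelease?,        # does the target contain a prerelease segment?
    before_final: to_v < to_v.release    # does it sort before its final release?
  }
end

p describe_bump("1.5.0", "1.6.0.rc1")
```

Under RubyGems ordering, `1.6.0.rc1` is newer than `1.5.0`, is flagged as a prerelease, and sorts before the eventual `1.6.0`.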