rails-html-sanitizer 1.5.0 → 1.6.0.rc1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 59897b4a0d7f69a21932ec1cb44e24ea0ba4c2cf79ef7101ef32e90f40ad766b
- data.tar.gz: '063919ea5426a6938040672fefeabcaf82087a5ccd12ffc7457e34eb6984042b'
+ metadata.gz: 369872075a1b555eb1dbcdf744e8d9f01aa4ba4c8f29449ba61668da5c4063ff
+ data.tar.gz: 1ae0e8e36e37c51687c965c33d55c1a1eaaab9d4e71d089378ee62fc340e0cd1
  SHA512:
- metadata.gz: 2b0c23a07bc8acb3c1a039266cf053ad9044670a96620365b3ed722eb9a602def1bebe8de40697a2b12deba61cf224461ae8a4dc93749fa8c9675cda4cd216dd
- data.tar.gz: 5dce2af04dd887e08773a975cc67d93987b15330d023cc68a6cc51322ed73b60681309b25feab1a2c54b0e062696af1dd78c83cdb609e8d179b6ef95419573b3
+ metadata.gz: f8c948ee3f76bb85018a3491d97f89b2957247f2cae35b650ee8d1682d482377e76e2150bbf8a81a9a1aaea4384af321c36c9a621c0c1a71a5dd079cb482a144
+ data.tar.gz: 070f318bcdfb024310b59fc8ceec848c937e0d7e5c4824c40cbb80a9b783e96d98b3f8f67a19630f6fe26aaee35769df84e24aefb198b58a0b06f825a18259a4
data/CHANGELOG.md CHANGED
@@ -1,3 +1,52 @@
1
+ ## 1.6.0.rc1 / 2023-05-24
2
+
3
+ * Sanitizers that use an HTML5 parser are now available on platforms supported by
4
+ Nokogiri::HTML5. These are available as:
5
+
6
+ - `Rails::HTML5::FullSanitizer`
7
+ - `Rails::HTML5::LinkSanitizer`
8
+ - `Rails::HTML5::SafeListSanitizer`
9
+
10
+ And a new "vendor" is provided at `Rails::HTML5::Sanitizer` that can be used in a future version
11
+ of Rails.
12
+
13
+ Note that for symmetry `Rails::HTML4::Sanitizer` is also added, though its behavior is identical
14
+ to the vendor class methods on `Rails::HTML::Sanitizer`.
15
+
16
+ *Mike Dalessio*
17
+
18
+ * Module namespaces have changed, but backwards compatibility is provided by aliases.
19
+
20
+ The library defines three additional modules:
21
+
22
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
23
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
24
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5
25
+
26
+ The following aliases are maintained for backwards compatibility:
27
+
28
+ - `Rails::Html` points to `Rails::HTML`
29
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
30
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
31
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
32
+
33
+ *Mike Dalessio*
34
+
35
+ * `LinkSanitizer` always returns UTF-8 encoded strings. `SafeListSanitizer` and `FullSanitizer`
36
+ already ensured this encoding.
37
+
38
+ *Mike Dalessio*
39
+
40
+ * `SafeListSanitizer` allows `time` tag and `lang` attribute by default.
41
+
42
+ *Mike Dalessio*
43
+
44
+ * The constant `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the
45
+ existing sanitizers, and should have been a private constant all along anyway.
46
+
47
+ *Mike Dalessio*
48
+
49
+
1
50
  ## 1.5.0 / 2023-01-20
2
51
 
3
52
  * `SafeListSanitizer`, `PermitScrubber`, and `TargetScrubber` now all support pruning of unsafe tags.
@@ -7,6 +56,7 @@
7
56
 
8
57
  *seyerian*
9
58
 
59
+
10
60
  ## 1.4.4 / 2022-12-13
11
61
 
12
62
  * Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
@@ -52,6 +102,7 @@
52
102
 
53
103
  *Mike Dalessio*
54
104
 
105
+
55
106
  ## 1.4.2 / 2021-08-23
56
107
 
57
108
  * Slightly improve performance.
@@ -60,6 +111,7 @@
60
111
 
61
112
  *Mike Dalessio*
62
113
 
114
+
63
115
  ## 1.4.1 / 2021-08-18
64
116
 
65
117
  * Fix regression in v1.4.0 that did not pass comment nodes to the scrubber.
@@ -72,6 +124,7 @@
72
124
 
73
125
  *Mike Dalessio*
74
126
 
127
+
75
128
  ## 1.4.0 / 2021-08-18
76
129
 
77
130
  * Processing Instructions are no longer allowed by Rails::Html::PermitScrubber
@@ -84,12 +137,14 @@
84
137
 
85
138
  *Mike Dalessio*
86
139
 
140
+
87
141
  ## 1.3.0
88
142
 
89
143
  * Address deprecations in Loofah 2.3.0.
90
144
 
91
145
  *Josh Goodall*
92
146
 
147
+
93
148
  ## 1.2.0
94
149
 
95
150
  * Remove needless `white_list_sanitizer` deprecation.
@@ -104,6 +159,7 @@
104
159
 
105
160
  *Kasper Timm Hansen*
106
161
 
162
+
107
163
  ## 1.1.0
108
164
 
109
165
  * Add `safe_list_sanitizer` and deprecate `white_list_sanitizer` to be removed
@@ -121,10 +177,12 @@
121
177
 
122
178
  *Kasper Timm Hansen*
123
179
 
180
+
124
181
  ## 1.0.1
125
182
 
126
183
  * Added support for Rails 4.2.0.beta2 and above
127
184
 
185
+
128
186
  ## 1.0.0
129
187
 
130
188
  * First release.
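
The backwards-compatibility aliases listed in the 1.6.0.rc1 changelog entry can be pictured with plain Ruby constant assignment. The following is a minimal sketch, not the gem's source: the `Demo` namespace and the stub class with its `parser` method are hypothetical stand-ins, used only to show that an aliased constant path resolves to the very same object.

```ruby
# Hypothetical stand-in namespace; the real gem defines its modules under `Rails`.
module Demo
  module HTML4
    # Stub class standing in for a real HTML4 sanitizer (assumption for illustration).
    class FullSanitizer
      def parser
        :html4
      end
    end
  end

  module HTML
    # Alias: the older constant path resolves to the HTML4 class.
    FullSanitizer = HTML4::FullSanitizer
  end

  # Alias: the legacy module name points at the new module object.
  Html = HTML
end

legacy = Demo::Html::FullSanitizer.new
puts legacy.class                    # prints "Demo::HTML4::FullSanitizer"
puts Demo::Html.equal?(Demo::HTML)   # prints "true"
```

Because the aliases are ordinary constants pointing at the same objects, code written against the old names keeps working without any wrapper or delegation layer.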
data/MIT-LICENSE CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2013-2015 Rafael Mendonça França, Kasper Timm Hansen
+ Copyright (c) 2013-2023 Rafael Mendonça França, Kasper Timm Hansen, Mike Dalessio

  MIT License

data/README.md CHANGED
@@ -1,31 +1,17 @@
1
- # Rails Html Sanitizers
1
+ # Rails HTML Sanitizers
2
2
 
3
- In Rails 4.2 and above this gem will be responsible for sanitizing HTML fragments in Rails
4
- applications, i.e. in the `sanitize`, `sanitize_css`, `strip_tags` and `strip_links` methods.
3
+ This gem is responsible for sanitizing HTML fragments in Rails applications. Specifically, this is the set of sanitizers used to implement the Action View `SanitizerHelper` methods `sanitize`, `sanitize_css`, `strip_tags` and `strip_links`.
5
4
 
6
- Rails Html Sanitizer is only intended to be used with Rails applications. If you need similar functionality in non Rails apps consider using [Loofah](https://github.com/flavorjones/loofah) directly (that's what handles sanitization under the hood).
5
+ Rails HTML Sanitizer is only intended to be used with Rails applications. If you need similar functionality but aren't using Rails, consider using the underlying sanitization library [Loofah](https://github.com/flavorjones/loofah) directly.
7
6
 
8
- ## Installation
9
-
10
- Add this line to your application's Gemfile:
11
-
12
- gem 'rails-html-sanitizer'
13
-
14
- And then execute:
15
-
16
- $ bundle
17
-
18
- Or install it yourself as:
19
-
20
- $ gem install rails-html-sanitizer
21
7
 
22
8
  ## Usage
23
9
 
24
10
  ### A note on HTML entities
25
11
 
26
- __Rails::HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will sanitized *again* at page-render time.__
12
+ __Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
27
13
 
28
- Proper HTML sanitization will replace some characters with HTML entities. For example, `<` will be replaced with `&lt;` to ensure that the markup is well-formed.
14
+ Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
29
15
 
30
16
  This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
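
To make the double-sanitization problem concrete, here is a small sketch. It uses `ERB::Util.html_escape` from Ruby's standard library as a stand-in for entity replacement; that choice is an assumption for illustration and is not one of this gem's sanitizers.

```ruby
require "erb"

raw   = "1 < 2"
once  = ERB::Util.html_escape(raw)   # "<" becomes "&lt;"
twice = ERB::Util.html_escape(once)  # "&" in "&lt;" becomes "&amp;"

puts once    # prints "1 &lt; 2"
puts twice   # prints "1 &amp;lt; 2"
```

A browser renders the first result as `1 < 2`, but renders the second as the literal text `1 &lt; 2`, which is the improper rendering the README warns about.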
31
17
 
@@ -47,62 +33,104 @@ You might simply choose to persist the untrusted string as-is (the raw input), a
47
33
 
48
34
  That raw string, if rendered in an non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
49
35
 
50
- If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails::HTML sanitizers.
36
+ If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
37
+
38
+
39
+ ### A note on module names
40
+
41
+ In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
42
+
43
+ - `Rails::HTML` for general functionality (replacing `Rails::Html`)
44
+ - `Rails::HTML4` containing sanitizers that parse content as HTML4
45
+ - `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
46
+
47
+ The following aliases are maintained for backwards compatibility:
48
+
49
+ - `Rails::Html` points to `Rails::HTML`
50
+ - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
51
+ - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
52
+ - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
51
53
 
52
54
 
53
55
  ### Sanitizers
54
56
 
55
- All sanitizers respond to `sanitize`.
57
+ All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
58
+
59
+ NOTE: The HTML5 sanitizers are not supported on JRuby. Users may programmatically check for support by calling `Rails::HTML::Sanitizer.html5_support?`.
60
+
56
61
 
57
62
  #### FullSanitizer
58
63
 
59
64
  ```ruby
60
- full_sanitizer = Rails::Html::FullSanitizer.new
65
+ full_sanitizer = Rails::HTML5::FullSanitizer.new
61
66
  full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
62
67
  # => Bold no more! See more here...
63
68
  ```
64
69
 
70
+ or, if you insist on parsing the content as HTML4:
71
+
72
+ ```ruby
73
+ full_sanitizer = Rails::HTML4::FullSanitizer.new
74
+ full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
75
+ # => Bold no more! See more here...
76
+ ```
77
+
78
+ HTML5 version:
79
+
80
+
81
+
65
82
  #### LinkSanitizer
66
83
 
67
84
  ```ruby
68
- link_sanitizer = Rails::Html::LinkSanitizer.new
85
+ link_sanitizer = Rails::HTML5::LinkSanitizer.new
69
86
  link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
70
87
  # => Only the link text will be kept.
71
88
  ```
72
89
 
90
+ or, if you insist on parsing the content as HTML4:
91
+
92
+ ```ruby
93
+ link_sanitizer = Rails::HTML4::LinkSanitizer.new
94
+ link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
95
+ # => Only the link text will be kept.
96
+ ```
97
+
98
+
73
99
  #### SafeListSanitizer
74
100
 
101
+ This sanitizer is also available as an HTML4 variant, but for simplicity we'll document only the HTML5 variant below.
102
+
75
103
  ```ruby
76
- safe_list_sanitizer = Rails::Html::SafeListSanitizer.new
104
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new
77
105
 
78
106
  # sanitize via an extensive safe list of allowed elements
79
107
  safe_list_sanitizer.sanitize(@article.body)
80
108
 
81
- # safe list only the supplied tags and attributes
109
+ # sanitize only the supplied tags and attributes
82
110
  safe_list_sanitizer.sanitize(@article.body, tags: %w(table tr td), attributes: %w(id class style))
83
111
 
84
- # safe list via a custom scrubber
112
+ # sanitize via a custom scrubber
85
113
  safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)
86
114
 
87
- # safe list sanitizer can also sanitize css
88
- safe_list_sanitizer.sanitize_css('background-color: #000;')
115
+ # prune nodes from the tree instead of stripping tags and leaving inner content
116
+ safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new(prune: true)
89
117
 
90
- # fully prune nodes from the tree instead of stripping tags and leaving inner content
91
- safe_list_sanitizer = Rails::Html::SafeListSanitizer.new(prune: true)
118
+ # the sanitizer can also sanitize css
119
+ safe_list_sanitizer.sanitize_css('background-color: #000;')
92
120
  ```
93
121
 
94
122
  ### Scrubbers
95
123
 
96
124
  Scrubbers are objects responsible for removing nodes or attributes you don't want in your HTML document.
97
125
 
98
- This gem includes two scrubbers `Rails::Html::PermitScrubber` and `Rails::Html::TargetScrubber`.
126
+ This gem includes two scrubbers `Rails::HTML::PermitScrubber` and `Rails::HTML::TargetScrubber`.
99
127
 
100
- #### `Rails::Html::PermitScrubber`
128
+ #### `Rails::HTML::PermitScrubber`
101
129
 
102
130
  This scrubber allows you to permit only the tags and attributes you want.
103
131
 
104
132
  ```ruby
105
- scrubber = Rails::Html::PermitScrubber.new
133
+ scrubber = Rails::HTML::PermitScrubber.new
106
134
  scrubber.tags = ['a']
107
135
 
108
136
  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -113,14 +141,14 @@ html_fragment.to_s # => "<a></a>"
113
141
  By default, inner content is left, but it can be removed as well.
114
142
 
115
143
  ```ruby
116
- scrubber = Rails::Html::PermitScrubber.new
144
+ scrubber = Rails::HTML::PermitScrubber.new
117
145
  scrubber.tags = ['a']
118
146
 
119
147
  html_fragment = Loofah.fragment('<a><span>text</span></a>')
120
148
  html_fragment.scrub!(scrubber)
121
149
  html_fragment.to_s # => "<a>text</a>"
122
150
 
123
- scrubber = Rails::Html::PermitScrubber.new(prune: true)
151
+ scrubber = Rails::HTML::PermitScrubber.new(prune: true)
124
152
  scrubber.tags = ['a']
125
153
 
126
154
  html_fragment = Loofah.fragment('<a><span>text</span></a>')
@@ -128,16 +156,16 @@ html_fragment.scrub!(scrubber)
128
156
  html_fragment.to_s # => "<a></a>"
129
157
  ```
130
158
 
131
- #### `Rails::Html::TargetScrubber`
159
+ #### `Rails::HTML::TargetScrubber`
132
160
 
133
161
  Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
134
- `Rails::Html::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
162
+ `Rails::HTML::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
135
163
 
136
164
  **Note:** by default, it will scrub anything that is not part of the permitted tags from
137
165
  loofah `HTML5::Scrub.allowed_element?`.
138
166
 
139
167
  ```ruby
140
- scrubber = Rails::Html::TargetScrubber.new
168
+ scrubber = Rails::HTML::TargetScrubber.new
141
169
  scrubber.tags = ['img']
142
170
 
143
171
  html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -148,26 +176,27 @@ html_fragment.to_s # => "<a></a>"
148
176
  Similarly to `PermitScrubber`, nodes can be fully pruned.
149
177
 
150
178
  ```ruby
151
- scrubber = Rails::Html::TargetScrubber.new
179
+ scrubber = Rails::HTML::TargetScrubber.new
152
180
  scrubber.tags = ['span']
153
181
 
154
182
  html_fragment = Loofah.fragment('<a><span>text</span></a>')
155
183
  html_fragment.scrub!(scrubber)
156
184
  html_fragment.to_s # => "<a>text</a>"
157
185
 
158
- scrubber = Rails::Html::TargetScrubber.new(prune: true)
186
+ scrubber = Rails::HTML::TargetScrubber.new(prune: true)
159
187
  scrubber.tags = ['span']
160
188
 
161
189
  html_fragment = Loofah.fragment('<a><span>text</span></a>')
162
190
  html_fragment.scrub!(scrubber)
163
191
  html_fragment.to_s # => "<a></a>"
164
192
  ```
193
+
165
194
  #### Custom Scrubbers
166
195
 
167
196
  You can also create custom scrubbers in your application if you want to.
168
197
 
169
198
  ```ruby
170
- class CommentScrubber < Rails::Html::PermitScrubber
199
+ class CommentScrubber < Rails::HTML::PermitScrubber
171
200
  def initialize
172
201
  super
173
202
  self.tags = %w( form script comment blockquote )
@@ -180,7 +209,7 @@ class CommentScrubber < Rails::Html::PermitScrubber
180
209
  end
181
210
  ```
182
211
 
183
- See `Rails::Html::PermitScrubber` documentation to learn more about which methods can be overridden.
212
+ See `Rails::HTML::PermitScrubber` documentation to learn more about which methods can be overridden.
184
213
 
185
214
  #### Custom Scrubber in a Rails app
186
215
 
@@ -190,26 +219,44 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
190
219
  <%= sanitize @comment, scrubber: CommentScrubber.new %>
191
220
  ```
192
221
 
222
+ ## Installation
223
+
224
+ Add this line to your application's Gemfile:
225
+
226
+ gem 'rails-html-sanitizer'
227
+
228
+ And then execute:
229
+
230
+ $ bundle
231
+
232
+ Or install it yourself as:
233
+
234
+ $ gem install rails-html-sanitizer
235
+
236
+
193
237
  ## Read more
194
238
 
195
239
  Loofah is what underlies the sanitizers and scrubbers of rails-html-sanitizer.
240
+
196
241
  - [Loofah and Loofah Scrubbers](https://github.com/flavorjones/loofah)
197
242
 
198
243
  The `node` argument passed to some methods in a custom scrubber is an instance of `Nokogiri::XML::Node`.
244
+
199
245
  - [`Nokogiri::XML::Node`](https://nokogiri.org/rdoc/Nokogiri/XML/Node.html)
200
246
  - [Nokogiri](http://nokogiri.org)
201
247
 
202
- ## Contributing to Rails Html Sanitizers
203
248
 
204
- Rails Html Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
249
+ ## Contributing to Rails HTML Sanitizers
250
+
251
+ Rails HTML Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
205
252
 
206
253
  See [CONTRIBUTING](CONTRIBUTING.md).
207
254
 
208
255
  ### Security reports
209
256
 
210
- Trying to report a possible security vulnerability in this project? Please
211
- check out our [security policy](https://rubyonrails.org/security) for
212
- guidelines about how to proceed.
257
+ Trying to report a possible security vulnerability in this project? Please check out the [Rails project's security policy](https://rubyonrails.org/security) for instructions.
258
+
213
259
 
214
260
  ## License
215
- Rails Html Sanitizers is released under the [MIT License](MIT-LICENSE).
261
+
262
+ Rails HTML Sanitizers is released under the [MIT License](MIT-LICENSE).
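
The README change above notes that the HTML5 sanitizers are not supported on JRuby and that support can be checked with `Rails::HTML::Sanitizer.html5_support?`. A caller could use that check to fall back to the HTML4 variant. The sketch below is one possible way to do this, under stated assumptions: the helper name `preferred_full_sanitizer` is hypothetical, and the guarded `require` simply returns `nil` when the gem (version 1.6 or later) is not available.

```ruby
begin
  require "rails-html-sanitizer"
rescue LoadError
  # Gem not installed; the helper below will return nil.
end

# Hypothetical helper: prefer the HTML5 sanitizer, fall back to HTML4.
def preferred_full_sanitizer
  # The Rails::HTML namespace only exists in 1.6 and later.
  return nil unless defined?(Rails::HTML::Sanitizer)

  if Rails::HTML::Sanitizer.respond_to?(:html5_support?) &&
     Rails::HTML::Sanitizer.html5_support?
    Rails::HTML5::FullSanitizer.new
  else
    Rails::HTML4::FullSanitizer.new
  end
end

sanitizer = preferred_full_sanitizer
puts sanitizer.sanitize("<b>Bold</b> no more!") if sanitizer
```

Keeping the decision in a single helper means the rest of an application does not need to know which parser is in use.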
@@ -1,7 +1,9 @@
+ # frozen_string_literal: true
+
  module Rails
-   module Html
+   module HTML
      class Sanitizer
-       VERSION = "1.5.0"
+       VERSION = "1.6.0.rc1"
      end
    end
  end