rails-html-sanitizer 1.5.0 → 1.6.0.rc2
- checksums.yaml +4 -4
- data/CHANGELOG.md +61 -0
- data/MIT-LICENSE +1 -1
- data/README.md +95 -48
- data/lib/rails/html/sanitizer/version.rb +4 -2
- data/lib/rails/html/sanitizer.rb +371 -104
- data/lib/rails/html/scrubbers.rb +70 -69
- data/lib/rails-html-sanitizer.rb +7 -23
- data/test/rails_api_test.rb +88 -0
- data/test/sanitizer_test.rb +900 -590
- data/test/scrubbers_test.rb +49 -36
- metadata +21 -65
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 3ce8562c96b3e842ebf50227e682c3fa948ebf8474786f100dbf78adff7f98d0
+  data.tar.gz: 7ca7beb76be35dea0dd926819212445e885a2b205ca0b2e45628f58d734a1a9f
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 30d9b9288698da75f713811e8b507edda0645eb4a485b0466847a4b8246aa854cd12e4dae238e61b09e371c950ee8516595b39207d4b10323bf81cf74b0a5114
+  data.tar.gz: 81698f017c423bac3434e7129b70c4ecba80b27a4f8c6d86294d548aeeb9238d98cd65c15f06bcf62c6ee104e8b3beca782c8e9b338302590c33b65ab9ed8121
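The digests in `checksums.yaml` cover the two archives packed inside the `.gem` file. As a rough illustration of how such a digest could be checked, here is a minimal sketch using Ruby's standard `digest` library. The `digest_matches?` helper and the sample bytes are hypothetical and are not part of RubyGems or of this gem.

```ruby
require "digest"

# Hypothetical helper: compare raw bytes against an expected hex digest.
# The algorithm defaults to SHA256; Digest::SHA512 can be passed instead.
def digest_matches?(bytes, expected_hex, algorithm: Digest::SHA256)
  algorithm.hexdigest(bytes) == expected_hex.downcase
end

# Stand-in for the contents of data.tar.gz (assumption: bytes already in memory).
sample = "example archive bytes"
expected = Digest::SHA256.hexdigest(sample)

puts digest_matches?(sample, expected)              # true
puts digest_matches?(sample + "tampered", expected) # false
```

In practice the expected value would come from the `SHA256` or `SHA512` entry for the archive being checked.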
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,55 @@
+## 1.6.0.rc2 / 2023-05-24
+
+* HTML5 standards-compliant sanitizers are now available on platforms supported by
+  Nokogiri::HTML5. These are available as:
+
+  - `Rails::HTML5::FullSanitizer`
+  - `Rails::HTML5::LinkSanitizer`
+  - `Rails::HTML5::SafeListSanitizer`
+
+  And a new "vendor" is provided at `Rails::HTML5::Sanitizer` that can be used in a future version
+  of Rails.
+
+  Note that for symmetry `Rails::HTML4::Sanitizer` is also added, though its behavior is identical
+  to the vendor class methods on `Rails::HTML::Sanitizer`.
+
+  Users may call `Rails::HTML::Sanitizer.best_supported_vendor` to get back the HTML5 vendor if it's
+  supported, else the legacy HTML4 vendor.
+
+  *Mike Dalessio*
+
+* Module namespaces have changed, but backwards compatibility is provided by aliases.
+
+  The library defines three additional modules:
+
+  - `Rails::HTML` for general functionality (replacing `Rails::Html`)
+  - `Rails::HTML4` containing sanitizers that parse content as HTML4
+  - `Rails::HTML5` containing sanitizers that parse content as HTML5
+
+  The following aliases are maintained for backwards compatibility:
+
+  - `Rails::Html` points to `Rails::HTML`
+  - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
+  - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
+  - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
+
+  *Mike Dalessio*
+
+* `LinkSanitizer` always returns UTF-8 encoded strings. `SafeListSanitizer` and `FullSanitizer`
+  already ensured this encoding.
+
+  *Mike Dalessio*
+
+* `SafeListSanitizer` allows `time` tag and `lang` attribute by default.
+
+  *Mike Dalessio*
+
+* The constant `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the
+  existing sanitizers, and should have been a private constant all along anyway.
+
+  *Mike Dalessio*
+
+
 ## 1.5.0 / 2023-01-20
 
 * `SafeListSanitizer`, `PermitScrubber`, and `TargetScrubber` now all support pruning of unsafe tags.
@@ -7,6 +59,7 @@
 
   *seyerian*
 
+
 ## 1.4.4 / 2022-12-13
 
 * Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
@@ -52,6 +105,7 @@
 
   *Mike Dalessio*
 
+
 ## 1.4.2 / 2021-08-23
 
 * Slightly improve performance.
@@ -60,6 +114,7 @@
 
   *Mike Dalessio*
 
+
 ## 1.4.1 / 2021-08-18
 
 * Fix regression in v1.4.0 that did not pass comment nodes to the scrubber.
@@ -72,6 +127,7 @@
 
   *Mike Dalessio*
 
+
 ## 1.4.0 / 2021-08-18
 
 * Processing Instructions are no longer allowed by Rails::Html::PermitScrubber
@@ -84,12 +140,14 @@
 
   *Mike Dalessio*
 
+
 ## 1.3.0
 
 * Address deprecations in Loofah 2.3.0.
 
   *Josh Goodall*
 
+
 ## 1.2.0
 
 * Remove needless `white_list_sanitizer` deprecation.
@@ -104,6 +162,7 @@
 
   *Kasper Timm Hansen*
 
+
 ## 1.1.0
 
 * Add `safe_list_sanitizer` and deprecate `white_list_sanitizer` to be removed
@@ -121,10 +180,12 @@
 
   *Kasper Timm Hansen*
 
+
 ## 1.0.1
 
 * Added support for Rails 4.2.0.beta2 and above
 
+
 ## 1.0.0
 
 * First release.
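The backwards-compatibility aliases listed in the 1.6.0.rc2 changelog entry can be pictured with plain Ruby constant assignment. The sketch below uses a hypothetical `Demo` namespace with empty classes. It is not the gem's source and only illustrates why old constant paths keep resolving to the same class objects.

```ruby
# Hypothetical stand-in namespace; none of this is the gem's actual code.
module Demo
  module HTML4
    class FullSanitizer; end
  end

  module HTML
    # The older constant path keeps working by pointing at the HTML4 class.
    FullSanitizer = HTML4::FullSanitizer
  end

  # The legacy spelling of the module name resolves to the new module.
  Html = HTML
end

puts Demo::Html.equal?(Demo::HTML)                                # true
puts Demo::Html::FullSanitizer.equal?(Demo::HTML4::FullSanitizer) # true
puts Demo::HTML::FullSanitizer.name                               # Demo::HTML4::FullSanitizer
```

Because an alias is the same object under a second name, code that subclasses or type-checks through the old path behaves as it would through the new one.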
data/MIT-LICENSE
CHANGED
data/README.md
CHANGED
@@ -1,31 +1,17 @@
-# Rails
+# Rails HTML Sanitizers
 
-
-applications, i.e. in the `sanitize`, `sanitize_css`, `strip_tags` and `strip_links` methods.
+This gem is responsible for sanitizing HTML fragments in Rails applications. Specifically, this is the set of sanitizers used to implement the Action View `SanitizerHelper` methods `sanitize`, `sanitize_css`, `strip_tags` and `strip_links`.
 
-Rails
+Rails HTML Sanitizer is only intended to be used with Rails applications. If you need similar functionality but aren't using Rails, consider using the underlying sanitization library [Loofah](https://github.com/flavorjones/loofah) directly.
 
-## Installation
-
-Add this line to your application's Gemfile:
-
-    gem 'rails-html-sanitizer'
-
-And then execute:
-
-    $ bundle
-
-Or install it yourself as:
-
-    $ gem install rails-html-sanitizer
 
 ## Usage
 
 ### A note on HTML entities
 
-__Rails
+__Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
 
-Proper HTML sanitization will replace some characters with HTML entities. For example, `<` will be
+Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
 
 This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
 
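The README's warning about sanitizing twice can be made concrete with a small sketch. The `escape_text` helper below is a hypothetical stand-in that only performs entity replacement. It is not how the gem sanitizes and should not be used as a sanitizer, but it shows why a second pass damages the entities produced by the first.

```ruby
# Hypothetical helper: replaces only the characters &, < and > with entities.
ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

def escape_text(text)
  text.gsub(/[&<>]/, ESCAPES)
end

once  = escape_text("1 < 2")
twice = escape_text(once)

puts once   # 1 &lt; 2      (a browser renders this as: 1 < 2)
puts twice  # 1 &amp;lt; 2  (a browser renders this as: 1 &lt; 2)
```

The second pass escapes the `&` that the first pass introduced, so the reader sees a literal `&lt;` on the page instead of `<`.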
@@ -47,62 +33,104 @@ You might simply choose to persist the untrusted string as-is (the raw input), a
 
 That raw string, if rendered in an non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
 
-If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails
+If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
+
+
+### A note on module names
+
+In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
+
+- `Rails::HTML` for general functionality (replacing `Rails::Html`)
+- `Rails::HTML4` containing sanitizers that parse content as HTML4
+- `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
+
+The following aliases are maintained for backwards compatibility:
+
+- `Rails::Html` points to `Rails::HTML`
+- `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
+- `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
+- `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
 
 
 ### Sanitizers
 
-All sanitizers respond to `sanitize
+All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
+
+NOTE: The HTML5 sanitizers are not supported on JRuby. Users may programmatically check for support by calling `Rails::HTML::Sanitizer.html5_support?`.
+
 
 #### FullSanitizer
 
 ```ruby
-full_sanitizer = Rails::
+full_sanitizer = Rails::HTML5::FullSanitizer.new
 full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
 # => Bold no more! See more here...
 ```
 
+or, if you insist on parsing the content as HTML4:
+
+```ruby
+full_sanitizer = Rails::HTML4::FullSanitizer.new
+full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
+# => Bold no more! See more here...
+```
+
+HTML5 version:
+
+
+
 #### LinkSanitizer
 
 ```ruby
-link_sanitizer = Rails::
+link_sanitizer = Rails::HTML5::LinkSanitizer.new
 link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
 # => Only the link text will be kept.
 ```
 
+or, if you insist on parsing the content as HTML4:
+
+```ruby
+link_sanitizer = Rails::HTML4::LinkSanitizer.new
+link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
+# => Only the link text will be kept.
+```
+
+
 #### SafeListSanitizer
 
+This sanitizer is also available as an HTML4 variant, but for simplicity we'll document only the HTML5 variant below.
+
 ```ruby
-safe_list_sanitizer = Rails::
+safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new
 
 # sanitize via an extensive safe list of allowed elements
 safe_list_sanitizer.sanitize(@article.body)
 
-#
+# sanitize only the supplied tags and attributes
 safe_list_sanitizer.sanitize(@article.body, tags: %w(table tr td), attributes: %w(id class style))
 
-#
+# sanitize via a custom scrubber
 safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)
 
-#
-safe_list_sanitizer.
+# prune nodes from the tree instead of stripping tags and leaving inner content
+safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new(prune: true)
 
-#
-safe_list_sanitizer
+# the sanitizer can also sanitize css
+safe_list_sanitizer.sanitize_css('background-color: #000;')
 ```
 
 ### Scrubbers
 
 Scrubbers are objects responsible for removing nodes or attributes you don't want in your HTML document.
 
-This gem includes two scrubbers `Rails::
+This gem includes two scrubbers `Rails::HTML::PermitScrubber` and `Rails::HTML::TargetScrubber`.
 
-#### `Rails::
+#### `Rails::HTML::PermitScrubber`
 
 This scrubber allows you to permit only the tags and attributes you want.
 
 ```ruby
-scrubber = Rails::
+scrubber = Rails::HTML::PermitScrubber.new
 scrubber.tags = ['a']
 
 html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -113,14 +141,14 @@ html_fragment.to_s # => "<a></a>"
 By default, inner content is left, but it can be removed as well.
 
 ```ruby
-scrubber = Rails::
+scrubber = Rails::HTML::PermitScrubber.new
 scrubber.tags = ['a']
 
 html_fragment = Loofah.fragment('<a><span>text</span></a>')
 html_fragment.scrub!(scrubber)
 html_fragment.to_s # => "<a>text</a>"
 
-scrubber = Rails::
+scrubber = Rails::HTML::PermitScrubber.new(prune: true)
 scrubber.tags = ['a']
 
 html_fragment = Loofah.fragment('<a><span>text</span></a>')
@@ -128,16 +156,16 @@ html_fragment.scrub!(scrubber)
 html_fragment.to_s # => "<a></a>"
 ```
 
-#### `Rails::
+#### `Rails::HTML::TargetScrubber`
 
 Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
-`Rails::
+`Rails::HTML::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
 
 **Note:** by default, it will scrub anything that is not part of the permitted tags from
 loofah `HTML5::Scrub.allowed_element?`.
 
 ```ruby
-scrubber = Rails::
+scrubber = Rails::HTML::TargetScrubber.new
 scrubber.tags = ['img']
 
 html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -148,26 +176,27 @@ html_fragment.to_s # => "<a></a>"
 Similarly to `PermitScrubber`, nodes can be fully pruned.
 
 ```ruby
-scrubber = Rails::
+scrubber = Rails::HTML::TargetScrubber.new
 scrubber.tags = ['span']
 
 html_fragment = Loofah.fragment('<a><span>text</span></a>')
 html_fragment.scrub!(scrubber)
 html_fragment.to_s # => "<a>text</a>"
 
-scrubber = Rails::
+scrubber = Rails::HTML::TargetScrubber.new(prune: true)
 scrubber.tags = ['span']
 
 html_fragment = Loofah.fragment('<a><span>text</span></a>')
 html_fragment.scrub!(scrubber)
 html_fragment.to_s # => "<a></a>"
 ```
+
 #### Custom Scrubbers
 
 You can also create custom scrubbers in your application if you want to.
 
 ```ruby
-class CommentScrubber < Rails::
+class CommentScrubber < Rails::HTML::PermitScrubber
   def initialize
     super
     self.tags = %w( form script comment blockquote )
@@ -180,7 +209,7 @@ class CommentScrubber < Rails::Html::PermitScrubber
 end
 ```
 
-See `Rails::
+See `Rails::HTML::PermitScrubber` documentation to learn more about which methods can be overridden.
 
 #### Custom Scrubber in a Rails app
 
@@ -190,26 +219,44 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
 <%= sanitize @comment, scrubber: CommentScrubber.new %>
 ```
 
+## Installation
+
+Add this line to your application's Gemfile:
+
+    gem 'rails-html-sanitizer'
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install rails-html-sanitizer
+
+
 ## Read more
 
 Loofah is what underlies the sanitizers and scrubbers of rails-html-sanitizer.
+
 - [Loofah and Loofah Scrubbers](https://github.com/flavorjones/loofah)
 
 The `node` argument passed to some methods in a custom scrubber is an instance of `Nokogiri::XML::Node`.
+
 - [`Nokogiri::XML::Node`](https://nokogiri.org/rdoc/Nokogiri/XML/Node.html)
 - [Nokogiri](http://nokogiri.org)
 
-## Contributing to Rails Html Sanitizers
 
-
+## Contributing to Rails HTML Sanitizers
+
+Rails HTML Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
 
 See [CONTRIBUTING](CONTRIBUTING.md).
 
 ### Security reports
 
-Trying to report a possible security vulnerability in this project? Please
-
-guidelines about how to proceed.
+Trying to report a possible security vulnerability in this project? Please check out the [Rails project's security policy](https://rubyonrails.org/security) for instructions.
+
 
 ## License
-
+
+Rails HTML Sanitizers is released under the [MIT License](MIT-LICENSE).