rails-html-sanitizer 1.5.0 → 1.6.0.rc1
- checksums.yaml +4 -4
- data/CHANGELOG.md +58 -0
- data/MIT-LICENSE +1 -1
- data/README.md +95 -48
- data/lib/rails/html/sanitizer/version.rb +4 -2
- data/lib/rails/html/sanitizer.rb +367 -104
- data/lib/rails/html/scrubbers.rb +70 -69
- data/lib/rails-html-sanitizer.rb +7 -23
- data/test/rails_api_test.rb +74 -0
- data/test/sanitizer_test.rb +900 -590
- data/test/scrubbers_test.rb +49 -36
- metadata +21 -65
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 369872075a1b555eb1dbcdf744e8d9f01aa4ba4c8f29449ba61668da5c4063ff
+  data.tar.gz: 1ae0e8e36e37c51687c965c33d55c1a1eaaab9d4e71d089378ee62fc340e0cd1
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: f8c948ee3f76bb85018a3491d97f89b2957247f2cae35b650ee8d1682d482377e76e2150bbf8a81a9a1aaea4384af321c36c9a621c0c1a71a5dd079cb482a144
+  data.tar.gz: 070f318bcdfb024310b59fc8ceec848c937e0d7e5c4824c40cbb80a9b783e96d98b3f8f67a19630f6fe26aaee35769df84e24aefb198b58a0b06f825a18259a4
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,52 @@
+## 1.6.0.rc1 / 2023-05-24
+
+* Sanitizers that use an HTML5 parser are now available on platforms supported by
+  Nokogiri::HTML5. These are available as:
+
+  - `Rails::HTML5::FullSanitizer`
+  - `Rails::HTML5::LinkSanitizer`
+  - `Rails::HTML5::SafeListSanitizer`
+
+  And a new "vendor" is provided at `Rails::HTML5::Sanitizer` that can be used in a future version
+  of Rails.
+
+  Note that for symmetry `Rails::HTML4::Sanitizer` is also added, though its behavior is identical
+  to the vendor class methods on `Rails::HTML::Sanitizer`.
+
+  *Mike Dalessio*
+
+* Module namespaces have changed, but backwards compatibility is provided by aliases.
+
+  The library defines three additional modules:
+
+  - `Rails::HTML` for general functionality (replacing `Rails::Html`)
+  - `Rails::HTML4` containing sanitizers that parse content as HTML4
+  - `Rails::HTML5` containing sanitizers that parse content as HTML5
+
+  The following aliases are maintained for backwards compatibility:
+
+  - `Rails::Html` points to `Rails::HTML`
+  - `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
+  - `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
+  - `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
+
+  *Mike Dalessio*
+
+* `LinkSanitizer` always returns UTF-8 encoded strings. `SafeListSanitizer` and `FullSanitizer`
+  already ensured this encoding.
+
+  *Mike Dalessio*
+
+* `SafeListSanitizer` allows `time` tag and `lang` attribute by default.
+
+  *Mike Dalessio*
+
+* The constant `Rails::Html::XPATHS_TO_REMOVE` has been removed. It's not necessary with the
+  existing sanitizers, and should have been a private constant all along anyway.
+
+  *Mike Dalessio*
+
+
 ## 1.5.0 / 2023-01-20
 
 * `SafeListSanitizer`, `PermitScrubber`, and `TargetScrubber` now all support pruning of unsafe tags.
@@ -7,6 +56,7 @@
 
   *seyerian*
 
+
 ## 1.4.4 / 2022-12-13
 
 * Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
@@ -52,6 +102,7 @@
 
   *Mike Dalessio*
 
+
 ## 1.4.2 / 2021-08-23
 
 * Slightly improve performance.
@@ -60,6 +111,7 @@
 
   *Mike Dalessio*
 
+
 ## 1.4.1 / 2021-08-18
 
 * Fix regression in v1.4.0 that did not pass comment nodes to the scrubber.
@@ -72,6 +124,7 @@
 
   *Mike Dalessio*
 
+
 ## 1.4.0 / 2021-08-18
 
 * Processing Instructions are no longer allowed by Rails::Html::PermitScrubber
@@ -84,12 +137,14 @@
 
   *Mike Dalessio*
 
+
 ## 1.3.0
 
 * Address deprecations in Loofah 2.3.0.
 
   *Josh Goodall*
 
+
 ## 1.2.0
 
 * Remove needless `white_list_sanitizer` deprecation.
@@ -104,6 +159,7 @@
 
   *Kasper Timm Hansen*
 
+
 ## 1.1.0
 
 * Add `safe_list_sanitizer` and deprecate `white_list_sanitizer` to be removed
@@ -121,10 +177,12 @@
 
   *Kasper Timm Hansen*
 
+
 ## 1.0.1
 
 * Added support for Rails 4.2.0.beta2 and above
 
+
 ## 1.0.0
 
 * First release.
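The backwards-compatibility aliases described in the 1.6.0.rc1 entry can be pictured as ordinary Ruby constant assignments. The sketch below is not the gem's source: `Demo` and its empty classes are hypothetical stand-ins, used only to show how an alias constant can refer to the same class object as the new name so that existing references keep resolving.

```ruby
# Illustrative only: `Demo` and its empty classes are hypothetical stand-ins,
# not the gem's actual implementation.
module Demo
  module HTML4
    class FullSanitizer; end
  end

  module HTML
    # Alias: the older location refers to the very same class object.
    FullSanitizer = HTML4::FullSanitizer
  end

  # Alias for the legacy module spelling.
  Html = HTML
end

legacy  = Demo::Html::FullSanitizer
current = Demo::HTML4::FullSanitizer

# Both names resolve to one class, so code written against the old name still works.
puts legacy.equal?(current) # prints "true"
```

Because the alias is the same object rather than a subclass or wrapper, checks such as `is_a?` behave identically under either name.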
data/MIT-LICENSE
CHANGED
data/README.md
CHANGED
@@ -1,31 +1,17 @@
-# Rails
+# Rails HTML Sanitizers
 
-
-applications, i.e. in the `sanitize`, `sanitize_css`, `strip_tags` and `strip_links` methods.
+This gem is responsible for sanitizing HTML fragments in Rails applications. Specifically, this is the set of sanitizers used to implement the Action View `SanitizerHelper` methods `sanitize`, `sanitize_css`, `strip_tags` and `strip_links`.
 
-Rails
+Rails HTML Sanitizer is only intended to be used with Rails applications. If you need similar functionality but aren't using Rails, consider using the underlying sanitization library [Loofah](https://github.com/flavorjones/loofah) directly.
 
-## Installation
-
-Add this line to your application's Gemfile:
-
-    gem 'rails-html-sanitizer'
-
-And then execute:
-
-    $ bundle
-
-Or install it yourself as:
-
-    $ gem install rails-html-sanitizer
 
 ## Usage
 
 ### A note on HTML entities
 
-__Rails
+__Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
 
-Proper HTML sanitization will replace some characters with HTML entities. For example, `<` will be
+Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
 
 This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
 
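The warning about sanitizing twice can be illustrated with a small sketch. It uses `CGI.escapeHTML` from Ruby's standard library purely as a stand-in for the entity escaping a sanitizer performs; it is not this gem's API, and the sample string is made up for illustration.

```ruby
require "cgi"

# Stand-in for entity escaping during sanitization (assumption: CGI.escapeHTML
# is used here only for illustration, not because the gem calls it).
raw   = "1 < 2"
once  = CGI.escapeHTML(raw)   # "<" becomes "&lt;"
twice = CGI.escapeHTML(once)  # the "&" of "&lt;" is escaped again

puts once   # prints "1 &lt; 2", which a browser renders as: 1 < 2
puts twice  # prints "1 &amp;lt; 2", which a browser renders as: 1 &lt; 2
```

The second pass turns the entity itself into visible text, which is why the README recommends sanitizing only once, at page-render time.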
@@ -47,62 +33,104 @@ You might simply choose to persist the untrusted string as-is (the raw input), a
 
 That raw string, if rendered in an non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
 
-If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails
+If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails HTML sanitizers.
+
+
+### A note on module names
+
+In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:
+
+- `Rails::HTML` for general functionality (replacing `Rails::Html`)
+- `Rails::HTML4` containing sanitizers that parse content as HTML4
+- `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)
+
+The following aliases are maintained for backwards compatibility:
+
+- `Rails::Html` points to `Rails::HTML`
+- `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
+- `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
+- `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`
 
 
 ### Sanitizers
 
-All sanitizers respond to `sanitize
+All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
+
+NOTE: The HTML5 sanitizers are not supported on JRuby. Users may programmatically check for support by calling `Rails::HTML::Sanitizer.html5_support?`.
+
 
 #### FullSanitizer
 
 ```ruby
-full_sanitizer = Rails::
+full_sanitizer = Rails::HTML5::FullSanitizer.new
 full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
 # => Bold no more! See more here...
 ```
 
+or, if you insist on parsing the content as HTML4:
+
+```ruby
+full_sanitizer = Rails::HTML4::FullSanitizer.new
+full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
+# => Bold no more! See more here...
+```
+
+HTML5 version:
+
+
+
 #### LinkSanitizer
 
 ```ruby
-link_sanitizer = Rails::
+link_sanitizer = Rails::HTML5::LinkSanitizer.new
 link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
 # => Only the link text will be kept.
 ```
 
+or, if you insist on parsing the content as HTML4:
+
+```ruby
+link_sanitizer = Rails::HTML4::LinkSanitizer.new
+link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
+# => Only the link text will be kept.
+```
+
+
 #### SafeListSanitizer
 
+This sanitizer is also available as an HTML4 variant, but for simplicity we'll document only the HTML5 variant below.
+
 ```ruby
-safe_list_sanitizer = Rails::
+safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new
 
 # sanitize via an extensive safe list of allowed elements
 safe_list_sanitizer.sanitize(@article.body)
 
-#
+# sanitize only the supplied tags and attributes
 safe_list_sanitizer.sanitize(@article.body, tags: %w(table tr td), attributes: %w(id class style))
 
-#
+# sanitize via a custom scrubber
 safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)
 
-#
-safe_list_sanitizer.
+# prune nodes from the tree instead of stripping tags and leaving inner content
+safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new(prune: true)
 
-#
-safe_list_sanitizer
+# the sanitizer can also sanitize css
+safe_list_sanitizer.sanitize_css('background-color: #000;')
 ```
 
 ### Scrubbers
 
 Scrubbers are objects responsible for removing nodes or attributes you don't want in your HTML document.
 
-This gem includes two scrubbers `Rails::
+This gem includes two scrubbers `Rails::HTML::PermitScrubber` and `Rails::HTML::TargetScrubber`.
 
-#### `Rails::
+#### `Rails::HTML::PermitScrubber`
 
 This scrubber allows you to permit only the tags and attributes you want.
 
 ```ruby
-scrubber = Rails::
+scrubber = Rails::HTML::PermitScrubber.new
 scrubber.tags = ['a']
 
 html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -113,14 +141,14 @@ html_fragment.to_s # => "<a></a>"
 By default, inner content is left, but it can be removed as well.
 
 ```ruby
-scrubber = Rails::
+scrubber = Rails::HTML::PermitScrubber.new
 scrubber.tags = ['a']
 
 html_fragment = Loofah.fragment('<a><span>text</span></a>')
 html_fragment.scrub!(scrubber)
 html_fragment.to_s # => "<a>text</a>"
 
-scrubber = Rails::
+scrubber = Rails::HTML::PermitScrubber.new(prune: true)
 scrubber.tags = ['a']
 
 html_fragment = Loofah.fragment('<a><span>text</span></a>')
@@ -128,16 +156,16 @@ html_fragment.scrub!(scrubber)
 html_fragment.to_s # => "<a></a>"
 ```
 
-#### `Rails::
+#### `Rails::HTML::TargetScrubber`
 
 Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
-`Rails::
+`Rails::HTML::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
 
 **Note:** by default, it will scrub anything that is not part of the permitted tags from
 loofah `HTML5::Scrub.allowed_element?`.
 
 ```ruby
-scrubber = Rails::
+scrubber = Rails::HTML::TargetScrubber.new
 scrubber.tags = ['img']
 
 html_fragment = Loofah.fragment('<a><img/ ></a>')
@@ -148,26 +176,27 @@ html_fragment.to_s # => "<a></a>"
 Similarly to `PermitScrubber`, nodes can be fully pruned.
 
 ```ruby
-scrubber = Rails::
+scrubber = Rails::HTML::TargetScrubber.new
 scrubber.tags = ['span']
 
 html_fragment = Loofah.fragment('<a><span>text</span></a>')
 html_fragment.scrub!(scrubber)
 html_fragment.to_s # => "<a>text</a>"
 
-scrubber = Rails::
+scrubber = Rails::HTML::TargetScrubber.new(prune: true)
 scrubber.tags = ['span']
 
 html_fragment = Loofah.fragment('<a><span>text</span></a>')
 html_fragment.scrub!(scrubber)
 html_fragment.to_s # => "<a></a>"
 ```
+
 #### Custom Scrubbers
 
 You can also create custom scrubbers in your application if you want to.
 
 ```ruby
-class CommentScrubber < Rails::
+class CommentScrubber < Rails::HTML::PermitScrubber
   def initialize
     super
     self.tags = %w( form script comment blockquote )
@@ -180,7 +209,7 @@ class CommentScrubber < Rails::Html::PermitScrubber
 end
 ```
 
-See `Rails::
+See `Rails::HTML::PermitScrubber` documentation to learn more about which methods can be overridden.
 
 #### Custom Scrubber in a Rails app
 
@@ -190,26 +219,44 @@ Using the `CommentScrubber` from above, you can use this in a Rails view like so
 <%= sanitize @comment, scrubber: CommentScrubber.new %>
 ```
 
+## Installation
+
+Add this line to your application's Gemfile:
+
+    gem 'rails-html-sanitizer'
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install rails-html-sanitizer
+
+
 ## Read more
 
 Loofah is what underlies the sanitizers and scrubbers of rails-html-sanitizer.
+
 - [Loofah and Loofah Scrubbers](https://github.com/flavorjones/loofah)
 
 The `node` argument passed to some methods in a custom scrubber is an instance of `Nokogiri::XML::Node`.
+
 - [`Nokogiri::XML::Node`](https://nokogiri.org/rdoc/Nokogiri/XML/Node.html)
 - [Nokogiri](http://nokogiri.org)
 
-## Contributing to Rails Html Sanitizers
 
-
+## Contributing to Rails HTML Sanitizers
+
+Rails HTML Sanitizers is work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
 
 See [CONTRIBUTING](CONTRIBUTING.md).
 
 ### Security reports
 
-Trying to report a possible security vulnerability in this project? Please
-
-guidelines about how to proceed.
+Trying to report a possible security vulnerability in this project? Please check out the [Rails project's security policy](https://rubyonrails.org/security) for instructions.
+
 
 ## License
-
+
+Rails HTML Sanitizers is released under the [MIT License](MIT-LICENSE).
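The updated README notes that HTML5 sanitizers are unavailable on some platforms and that support can be checked with `Rails::HTML::Sanitizer.html5_support?`. A minimal sketch of how an application might choose between the `Rails::HTML5::Sanitizer` and `Rails::HTML4::Sanitizer` vendors follows. The helper name `preferred_sanitizer_vendor` is hypothetical, and the guards for a missing or older gem are assumptions added so the snippet degrades gracefully rather than anything the gem prescribes.

```ruby
# Hypothetical helper (not part of the gem): returns a sanitizer vendor,
# or nil when the gem is absent or predates the 1.6 namespaces.
def preferred_sanitizer_vendor
  begin
    require "rails-html-sanitizer"
  rescue LoadError
    return nil # gem not installed; nothing to choose from
  end

  # Versions before 1.6 do not define Rails::HTML or html5_support?.
  return nil unless defined?(Rails::HTML::Sanitizer) &&
                    Rails::HTML::Sanitizer.respond_to?(:html5_support?)

  if Rails::HTML::Sanitizer.html5_support?
    Rails::HTML5::Sanitizer # HTML5 parsing is available on this platform
  else
    Rails::HTML4::Sanitizer # fall back to HTML4 parsing (for example on JRuby)
  end
end

vendor = preferred_sanitizer_vendor
puts(vendor ? "Using #{vendor}" : "rails-html-sanitizer >= 1.6 not available")
```

Checking support at runtime keeps the HTML4 sanitizers as a fallback instead of failing when the `Rails::HTML5` classes are not defined.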