rails-html-sanitizer 1.4.4 → 1.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: a74021096590326ee357971bec71d2c4507a95cdaf05c8e21d383ce18fee18d3
-   data.tar.gz: faad0d5f268dad601b633b03912e353fcc2d760fceb253d9cde2064b010b997a
+   metadata.gz: 59897b4a0d7f69a21932ec1cb44e24ea0ba4c2cf79ef7101ef32e90f40ad766b
+   data.tar.gz: '063919ea5426a6938040672fefeabcaf82087a5ccd12ffc7457e34eb6984042b'
  SHA512:
-   metadata.gz: e7f01438708076a283326c78b052ba954a42de4134d8d1d7e7c336c82ecd04c661f75dad3a0f9b1ffebe278f76ef229c98a3f2568801f82d94c94a50f399a2ef
-   data.tar.gz: 4f44c0e92eb9e565611772ba28d426025621c0517c4217004c3409192991a17498dd38165a6c55561a5347d2fcdf34f51b24101ad6de525604e35785e89efbc0
+   metadata.gz: 2b0c23a07bc8acb3c1a039266cf053ad9044670a96620365b3ed722eb9a602def1bebe8de40697a2b12deba61cf224461ae8a4dc93749fa8c9675cda4cd216dd
+   data.tar.gz: 5dce2af04dd887e08773a975cc67d93987b15330d023cc68a6cc51322ed73b60681309b25feab1a2c54b0e062696af1dd78c83cdb609e8d179b6ef95419573b3
data/CHANGELOG.md CHANGED
@@ -1,3 +1,12 @@
+ ## 1.5.0 / 2023-01-20
+
+ * `SafeListSanitizer`, `PermitScrubber`, and `TargetScrubber` now all support pruning of unsafe tags.
+
+   By default, unsafe tags are still stripped, but this behavior can be changed to prune the element
+   and its children from the document by passing `prune: true` to any of these classes' constructors.
+
+   *seyerian*
+
  ## 1.4.4 / 2022-12-13

  * Address inefficient regular expression complexity with certain configurations of Rails::Html::Sanitizer.
@@ -43,7 +52,6 @@

    *Mike Dalessio*

-
  ## 1.4.2 / 2021-08-23

  * Slightly improve performance.
data/README.md CHANGED
@@ -21,6 +21,35 @@ Or install it yourself as:

  ## Usage

+ ### A note on HTML entities
+
+ __Rails::HTML sanitizers are intended to be used by the view layer, at page-render time. They are *not* intended to sanitize persisted strings that will be sanitized *again* at page-render time.__
+
+ Proper HTML sanitization will replace some characters with HTML entities. For example, `<` will be replaced with `&lt;` to ensure that the markup is well-formed.
+
+ This is important to keep in mind because __HTML entities will render improperly if they are sanitized twice.__
+
+
+ #### A concrete example showing the problem that can arise
+
+ Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
+
+ If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
+
+ When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.` which will render as "JPMorgan Chase &amp; Co.".
+
+ Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
+
+
+ #### Suggested alternatives
+
+ You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
+
+ That raw string, if rendered in a non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using [Loofah](https://github.com/flavorjones/loofah) or [Sanitize](https://github.com/rgrove/sanitize) to customize how this sanitization works, including omitting HTML entities in the final string.
+
+ If you really want to sanitize the string that's stored in your database, you may wish to look into [Loofah::ActiveRecord](https://github.com/flavorjones/loofah-activerecord) rather than use the Rails::HTML sanitizers.
+
+
  ### Sanitizers

  All sanitizers respond to `sanitize`.
@@ -57,6 +86,9 @@ safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)

  # safe list sanitizer can also sanitize css
  safe_list_sanitizer.sanitize_css('background-color: #000;')
+
+ # fully prune nodes from the tree instead of stripping tags and leaving inner content
+ safe_list_sanitizer = Rails::Html::SafeListSanitizer.new(prune: true)
  ```

  ### Scrubbers
  ### Scrubbers
@@ -78,6 +110,24 @@ html_fragment.scrub!(scrubber)
  html_fragment.to_s # => "<a></a>"
  ```

+ By default, inner content is left, but it can be removed as well.
+
+ ```ruby
+ scrubber = Rails::Html::PermitScrubber.new
+ scrubber.tags = ['a']
+
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
+ html_fragment.scrub!(scrubber)
+ html_fragment.to_s # => "<a>text</a>"
+
+ scrubber = Rails::Html::PermitScrubber.new(prune: true)
+ scrubber.tags = ['a']
+
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
+ html_fragment.scrub!(scrubber)
+ html_fragment.to_s # => "<a></a>"
+ ```
+
  #### `Rails::Html::TargetScrubber`

  Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
  Where `PermitScrubber` picks out tags and attributes to permit in sanitization,
@@ -95,6 +145,23 @@ html_fragment.scrub!(scrubber)
  html_fragment.to_s # => "<a></a>"
  ```

+ Similarly to `PermitScrubber`, nodes can be fully pruned.
+
+ ```ruby
+ scrubber = Rails::Html::TargetScrubber.new
+ scrubber.tags = ['span']
+
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
+ html_fragment.scrub!(scrubber)
+ html_fragment.to_s # => "<a>text</a>"
+
+ scrubber = Rails::Html::TargetScrubber.new(prune: true)
+ scrubber.tags = ['span']
+
+ html_fragment = Loofah.fragment('<a><span>text</span></a>')
+ html_fragment.scrub!(scrubber)
+ html_fragment.to_s # => "<a></a>"
+ ```
  #### Custom Scrubbers

  You can also create custom scrubbers in your application if you want to.
@@ -138,5 +205,11 @@ Rails Html Sanitizers is work of many contributors. You're encouraged to submit

  See [CONTRIBUTING](CONTRIBUTING.md).

+ ### Security reports
+
+ Trying to report a possible security vulnerability in this project? Please
+ check out our [security policy](https://rubyonrails.org/security) for
+ guidelines about how to proceed.
+
  ## License
  Rails Html Sanitizers is released under the [MIT License](MIT-LICENSE).
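
Reviewer note: the double-sanitization problem described in the new README section can be reproduced with plain entity escaping. The sketch below is an illustration only. It uses Ruby's standard-library `CGI.escapeHTML` as a stand-in for the escaping step of a sanitizer (it is not the gem's sanitizer), and `CGI.unescapeHTML` as a stand-in for what a browser does when it renders the entities.

```ruby
require "cgi/escape"

# Assumption: CGI.escapeHTML stands in for the entity-escaping part of a
# sanitizer; the real Rails::Html sanitizers do considerably more.
raw = "JPMorgan Chase & Co."

stored   = CGI.escapeHTML(raw)     # escaped once, before persisting
rendered = CGI.escapeHTML(stored)  # escaped again by the view layer

# CGI.unescapeHTML approximates how a browser displays the markup.
shown_once  = CGI.unescapeHTML(stored)
shown_twice = CGI.unescapeHTML(rendered)

puts stored       # JPMorgan Chase &amp; Co.
puts rendered     # JPMorgan Chase &amp;amp; Co.
puts shown_once   # JPMorgan Chase & Co.
puts shown_twice  # JPMorgan Chase &amp; Co.
```

Escaping once and rendering once round-trips to the original text. Escaping twice leaves a literal `&amp;` visible to the reader, which is why the README suggests storing the raw input and sanitizing only at render time.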
@@ -1,7 +1,7 @@
  module Rails
    module Html
      class Sanitizer
-       VERSION = "1.4.4"
+       VERSION = "1.5.0"
      end
    end
  end
@@ -110,8 +110,8 @@ module Rails
          acronym a img blockquote del ins))
      self.allowed_attributes = Set.new(%w(href src width height alt cite datetime title class name xml:lang abbr))

-     def initialize
-       @permit_scrubber = PermitScrubber.new
+     def initialize(prune: false)
+       @permit_scrubber = PermitScrubber.new(prune: prune)
      end

      def sanitize(html, options = {})
@@ -45,10 +45,11 @@ module Rails
    # See the documentation for +Nokogiri::XML::Node+ to understand what's possible
    # with nodes: https://nokogiri.org/rdoc/Nokogiri/XML/Node.html
    class PermitScrubber < Loofah::Scrubber
-     attr_reader :tags, :attributes
+     attr_reader :tags, :attributes, :prune

-     def initialize
-       @direction = :bottom_up
+     def initialize(prune: false)
+       @prune = prune
+       @direction = @prune ? :top_down : :bottom_up
        @tags, @attributes = nil, nil
      end

@@ -98,7 +99,7 @@ module Rails
      end

      def scrub_node(node)
-       node.before(node.children) # strip
+       node.before(node.children) unless prune # strip
        node.remove
      end

 
@@ -256,6 +256,12 @@ class SanitizersTest < Minitest::Test
    end
  end

+ def test_should_allow_prune
+   sanitizer = Rails::Html::SafeListSanitizer.new(prune: true)
+   text = '<u>leave me <b>now</b></u>'
+   assert_equal "<u>leave me </u>", sanitizer.sanitize(text, tags: %w(u))
+ end
+
  def test_should_allow_custom_tags
    text = "<u>foo</u>"
    assert_equal text, safe_list_sanitize(text, tags: %w(u))
@@ -66,6 +66,13 @@ class PermitScrubberTest < ScrubberTest
    assert_scrubbed html, '<tag>leave me now</tag>'
  end

+ def test_prunes_tags
+   @scrubber = Rails::Html::PermitScrubber.new(prune: true)
+   @scrubber.tags = %w(tag)
+   html = '<tag>leave me <span>now</span></tag>'
+   assert_scrubbed html, '<tag>leave me </tag>'
+ end
+
  def test_leaves_comments_when_supplied_as_tag
    @scrubber.tags = %w(div comment)
    assert_scrubbed('<div>one</div><!-- two --><span>three</span>',
@@ -157,6 +164,13 @@ class TargetScrubberTest < ScrubberTest
    html = '<tag remove="" other=""></tag><a remove="" other=""></a>'
    assert_scrubbed html, '<a other=""></a>'
  end
+
+ def test_prunes_tags
+   @scrubber = Rails::Html::TargetScrubber.new(prune: true)
+   @scrubber.tags = %w(span)
+   html = '<tag>leave me <span>now</span></tag>'
+   assert_scrubbed html, '<tag>leave me </tag>'
+ end
  end

  class TextOnlyScrubberTest < ScrubberTest
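
Reviewer note: the one-line change to `scrub_node` and the new prune tests above come down to a choice between two ways of dropping a disallowed element: strip it and keep its children in place, or prune it together with its subtree. The sketch below models that difference on a toy tree. Everything in it (`Node`, `text_node`, `element_node`, `scrub_children`, `scrub`) is hypothetical and written for illustration. It does not use Loofah or Nokogiri, it is not the gem's implementation, and it assumes the root element is always allowed.

```ruby
# Toy tree node: a stand-in for a parsed HTML node, not Nokogiri::XML::Node.
# A node with a nil name is a text node.
Node = Struct.new(:name, :children, :text) do
  def to_html
    return text if name.nil?
    "<#{name}>#{children.map(&:to_html).join}</#{name}>"
  end
end

def text_node(str)
  Node.new(nil, [], str)
end

def element_node(name, *children)
  Node.new(name, children, nil)
end

# Returns the scrubbed children of +node+.
# prune: false => a disallowed element is dropped, its scrubbed children stay
# prune: true  => a disallowed element is dropped with its whole subtree
def scrub_children(node, allowed:, prune: false)
  node.children.flat_map do |child|
    if child.name.nil?
      [child]
    elsif allowed.include?(child.name)
      [Node.new(child.name, scrub_children(child, allowed: allowed, prune: prune), nil)]
    elsif prune
      []
    else
      scrub_children(child, allowed: allowed, prune: prune)
    end
  end
end

# Assumption: the root element itself is allowed and is kept as-is.
def scrub(root, allowed:, prune: false)
  Node.new(root.name, scrub_children(root, allowed: allowed, prune: prune), nil)
end

tree = element_node("u", text_node("leave me "), element_node("b", text_node("now")))

stripped = scrub(tree, allowed: %w[u]).to_html
pruned   = scrub(tree, allowed: %w[u], prune: true).to_html

puts stripped # <u>leave me now</u>
puts pruned   # <u>leave me </u>
```

The input and the pruned output mirror `test_should_allow_prune` above. The stripped output corresponds to the default behavior that the changelog says is unchanged.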
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: rails-html-sanitizer
  version: !ruby/object:Gem::Version
-   version: 1.4.4
+   version: 1.5.0
  platform: ruby
  authors:
  - Rafael Mendonça França
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2022-12-13 00:00:00.000000000 Z
+ date: 2023-01-20 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: loofah
@@ -109,9 +109,9 @@ licenses:
  - MIT
  metadata:
    bug_tracker_uri: https://github.com/rails/rails-html-sanitizer/issues
-   changelog_uri: https://github.com/rails/rails-html-sanitizer/blob/v1.4.4/CHANGELOG.md
-   documentation_uri: https://www.rubydoc.info/gems/rails-html-sanitizer/1.4.4
-   source_code_uri: https://github.com/rails/rails-html-sanitizer/tree/v1.4.4
+   changelog_uri: https://github.com/rails/rails-html-sanitizer/blob/v1.5.0/CHANGELOG.md
+   documentation_uri: https://www.rubydoc.info/gems/rails-html-sanitizer/1.5.0
+   source_code_uri: https://github.com/rails/rails-html-sanitizer/tree/v1.5.0
  post_install_message:
  rdoc_options: []
  require_paths:
@@ -127,7 +127,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubygems_version: 3.3.7
+ rubygems_version: 3.4.2
  signing_key:
  specification_version: 4
  summary: This gem is responsible to sanitize HTML fragments in Rails applications.