loofah 2.21.3 → 2.24.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: '00569b28a0bc6307a0a8eb8704ad374f10269008dada5d09470c1a2a87da0a6f'
-   data.tar.gz: eff12a44f1152dc377ac0a6859be97f85b5a0a031a0b7688387c73f7130351d3
+   metadata.gz: 16850a48486ab3e9191ceff0a4fd6d768f82151049332ae162068f6712efccb8
+   data.tar.gz: 6ccd67672b489120796711e08643cbaec9c88648622fc0c3a1ac013e49534b25
  SHA512:
-   metadata.gz: fbcf412c0105203fe3d9ee057dc146cc7f8e9f29587c0f97b9c03a5206eacd31f7170042c74050d40093875b8109f524434a0960ef9305269fdda536e3049e3b
-   data.tar.gz: 48c3f2c0f4a0b2316c46ad1826c61ded9c3438ace28c477d7b67cc345847a00bb5747bab5b22cf5d774e82c90bedd085a9878c40609f93abda49f6aa01257f7c
+   metadata.gz: b2a4f569f20365f63d548506946736a20ee195a3b4149228489c39f1d6fddf2fe9c774ded5d88d0d3bd547a00110b42ab37d582f8701a01eb2a047070cc2b440
+   data.tar.gz: 2bca5a9c58d363251e8ca5b3803a57b73e51506e9d294e45d69d1fef376b658f31901a359315c6d60974d469047e78307e7cd33005884314e98bf9d2775bd36a
data/CHANGELOG.md CHANGED
@@ -1,7 +1,56 @@
  # Changelog
 
+ ## 2.24.1 / 2025-05-12
+
+ ### Ruby support
+
+ * Import only what's needed from `cgi` for support for Ruby 3.5 #296 @Earlopain
+
+
+ ## 2.24.0 / 2024-12-24
+
+ ### Added
+
+ * Built-in scrubber `:double_breakpoint` which sees `<br><br>` and wraps the surrounding content in `<p>` tags. #279, #284 @josecolella @torihuang
+
+ ### Improved
+
+ * Built-in scrubber `:targetblank` now skips `a` tags whose `href` attribute is an anchor link. Previously, all `a` tags were modified to have `target='_blank'`. #291 @fnando
+
+
+ ## 2.23.1 / 2024-10-25
+
+ ### Added
+
+ * Allow CSS properties `min-height` and `max-height`. [#288] @lazyatom
+
+
+ ## 2.23.0 / 2024-10-24
+
+ ### Added
+
+ * Allow CSS property `min-width`. [#287] @lazyatom
+
+
+ ## 2.22.0 / 2023-11-13
+
+ ### Added
+
+ * A `:targetblank` HTML scrubber which ensures all hyperlinks have `target="_blank"`. [#275] @stefannibrasil and @thdaraujo
+ * A `:noreferrer` HTML scrubber which ensures all hyperlinks have `rel=noreferrer`, similar to the `:nofollow` and `:noopener` scrubbers. [#277] @wynksaiddestroy
+
+
+ ## 2.21.4 / 2023-10-10
+
+ ### Fixed
+
+ * `Loofah::HTML5::Scrub.scrub_css` is more consistent in preserving whitespace (and lack of whitespace) in CSS property values. In particular, `.scrub_css` no longer inserts whitespace between tokens that did not already have whitespace between them. [[#273](https://github.com/flavorjones/loofah/issues/273), fixes [#271](https://github.com/flavorjones/loofah/issues/271)]
+
+
  ## 2.21.3 / 2023-05-15
 
+ ### Fixed
+
  * Quash "instance variable not initialized" warning in Ruby < 3.0. [[#268](https://github.com/flavorjones/loofah/issues/268)] (Thanks, [@dharamgollapudi](https://github.com/dharamgollapudi)!)
 
 
data/README.md CHANGED
@@ -29,7 +29,10 @@ Active Record extensions for HTML sanitization are available in the [`loofah-act
    * _Whitewash_ the markup, removing all attributes and namespaced nodes.
  * Other common HTML transformations are built-in:
    * Add the _nofollow_ attribute to all hyperlinks.
+   * Add the _target=\_blank_ attribute to all hyperlinks.
    * Remove _unprintable_ characters from text nodes.
+ * Some specialized HTML transformations are also built-in:
+   * Where `<br><br>` exists inside a `p` tag, close the `p` and open a new one.
  * Format markup as plain text, with (or without) sensible whitespace handling around block elements.
  * Replace Rails's `strip_tags` and `sanitize` view helper methods.
 
@@ -226,11 +229,15 @@ doc.scrub!(:whitewash) # removes unknown/unsafe/namespaced tags and their chi
  # and strips all node attributes
  ```
 
- Loofah also comes with some common transformation tasks:
+ Loofah also comes with built-in scrubbers for some common transformation tasks:
 
  ``` ruby
- doc.scrub!(:nofollow)    # adds rel="nofollow" attribute to links
- doc.scrub!(:unprintable) # removes unprintable characters from text nodes
+ doc.scrub!(:nofollow)          # adds rel="nofollow" attribute to links
+ doc.scrub!(:noopener)          # adds rel="noopener" attribute to links
+ doc.scrub!(:noreferrer)        # adds rel="noreferrer" attribute to links
+ doc.scrub!(:unprintable)       # removes unprintable characters from text nodes
+ doc.scrub!(:targetblank)       # adds target="_blank" attribute to links
+ doc.scrub!(:double_breakpoint) # where `<br><br>` appears in a `p` tag, close the `p` and open a new one
  ```
 
  See `Loofah::Scrubbers` for more details and example usage.
@@ -333,20 +340,64 @@ See [`SECURITY.md`](SECURITY.md) for vulnerability reporting details.
 
  Featuring code contributed by:
 
- * Aaron Patterson
- * John Barnette
- * Josh Owens
- * Paul Dix
- * Luke Melia
+ * [@flavorjones](https://github.com/flavorjones)
+ * [@brynary](https://github.com/brynary)
+ * [@olleolleolle](https://github.com/olleolleolle)
+ * [@JuanitoFatas](https://github.com/JuanitoFatas)
+ * [@kaspth](https://github.com/kaspth)
+ * [@tenderlove](https://github.com/tenderlove)
+ * [@ktdreyer](https://github.com/ktdreyer)
+ * [@orien](https://github.com/orien)
+ * [@asok](https://github.com/asok)
+ * [@junaruga](https://github.com/junaruga)
+ * [@MothOnMars](https://github.com/MothOnMars)
+ * [@nick-desteffen](https://github.com/nick-desteffen)
+ * [@NikoRoberts](https://github.com/NikoRoberts)
+ * [@trans](https://github.com/trans)
+ * [@andreynering](https://github.com/andreynering)
+ * [@aried3r](https://github.com/aried3r)
+ * [@baopham](https://github.com/baopham)
+ * [@batter](https://github.com/batter)
+ * [@brendon](https://github.com/brendon)
+ * [@cjba7](https://github.com/cjba7)
+ * [@christiankisssner](https://github.com/christiankisssner)
+ * [@dacort](https://github.com/dacort)
+ * [@danfstucky](https://github.com/danfstucky)
+ * [@david-a-wheeler](https://github.com/david-a-wheeler)
+ * [@dharamgollapudi](https://github.com/dharamgollapudi)
+ * [@georgeclaghorn](https://github.com/georgeclaghorn)
+ * [@gogainda](https://github.com/gogainda)
+ * [@jaredbeck](https://github.com/jaredbeck)
+ * [@ThatHurleyGuy](https://github.com/ThatHurleyGuy)
+ * [@jstorimer](https://github.com/jstorimer)
+ * [@jbarnette](https://github.com/jbarnette)
+ * [@queso](https://github.com/queso)
+ * [@technicalpickles](https://github.com/technicalpickles)
+ * [@kyoshidajp](https://github.com/kyoshidajp)
+ * [@kristianfreeman](https://github.com/kristianfreeman)
+ * [@louim](https://github.com/louim)
+ * [@mrpasquini](https://github.com/mrpasquini)
+ * [@olivierlacan](https://github.com/olivierlacan)
+ * [@pauldix](https://github.com/pauldix)
+ * [@sampokuokkanen](https://github.com/sampokuokkanen)
+ * [@stefannibrasil](https://github.com/stefannibrasil)
+ * [@tastycode](https://github.com/tastycode)
+ * [@vipulnsward](https://github.com/vipulnsward)
+ * [@joncalhoun](https://github.com/joncalhoun)
+ * [@ahorek](https://github.com/ahorek)
+ * [@rmacklin](https://github.com/rmacklin)
+ * [@y-yagi](https://github.com/y-yagi)
+ * [@lazyatom](https://github.com/lazyatom)
 
  And a big shout-out to Corey Innis for the name, and feedback on the API.
 
 
  ## Thank You
 
- The following people have generously funded Loofah:
+ The following people have generously funded Loofah with financial sponsorship:
 
  * Bill Harding
+ * [Sentry](https://sentry.io/) @getsentry
 
 
  ## Historical Note
  ## Historical Note
@@ -662,7 +662,10 @@ module Loofah
          "line-height",
          "list-style",
          "list-style-type",
+         "max-height",
          "max-width",
+         "min-height",
+         "min-width",
          "order",
          "overflow",
          "overflow-x",
@@ -1,6 +1,7 @@
  # frozen_string_literal: true
 
- require "cgi"
+ require "cgi/escape"
+ require "cgi/util" if RUBY_VERSION < "3.5"
  require "crass"
 
  module Loofah
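The narrowed `require` lines above can be tried in isolation. The sketch below uses only Ruby's standard `CGI.escapeHTML` and `CGI.unescapeHTML`; the sample string is hypothetical, and the sketch does not claim to show which `CGI` methods Loofah itself calls.

```ruby
# Same conditional loading pattern as in the diff above.
require "cgi/escape"
require "cgi/util" if RUBY_VERSION < "3.5"

# Hypothetical sample input.
raw = %q(<a href='x'>Tom & Jerry</a>)

escaped  = CGI.escapeHTML(raw)
restored = CGI.unescapeHTML(escaped)

puts escaped
puts restored == raw
```

Escaping followed by unescaping should round-trip the original string with either set of `require` lines.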
@@ -10,6 +11,7 @@ module Loofah
        CSS_KEYWORDISH = /\A(#[0-9a-fA-F]+|rgb\(\d+%?,\d*%?,?\d*%?\)?|-?\d{0,3}\.?\d{0,10}(ch|cm|r?em|ex|in|lh|mm|pc|pt|px|Q|vmax|vmin|vw|vh|%|,|\))?)\z/ # rubocop:disable Layout/LineLength
        CRASS_SEMICOLON = { node: :semicolon, raw: ";" }
        CSS_IMPORTANT = "!important"
+       CSS_WHITESPACE = " "
        CSS_PROPERTY_STRING_WITHOUT_EMBEDDED_QUOTES = /\A(["'])?[^"']+\1\z/
        DATA_ATTRIBUTE_NAME = /\Adata-[\w-]+\z/
 
@@ -87,7 +89,7 @@ module Loofah
              value = node[:children].map do |child|
                case child[:node]
                when :whitespace
-                 nil
+                 CSS_WHITESPACE
                when :string
                  if CSS_PROPERTY_STRING_WITHOUT_EMBEDDED_QUOTES.match?(child[:raw])
                    Crass::Parser.stringify(child)
@@ -106,12 +108,12 @@ module Loofah
                else
                  child[:raw]
                end
-             end.compact
+             end.compact.join.strip
 
              next if value.empty?
 
-             value << CSS_IMPORTANT if node[:important]
-             propstring = format("%s:%s", name, value.join(" "))
+             value << CSS_WHITESPACE << CSS_IMPORTANT if node[:important]
+             propstring = format("%s:%s", name, value)
              sanitized_node = Crass.parse_properties(propstring).first
              sanitized_tree << sanitized_node << CRASS_SEMICOLON
            end
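The effect of the change above on whitespace can be sketched without Crass. In the example below, `old_value`, `new_value`, and the token list are hypothetical stand-ins shaped like Crass child nodes; only the join logic is copied from the diff.

```ruby
CSS_WHITESPACE = " "
CSS_IMPORTANT = "!important"

# Old behavior: whitespace tokens are dropped, then every token is joined with a space.
def old_value(children, important: false)
  value = children.map { |child| child[:node] == :whitespace ? nil : child[:raw] }.compact
  value << CSS_IMPORTANT if important
  value.join(" ")
end

# New behavior: whitespace tokens become a single space and tokens are joined as-is.
def new_value(children, important: false)
  value = children.map { |child| child[:node] == :whitespace ? CSS_WHITESPACE : child[:raw] }.compact.join.strip
  value << CSS_WHITESPACE << CSS_IMPORTANT if important
  value
end

# Hypothetical tokens for a property value written as "a/b c".
tokens = [
  { node: :ident, raw: "a" },
  { node: :delim, raw: "/" },
  { node: :ident, raw: "b" },
  { node: :whitespace, raw: " " },
  { node: :ident, raw: "c" },
]

puts old_value(tokens)                  # a / b c
puts new_value(tokens)                  # a/b c
puts new_value(tokens, important: true) # a/b c !important
```

The old join inserted spaces around `/` that were not in the input, which is the inconsistency the 2.21.4 changelog entry describes.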
@@ -61,6 +61,15 @@ module Loofah
    # => "ohai! <a href='http://www.myswarmysite.com/' rel="nofollow">I like your blog post</a>"
    #
    #
+   # === Loofah::Scrubbers::TargetBlank / scrub!(:targetblank)
+   #
+   # +:targetblank+ adds a target="_blank" attribute to all links
+   #
+   # link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
+   # Loofah.html5_fragment(link_farmers_markup).scrub!(:targetblank)
+   # => "ohai! <a href='http://www.myswarmysite.com/' target="_blank">I like your blog post</a>"
+   #
+   #
    # === Loofah::Scrubbers::NoOpener / scrub!(:noopener)
    #
    # +:noopener+ adds a rel="noopener" attribute to all links
@@ -69,6 +78,14 @@ module Loofah
    # Loofah.html5_fragment(link_farmers_markup).scrub!(:noopener)
    # => "ohai! <a href='http://www.myswarmysite.com/' rel="noopener">I like your blog post</a>"
    #
+   # === Loofah::Scrubbers::NoReferrer / scrub!(:noreferrer)
+   #
+   # +:noreferrer+ adds a rel="noreferrer" attribute to all links
+   #
+   # link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
+   # Loofah.html5_fragment(link_farmers_markup).scrub!(:noreferrer)
+   # => "ohai! <a href='http://www.myswarmysite.com/' rel="noreferrer">I like your blog post</a>"
+   #
    #
    # === Loofah::Scrubbers::Unprintable / scrub!(:unprintable)
    #
@@ -213,6 +230,35 @@ module Loofah
        end
      end
 
+     #
+     # === scrub!(:targetblank)
+     #
+     # +:targetblank+ adds a target="_blank" attribute to all links.
+     # If there is a target already set, replaces it with target="_blank".
+     #
+     # link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
+     # Loofah.html5_fragment(link_farmers_markup).scrub!(:targetblank)
+     # => "ohai! <a href='http://www.myswarmysite.com/' target="_blank">I like your blog post</a>"
+     #
+     # On modern browsers, setting target="_blank" on anchor elements implicitly provides the same
+     # behavior as setting rel="noopener".
+     #
+     class TargetBlank < Scrubber
+       def initialize # rubocop:disable Lint/MissingSuper
+         @direction = :top_down
+       end
+
+       def scrub(node)
+         return CONTINUE unless (node.type == Nokogiri::XML::Node::ELEMENT_NODE) && (node.name == "a")
+
+         href = node["href"]
+
+         node.set_attribute("target", "_blank") if href && href[0] != "#"
+
+         STOP
+       end
+     end
+
      #
      # === scrub!(:noopener)
      #
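The guard in `TargetBlank#scrub` above can be isolated as a small predicate to see which links are affected. `target_blank_applies?` is a hypothetical helper name used only for this sketch; it is not part of Loofah.

```ruby
# Hypothetical helper mirroring the condition `href && href[0] != "#"` from TargetBlank#scrub.
def target_blank_applies?(href)
  !href.nil? && href[0] != "#"
end

p target_blank_applies?("https://example.com/") # true: ordinary link gets target="_blank"
p target_blank_applies?("#section")             # false: in-page anchor link is skipped
p target_blank_applies?(nil)                    # false: <a> without an href is skipped
```

Under this condition, an `a` element with no `href` is left untouched as well, not only anchor links.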
@@ -235,6 +281,28 @@ module Loofah
        end
      end
 
+     #
+     # === scrub!(:noreferrer)
+     #
+     # +:noreferrer+ adds a rel="noreferrer" attribute to all links
+     #
+     # link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
+     # Loofah.html5_fragment(link_farmers_markup).scrub!(:noreferrer)
+     # => "ohai! <a href='http://www.myswarmysite.com/' rel="noreferrer">I like your blog post</a>"
+     #
+     class NoReferrer < Scrubber
+       def initialize # rubocop:disable Lint/MissingSuper
+         @direction = :top_down
+       end
+
+       def scrub(node)
+         return CONTINUE unless (node.type == Nokogiri::XML::Node::ELEMENT_NODE) && (node.name == "a")
+
+         append_attribute(node, "rel", "noreferrer")
+         STOP
+       end
+     end
+
      # This class probably isn't useful publicly, but is used for #to_text's current implementation
      class NewlineBlockElements < Scrubber # :nodoc:
        def initialize # rubocop:disable Lint/MissingSuper
@@ -282,6 +350,57 @@ module Loofah
        end
      end
 
+     #
+     # === scrub!(:double_breakpoint)
+     #
+     # +:double_breakpoint+ replaces double-break tags with closing/opening paragraph tags.
+     #
+     # markup = "<p>Some text here in a logical paragraph.<br><br>Some more text, apparently a second paragraph.</p>"
+     # Loofah.html5_fragment(markup).scrub!(:double_breakpoint)
+     # => "<p>Some text here in a logical paragraph.</p><p>Some more text, apparently a second paragraph.</p>"
+     #
+     class DoubleBreakpoint < Scrubber
+       def initialize # rubocop:disable Lint/MissingSuper
+         @direction = :top_down
+       end
+
+       def scrub(node)
+         return CONTINUE unless (node.type == Nokogiri::XML::Node::ELEMENT_NODE) && (node.name == "p")
+
+         paragraph_with_break_point_nodes = node.xpath("//p[br[following-sibling::br]]")
+
+         paragraph_with_break_point_nodes.each do |paragraph_node|
+           new_paragraph = paragraph_node.add_previous_sibling("<p>").first
+
+           paragraph_node.children.each do |child|
+             remove_blank_text_nodes(child)
+           end
+
+           paragraph_node.children.each do |child|
+             # already unlinked
+             next if child.parent.nil?
+
+             if child.name == "br" && child.next_sibling.name == "br"
+               new_paragraph = paragraph_node.add_previous_sibling("<p>").first
+               child.next_sibling.unlink
+               child.unlink
+             else
+               child.parent = new_paragraph
+             end
+           end
+
+           paragraph_node.unlink
+         end
+
+         CONTINUE
+       end
+
+       private
+
+       def remove_blank_text_nodes(node)
+         node.unlink if node.text? && node.blank?
+       end
+     end
      #
      # A hash that maps a symbol (like +:prune+) to the appropriate Scrubber (Loofah::Scrubbers::Prune).
      #
@@ -292,8 +411,11 @@ module Loofah
        strip: Strip,
        nofollow: NoFollow,
        noopener: NoOpener,
+       noreferrer: NoReferrer,
+       targetblank: TargetBlank,
        newline_block_elements: NewlineBlockElements,
        unprintable: Unprintable,
+       double_breakpoint: DoubleBreakpoint,
      }
 
      class << self
@@ -2,5 +2,5 @@
 
  module Loofah
    # The version of Loofah you are using
-   VERSION = "2.21.3"
+   VERSION = "2.24.1"
  end
metadata CHANGED
@@ -1,15 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: loofah
  version: !ruby/object:Gem::Version
-   version: 2.21.3
+   version: 2.24.1
  platform: ruby
  authors:
  - Mike Dalessio
  - Bryan Helmkamp
- autorequire:
  bindir: bin
  cert_chain: []
- date: 2023-05-15 00:00:00.000000000 Z
+ date: 1980-01-02 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: crass
@@ -82,7 +81,7 @@ metadata:
    bug_tracker_uri: https://github.com/flavorjones/loofah/issues
    changelog_uri: https://github.com/flavorjones/loofah/blob/main/CHANGELOG.md
    documentation_uri: https://www.rubydoc.info/gems/loofah/
- post_install_message:
+   funding_uri: https://github.com/sponsors/flavorjones
  rdoc_options: []
  require_paths:
  - lib
@@ -97,8 +96,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubygems_version: 3.4.10
- signing_key:
+ rubygems_version: 3.6.8
  specification_version: 4
  summary: Loofah is a general library for manipulating and transforming HTML/XML documents
    and fragments, built on top of Nokogiri.