loofah 2.21.3 → 2.24.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +49 -0
- data/README.md +60 -9
- data/lib/loofah/html5/safelist.rb +3 -0
- data/lib/loofah/html5/scrub.rb +7 -5
- data/lib/loofah/scrubbers.rb +122 -0
- data/lib/loofah/version.rb +1 -1
- metadata +4 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 16850a48486ab3e9191ceff0a4fd6d768f82151049332ae162068f6712efccb8
|
4
|
+
data.tar.gz: 6ccd67672b489120796711e08643cbaec9c88648622fc0c3a1ac013e49534b25
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: b2a4f569f20365f63d548506946736a20ee195a3b4149228489c39f1d6fddf2fe9c774ded5d88d0d3bd547a00110b42ab37d582f8701a01eb2a047070cc2b440
|
7
|
+
data.tar.gz: 2bca5a9c58d363251e8ca5b3803a57b73e51506e9d294e45d69d1fef376b658f31901a359315c6d60974d469047e78307e7cd33005884314e98bf9d2775bd36a
|
data/CHANGELOG.md
CHANGED
@@ -1,7 +1,56 @@
|
|
1
1
|
# Changelog
|
2
2
|
|
3
|
+
## 2.24.1 / 2025-05-12
|
4
|
+
|
5
|
+
### Ruby support
|
6
|
+
|
7
|
+
* Import only what's needed from `cgi` for support for Ruby 3.5 #296 @Earlopain
|
8
|
+
|
9
|
+
|
10
|
+
## 2.24.0 / 2024-12-24
|
11
|
+
|
12
|
+
### Added
|
13
|
+
|
14
|
+
* Built-in scrubber `:double_breakpoint` which sees `<br><br>` and wraps the surrounding content in `<p>` tags. #279, #284 @josecolella @torihuang
|
15
|
+
|
16
|
+
### Improved
|
17
|
+
|
18
|
+
* Built-in scrubber `:targetblank` now skips `a` tags whose `href` attribute is an anchor link. Previously, all `a` tags were modified to have `target='_blank'`. #291 @fnando
|
19
|
+
|
20
|
+
|
21
|
+
## 2.23.1 / 2024-10-25
|
22
|
+
|
23
|
+
### Added
|
24
|
+
|
25
|
+
* Allow CSS properties `min-height` and `max-height`. [#288] @lazyatom
|
26
|
+
|
27
|
+
|
28
|
+
## 2.23.0 / 2024-10-24
|
29
|
+
|
30
|
+
### Added
|
31
|
+
|
32
|
+
* Allow CSS property `min-width`. [#287] @lazyatom
|
33
|
+
|
34
|
+
|
35
|
+
## 2.22.0 / 2023-11-13
|
36
|
+
|
37
|
+
### Added
|
38
|
+
|
39
|
+
* A `:targetblank` HTML scrubber which ensures all hyperlinks have `target="_blank"`. [#275] @stefannibrasil and @thdaraujo
|
40
|
+
* A `:noreferrer` HTML scrubber which ensures all hyperlinks have `rel=noreferrer`, similar to the `:nofollow` and `:noopener` scrubbers. [#277] @wynksaiddestroy
|
41
|
+
|
42
|
+
|
43
|
+
## 2.21.4 / 2023-10-10
|
44
|
+
|
45
|
+
### Fixed
|
46
|
+
|
47
|
+
* `Loofah::HTML5::Scrub.scrub_css` is more consistent in preserving whitespace (and lack of whitespace) in CSS property values. In particular, `.scrub_css` no longer inserts whitespace between tokens that did not already have whitespace between them. [[#273](https://github.com/flavorjones/loofah/issues/273), fixes [#271](https://github.com/flavorjones/loofah/issues/271)]
|
48
|
+
|
49
|
+
|
3
50
|
## 2.21.3 / 2023-05-15
|
4
51
|
|
52
|
+
### Fixed
|
53
|
+
|
5
54
|
* Quash "instance variable not initialized" warning in Ruby < 3.0. [[#268](https://github.com/flavorjones/loofah/issues/268)] (Thanks, [@dharamgollapudi](https://github.com/dharamgollapudi)!)
|
6
55
|
|
7
56
|
|
data/README.md
CHANGED
@@ -29,7 +29,10 @@ Active Record extensions for HTML sanitization are available in the [`loofah-act
|
|
29
29
|
* _Whitewash_ the markup, removing all attributes and namespaced nodes.
|
30
30
|
* Other common HTML transformations are built-in:
|
31
31
|
* Add the _nofollow_ attribute to all hyperlinks.
|
32
|
+
* Add the _target=\_blank_ attribute to all hyperlinks.
|
32
33
|
* Remove _unprintable_ characters from text nodes.
|
34
|
+
* Some specialized HTML transformations are also built-in:
|
35
|
+
* Where `<br><br>` exists inside a `p` tag, close the `p` and open a new one.
|
33
36
|
* Format markup as plain text, with (or without) sensible whitespace handling around block elements.
|
34
37
|
* Replace Rails's `strip_tags` and `sanitize` view helper methods.
|
35
38
|
|
@@ -226,11 +229,15 @@ doc.scrub!(:whitewash) # removes unknown/unsafe/namespaced tags and their chi
|
|
226
229
|
# and strips all node attributes
|
227
230
|
```
|
228
231
|
|
229
|
-
Loofah also comes with some common transformation tasks:
|
232
|
+
Loofah also comes with built-in scrubbers for some common transformation tasks:
|
230
233
|
|
231
234
|
``` ruby
|
232
|
-
doc.scrub!(:nofollow)
|
233
|
-
doc.scrub!(:
|
235
|
+
doc.scrub!(:nofollow) # adds rel="nofollow" attribute to links
|
236
|
+
doc.scrub!(:noopener) # adds rel="noopener" attribute to links
|
237
|
+
doc.scrub!(:noreferrer) # adds rel="noreferrer" attribute to links
|
238
|
+
doc.scrub!(:unprintable) # removes unprintable characters from text nodes
|
239
|
+
doc.scrub!(:targetblank) # adds target="_blank" attribute to links
|
240
|
+
doc.scrub!(:double_breakpoint) # where `<br><br>` appears in a `p` tag, close the `p` and open a new one
|
234
241
|
```
|
235
242
|
|
236
243
|
See `Loofah::Scrubbers` for more details and example usage.
|
@@ -333,20 +340,64 @@ See [`SECURITY.md`](SECURITY.md) for vulnerability reporting details.
|
|
333
340
|
|
334
341
|
Featuring code contributed by:
|
335
342
|
|
336
|
-
*
|
337
|
-
*
|
338
|
-
*
|
339
|
-
*
|
340
|
-
*
|
343
|
+
* [@flavorjones](https://github.com/flavorjones)
|
344
|
+
* [@brynary](https://github.com/brynary)
|
345
|
+
* [@olleolleolle](https://github.com/olleolleolle)
|
346
|
+
* [@JuanitoFatas](https://github.com/JuanitoFatas)
|
347
|
+
* [@kaspth](https://github.com/kaspth)
|
348
|
+
* [@tenderlove](https://github.com/tenderlove)
|
349
|
+
* [@ktdreyer](https://github.com/ktdreyer)
|
350
|
+
* [@orien](https://github.com/orien)
|
351
|
+
* [@asok](https://github.com/asok)
|
352
|
+
* [@junaruga](https://github.com/junaruga)
|
353
|
+
* [@MothOnMars](https://github.com/MothOnMars)
|
354
|
+
* [@nick-desteffen](https://github.com/nick-desteffen)
|
355
|
+
* [@NikoRoberts](https://github.com/NikoRoberts)
|
356
|
+
* [@trans](https://github.com/trans)
|
357
|
+
* [@andreynering](https://github.com/andreynering)
|
358
|
+
* [@aried3r](https://github.com/aried3r)
|
359
|
+
* [@baopham](https://github.com/baopham)
|
360
|
+
* [@batter](https://github.com/batter)
|
361
|
+
* [@brendon](https://github.com/brendon)
|
362
|
+
* [@cjba7](https://github.com/cjba7)
|
363
|
+
* [@christiankisssner](https://github.com/christiankisssner)
|
364
|
+
* [@dacort](https://github.com/dacort)
|
365
|
+
* [@danfstucky](https://github.com/danfstucky)
|
366
|
+
* [@david-a-wheeler](https://github.com/david-a-wheeler)
|
367
|
+
* [@dharamgollapudi](https://github.com/dharamgollapudi)
|
368
|
+
* [@georgeclaghorn](https://github.com/georgeclaghorn)
|
369
|
+
* [@gogainda](https://github.com/gogainda)
|
370
|
+
* [@jaredbeck](https://github.com/jaredbeck)
|
371
|
+
* [@ThatHurleyGuy](https://github.com/ThatHurleyGuy)
|
372
|
+
* [@jstorimer](https://github.com/jstorimer)
|
373
|
+
* [@jbarnette](https://github.com/jbarnette)
|
374
|
+
* [@queso](https://github.com/queso)
|
375
|
+
* [@technicalpickles](https://github.com/technicalpickles)
|
376
|
+
* [@kyoshidajp](https://github.com/kyoshidajp)
|
377
|
+
* [@kristianfreeman](https://github.com/kristianfreeman)
|
378
|
+
* [@louim](https://github.com/louim)
|
379
|
+
* [@mrpasquini](https://github.com/mrpasquini)
|
380
|
+
* [@olivierlacan](https://github.com/olivierlacan)
|
381
|
+
* [@pauldix](https://github.com/pauldix)
|
382
|
+
* [@sampokuokkanen](https://github.com/sampokuokkanen)
|
383
|
+
* [@stefannibrasil](https://github.com/stefannibrasil)
|
384
|
+
* [@tastycode](https://github.com/tastycode)
|
385
|
+
* [@vipulnsward](https://github.com/vipulnsward)
|
386
|
+
* [@joncalhoun](https://github.com/joncalhoun)
|
387
|
+
* [@ahorek](https://github.com/ahorek)
|
388
|
+
* [@rmacklin](https://github.com/rmacklin)
|
389
|
+
* [@y-yagi](https://github.com/y-yagi)
|
390
|
+
* [@lazyatom](https://github.com/lazyatom)
|
341
391
|
|
342
392
|
And a big shout-out to Corey Innis for the name, and feedback on the API.
|
343
393
|
|
344
394
|
|
345
395
|
## Thank You
|
346
396
|
|
347
|
-
The following people have generously funded Loofah:
|
397
|
+
The following people have generously funded Loofah with financial sponsorship:
|
348
398
|
|
349
399
|
* Bill Harding
|
400
|
+
* [Sentry](https://sentry.io/) @getsentry
|
350
401
|
|
351
402
|
|
352
403
|
## Historical Note
|
data/lib/loofah/html5/scrub.rb
CHANGED
@@ -1,6 +1,7 @@
|
|
1
1
|
# frozen_string_literal: true
|
2
2
|
|
3
|
-
require "cgi"
|
3
|
+
require "cgi/escape"
|
4
|
+
require "cgi/util" if RUBY_VERSION < "3.5"
|
4
5
|
require "crass"
|
5
6
|
|
6
7
|
module Loofah
|
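Note on the `require` change in this hunk: the sketch below reproduces the same conditional-loading pattern using only Ruby's standard library, to show that the escaping helper remains available without loading all of `cgi`. The sample string and the `escaped` variable are illustrative assumptions and do not come from loofah.

```ruby
# Load only the escaping helpers instead of the whole "cgi" library.
require "cgi/escape"
# Same guard as in the diff: cgi/util is loaded only on Rubies older than 3.5.
require "cgi/util" if RUBY_VERSION < "3.5"

# Illustrative input (an assumption, not taken from loofah).
escaped = CGI.escapeHTML("<b>")
puts escaped
```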
@@ -10,6 +11,7 @@ module Loofah
|
|
10
11
|
CSS_KEYWORDISH = /\A(#[0-9a-fA-F]+|rgb\(\d+%?,\d*%?,?\d*%?\)?|-?\d{0,3}\.?\d{0,10}(ch|cm|r?em|ex|in|lh|mm|pc|pt|px|Q|vmax|vmin|vw|vh|%|,|\))?)\z/ # rubocop:disable Layout/LineLength
|
11
12
|
CRASS_SEMICOLON = { node: :semicolon, raw: ";" }
|
12
13
|
CSS_IMPORTANT = "!important"
|
14
|
+
CSS_WHITESPACE = " "
|
13
15
|
CSS_PROPERTY_STRING_WITHOUT_EMBEDDED_QUOTES = /\A(["'])?[^"']+\1\z/
|
14
16
|
DATA_ATTRIBUTE_NAME = /\Adata-[\w-]+\z/
|
15
17
|
|
@@ -87,7 +89,7 @@ module Loofah
|
|
87
89
|
value = node[:children].map do |child|
|
88
90
|
case child[:node]
|
89
91
|
when :whitespace
|
90
|
-
|
92
|
+
CSS_WHITESPACE
|
91
93
|
when :string
|
92
94
|
if CSS_PROPERTY_STRING_WITHOUT_EMBEDDED_QUOTES.match?(child[:raw])
|
93
95
|
Crass::Parser.stringify(child)
|
@@ -106,12 +108,12 @@ module Loofah
|
|
106
108
|
else
|
107
109
|
child[:raw]
|
108
110
|
end
|
109
|
-
end.compact
|
111
|
+
end.compact.join.strip
|
110
112
|
|
111
113
|
next if value.empty?
|
112
114
|
|
113
|
-
value << CSS_IMPORTANT if node[:important]
|
114
|
-
propstring = format("%s:%s", name, value
|
115
|
+
value << CSS_WHITESPACE << CSS_IMPORTANT if node[:important]
|
116
|
+
propstring = format("%s:%s", name, value)
|
115
117
|
sanitized_node = Crass.parse_properties(propstring).first
|
116
118
|
sanitized_tree << sanitized_node << CRASS_SEMICOLON
|
117
119
|
end
|
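Note on the `scrub_css` change above: a minimal sketch of how the new join logic appears to rebuild a property value, assuming child nodes are plain hashes with `:node` and `:raw` keys. The `rebuild_value` helper and the sample token lists are hypothetical and exist only for illustration; the real method works on nodes produced by Crass and handles more node types than shown here.

```ruby
CSS_WHITESPACE = " "
CSS_IMPORTANT = "!important"

# Hypothetical helper: mirrors the map / compact / join / strip sequence
# from the diff, using simple hashes in place of Crass nodes.
def rebuild_value(children, important: false)
  value = children.map do |child|
    case child[:node]
    when :whitespace
      CSS_WHITESPACE # any run of whitespace collapses to one space
    else
      child[:raw]    # other tokens are emitted as-is, with nothing added between them
    end
  end.compact.join.strip

  return nil if value.empty?

  value << CSS_WHITESPACE << CSS_IMPORTANT if important
  value
end

# Tokens that had whitespace between them keep a single space.
spaced = [
  { node: :ident, raw: "1px" },
  { node: :whitespace, raw: "   " },
  { node: :ident, raw: "solid" },
]
# Tokens that had no whitespace between them stay adjacent.
tight = [
  { node: :ident, raw: "a" },
  { node: :comma, raw: "," },
  { node: :ident, raw: "b" },
]

puts rebuild_value(spaced)
puts rebuild_value(tight)
puts rebuild_value(spaced, important: true)
```

Joining the mapped tokens with an empty separator is what keeps the output free of whitespace that was not in the input, which matches the 2.21.4 changelog entry.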
data/lib/loofah/scrubbers.rb
CHANGED
@@ -61,6 +61,15 @@ module Loofah
|
|
61
61
|
# => "ohai! <a href='http://www.myswarmysite.com/' rel="nofollow">I like your blog post</a>"
|
62
62
|
#
|
63
63
|
#
|
64
|
+
# === Loofah::Scrubbers::TargetBlank / scrub!(:targetblank)
|
65
|
+
#
|
66
|
+
# +:targetblank+ adds a target="_blank" attribute to all links
|
67
|
+
#
|
68
|
+
# link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
|
69
|
+
# Loofah.html5_fragment(link_farmers_markup).scrub!(:targetblank)
|
70
|
+
# => "ohai! <a href='http://www.myswarmysite.com/' target="_blank">I like your blog post</a>"
|
71
|
+
#
|
72
|
+
#
|
64
73
|
# === Loofah::Scrubbers::NoOpener / scrub!(:noopener)
|
65
74
|
#
|
66
75
|
# +:noopener+ adds a rel="noopener" attribute to all links
|
@@ -69,6 +78,14 @@ module Loofah
|
|
69
78
|
# Loofah.html5_fragment(link_farmers_markup).scrub!(:noopener)
|
70
79
|
# => "ohai! <a href='http://www.myswarmysite.com/' rel="noopener">I like your blog post</a>"
|
71
80
|
#
|
81
|
+
# === Loofah::Scrubbers::NoReferrer / scrub!(:noreferrer)
|
82
|
+
#
|
83
|
+
# +:noreferrer+ adds a rel="noreferrer" attribute to all links
|
84
|
+
#
|
85
|
+
# link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
|
86
|
+
# Loofah.html5_fragment(link_farmers_markup).scrub!(:noreferrer)
|
87
|
+
# => "ohai! <a href='http://www.myswarmysite.com/' rel="noreferrer">I like your blog post</a>"
|
88
|
+
#
|
72
89
|
#
|
73
90
|
# === Loofah::Scrubbers::Unprintable / scrub!(:unprintable)
|
74
91
|
#
|
@@ -213,6 +230,35 @@ module Loofah
|
|
213
230
|
end
|
214
231
|
end
|
215
232
|
|
233
|
+
#
|
234
|
+
# === scrub!(:targetblank)
|
235
|
+
#
|
236
|
+
# +:targetblank+ adds a target="_blank" attribute to all links.
|
237
|
+
# If there is a target already set, replaces it with target="_blank".
|
238
|
+
#
|
239
|
+
# link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
|
240
|
+
# Loofah.html5_fragment(link_farmers_markup).scrub!(:targetblank)
|
241
|
+
# => "ohai! <a href='http://www.myswarmysite.com/' target="_blank">I like your blog post</a>"
|
242
|
+
#
|
243
|
+
# On modern browsers, setting target="_blank" on anchor elements implicitly provides the same
|
244
|
+
# behavior as setting rel="noopener".
|
245
|
+
#
|
246
|
+
class TargetBlank < Scrubber
|
247
|
+
def initialize # rubocop:disable Lint/MissingSuper
|
248
|
+
@direction = :top_down
|
249
|
+
end
|
250
|
+
|
251
|
+
def scrub(node)
|
252
|
+
return CONTINUE unless (node.type == Nokogiri::XML::Node::ELEMENT_NODE) && (node.name == "a")
|
253
|
+
|
254
|
+
href = node["href"]
|
255
|
+
|
256
|
+
node.set_attribute("target", "_blank") if href && href[0] != "#"
|
257
|
+
|
258
|
+
STOP
|
259
|
+
end
|
260
|
+
end
|
261
|
+
|
216
262
|
#
|
217
263
|
# === scrub!(:noopener)
|
218
264
|
#
|
@@ -235,6 +281,28 @@ module Loofah
|
|
235
281
|
end
|
236
282
|
end
|
237
283
|
|
284
|
+
#
|
285
|
+
# === scrub!(:noreferrer)
|
286
|
+
#
|
287
|
+
# +:noreferrer+ adds a rel="noreferrer" attribute to all links
|
288
|
+
#
|
289
|
+
# link_farmers_markup = "ohai! <a href='http://www.myswarmysite.com/'>I like your blog post</a>"
|
290
|
+
# Loofah.html5_fragment(link_farmers_markup).scrub!(:noreferrer)
|
291
|
+
# => "ohai! <a href='http://www.myswarmysite.com/' rel="noreferrer">I like your blog post</a>"
|
292
|
+
#
|
293
|
+
class NoReferrer < Scrubber
|
294
|
+
def initialize # rubocop:disable Lint/MissingSuper
|
295
|
+
@direction = :top_down
|
296
|
+
end
|
297
|
+
|
298
|
+
def scrub(node)
|
299
|
+
return CONTINUE unless (node.type == Nokogiri::XML::Node::ELEMENT_NODE) && (node.name == "a")
|
300
|
+
|
301
|
+
append_attribute(node, "rel", "noreferrer")
|
302
|
+
STOP
|
303
|
+
end
|
304
|
+
end
|
305
|
+
|
238
306
|
# This class probably isn't useful publicly, but is used for #to_text's current implementation
|
239
307
|
class NewlineBlockElements < Scrubber # :nodoc:
|
240
308
|
def initialize # rubocop:disable Lint/MissingSuper
|
@@ -282,6 +350,57 @@ module Loofah
|
|
282
350
|
end
|
283
351
|
end
|
284
352
|
|
353
|
+
#
|
354
|
+
# === scrub!(:double_breakpoint)
|
355
|
+
#
|
356
|
+
# +:double_breakpoint+ replaces double-break tags with closing/opening paragraph tags.
|
357
|
+
#
|
358
|
+
# markup = "<p>Some text here in a logical paragraph.<br><br>Some more text, apparently a second paragraph.</p>"
|
359
|
+
# Loofah.html5_fragment(markup).scrub!(:double_breakpoint)
|
360
|
+
# => "<p>Some text here in a logical paragraph.</p><p>Some more text, apparently a second paragraph.</p>"
|
361
|
+
#
|
362
|
+
class DoubleBreakpoint < Scrubber
|
363
|
+
def initialize # rubocop:disable Lint/MissingSuper
|
364
|
+
@direction = :top_down
|
365
|
+
end
|
366
|
+
|
367
|
+
def scrub(node)
|
368
|
+
return CONTINUE unless (node.type == Nokogiri::XML::Node::ELEMENT_NODE) && (node.name == "p")
|
369
|
+
|
370
|
+
paragraph_with_break_point_nodes = node.xpath("//p[br[following-sibling::br]]")
|
371
|
+
|
372
|
+
paragraph_with_break_point_nodes.each do |paragraph_node|
|
373
|
+
new_paragraph = paragraph_node.add_previous_sibling("<p>").first
|
374
|
+
|
375
|
+
paragraph_node.children.each do |child|
|
376
|
+
remove_blank_text_nodes(child)
|
377
|
+
end
|
378
|
+
|
379
|
+
paragraph_node.children.each do |child|
|
380
|
+
# already unlinked
|
381
|
+
next if child.parent.nil?
|
382
|
+
|
383
|
+
if child.name == "br" && child.next_sibling.name == "br"
|
384
|
+
new_paragraph = paragraph_node.add_previous_sibling("<p>").first
|
385
|
+
child.next_sibling.unlink
|
386
|
+
child.unlink
|
387
|
+
else
|
388
|
+
child.parent = new_paragraph
|
389
|
+
end
|
390
|
+
end
|
391
|
+
|
392
|
+
paragraph_node.unlink
|
393
|
+
end
|
394
|
+
|
395
|
+
CONTINUE
|
396
|
+
end
|
397
|
+
|
398
|
+
private
|
399
|
+
|
400
|
+
def remove_blank_text_nodes(node)
|
401
|
+
node.unlink if node.text? && node.blank?
|
402
|
+
end
|
403
|
+
end
|
285
404
|
#
|
286
405
|
# A hash that maps a symbol (like +:prune+) to the appropriate Scrubber (Loofah::Scrubbers::Prune).
|
287
406
|
#
|
@@ -292,8 +411,11 @@ module Loofah
|
|
292
411
|
strip: Strip,
|
293
412
|
nofollow: NoFollow,
|
294
413
|
noopener: NoOpener,
|
414
|
+
noreferrer: NoReferrer,
|
415
|
+
targetblank: TargetBlank,
|
295
416
|
newline_block_elements: NewlineBlockElements,
|
296
417
|
unprintable: Unprintable,
|
418
|
+
double_breakpoint: DoubleBreakpoint,
|
297
419
|
}
|
298
420
|
|
299
421
|
class << self
|
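Note on `TargetBlank#scrub` above: the condition `href && href[0] != "#"` decides which links receive `target="_blank"`. The sketch below isolates that condition in plain Ruby, without Nokogiri or loofah. The `needs_target_blank?` helper name and the sample URLs are hypothetical.

```ruby
# Hypothetical helper mirroring the condition used in TargetBlank#scrub:
# the link must have an href, and the href must not start with "#".
def needs_target_blank?(href)
  !!(href && href[0] != "#")
end

# Illustrative inputs (assumptions, not taken from loofah's test suite).
puts needs_target_blank?("http://www.example.com/") # external link
puts needs_target_blank?("#section-2")              # in-page anchor link
puts needs_target_blank?(nil)                       # <a> without an href
```

Under this condition, anchor links and `a` tags without an `href` are left unchanged, which is consistent with the 2.24.0 changelog entry for `:targetblank`.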
data/lib/loofah/version.rb
CHANGED
metadata
CHANGED
@@ -1,15 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: loofah
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 2.
|
4
|
+
version: 2.24.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Mike Dalessio
|
8
8
|
- Bryan Helmkamp
|
9
|
-
autorequire:
|
10
9
|
bindir: bin
|
11
10
|
cert_chain: []
|
12
|
-
date:
|
11
|
+
date: 1980-01-02 00:00:00.000000000 Z
|
13
12
|
dependencies:
|
14
13
|
- !ruby/object:Gem::Dependency
|
15
14
|
name: crass
|
@@ -82,7 +81,7 @@ metadata:
|
|
82
81
|
bug_tracker_uri: https://github.com/flavorjones/loofah/issues
|
83
82
|
changelog_uri: https://github.com/flavorjones/loofah/blob/main/CHANGELOG.md
|
84
83
|
documentation_uri: https://www.rubydoc.info/gems/loofah/
|
85
|
-
|
84
|
+
funding_uri: https://github.com/sponsors/flavorjones
|
86
85
|
rdoc_options: []
|
87
86
|
require_paths:
|
88
87
|
- lib
|
@@ -97,8 +96,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
97
96
|
- !ruby/object:Gem::Version
|
98
97
|
version: '0'
|
99
98
|
requirements: []
|
100
|
-
rubygems_version: 3.
|
101
|
-
signing_key:
|
99
|
+
rubygems_version: 3.6.8
|
102
100
|
specification_version: 4
|
103
101
|
summary: Loofah is a general library for manipulating and transforming HTML/XML documents
|
104
102
|
and fragments, built on top of Nokogiri.
|