loofah 2.1.0 → 2.2.0

Potentially problematic release.


This version of loofah might be problematic.

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 13e19d4b2c9e5b356dbb76d7948604f0d8dbcce3
- data.tar.gz: 6a74873ffe59ebca17b477993527d0766eb234ac
+ metadata.gz: 6656f9e5edc815b2c5ee676d1c4fb818b2dc03f4
+ data.tar.gz: 7bea1d04f8af479fd825c7adf687f0ca0c624830
  SHA512:
- metadata.gz: ccdaffd6dd04b57d35941082b95f0605ff8778d167b4d37fca6fa9654b7734c1eb820507e604c395d97a0bfa6dd8196db32200fa6e0ac8779a969252424ca9fe
- data.tar.gz: 43650a3a8b6c437f3bf323b1aa288ed1b0aa3e51cc0710ff508e2443a066c82c50fdccbaadd9e30910176f8fc1488c8613b5d75c591534aab2fc4b6e2cfca8a3
+ metadata.gz: 42f030b7228867ebf322c9d8e286349e1288ef3d60f90fe404b0d9250cc626ea6fad84ff1325cd2754ea4a7fdf80802a4bdae5a9b7121ac312e56d96c280d1a3
+ data.tar.gz: 8a67c56281a65b6e89d8623f40423ae41ed2628eeb0a90193196cfb87aeb4efccbe23c961b05ab26a247bac0117a55b68dea97ab6b67076e272ebad8471e33cb
@@ -1,10 +1,32 @@
  # Changelog
 
+ ## 2.2.0 / 2018-02-11
+
+ Features:
+
+ * Support HTML5 `<main>` tag. #133 (Thanks, @MothOnMars!)
+ * Recognize HTML5 block elements. #136 (Thanks, @MothOnMars!)
+ * Support SVG `<symbol>` tag. #131 (Thanks, @baopham!)
+ * Support for whitelisting CSS functions, initially just `calc` and `rgb`. #122/#123/#129 (Thanks, @NikoRoberts!)
+ * Whitelist CSS property `list-style-type`. #68/#137/#142 (Thanks, @andela-ysanni and @NikoRoberts!)
+
+ Bugfixes:
+
+ * Properly handle nested `script` tags. #127.
+
+
+ ## 2.1.1 / 2017-09-24
+
+ Bugfixes:
+
+ * Removed warning for unused variable. #124 (Thanks, @y-yagi!)
+
+
  ## 2.1.0 / 2017-09-24
 
  Notes:
 
- * Re-implemented CSS parsing and sanitization using the {crass}[https://github.com/rgrove/crass] library. #91
+ * Re-implemented CSS parsing and sanitization using the [crass](https://github.com/rgrove/crass) library. #91
 
 
  Features:
data/Gemfile CHANGED
@@ -15,7 +15,7 @@ gem "hoe-gemspec", ">=0", :group => [:development, :test]
  gem "hoe-debugging", ">=0", :group => [:development, :test]
  gem "hoe-bundler", ">=0", :group => [:development, :test]
  gem "hoe-git", ">=0", :group => [:development, :test]
- gem "concourse", ">=0.14.0", :group => [:development, :test]
+ gem "concourse", ">=0.15.0", :group => [:development, :test]
  gem "rdoc", "~>4.0", :group => [:development, :test]
  gem "hoe", "~>3.16", :group => [:development, :test]
 
@@ -2,7 +2,7 @@ The MIT License
 
  The MIT License
 
- Copyright (c) 2009 -- 2014 by Mike Dalessio, Bryan Helmkamp
+ Copyright (c) 2009 -- 2018 by Mike Dalessio, Bryan Helmkamp
 
  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
@@ -3,7 +3,7 @@ CHANGELOG.md
  Gemfile
  MIT-LICENSE.txt
  Manifest.txt
- README.rdoc
+ README.md
  Rakefile
  benchmark/benchmark.rb
  benchmark/fragment.html
@@ -0,0 +1,361 @@
+ # Loofah
+
+ * https://github.com/flavorjones/loofah
+ * http://rubydoc.info/github/flavorjones/loofah/master/frames
+ * http://librelist.com/browser/loofah
+
+ ## Status
+
+ |System|Status|
+ |--|--|
+ | Concourse | [![Concourse CI](https://ci.nokogiri.org/api/v1/teams/nokogiri-core/pipelines/loofah/jobs/ruby-2.5/badge)](https://ci.nokogiri.org/teams/nokogiri-core/pipelines/loofah?groups=master) |
+ | Code Climate | [![Code Climate](https://codeclimate.com/github/flavorjones/loofah.svg)](https://codeclimate.com/github/flavorjones/loofah) |
+ | Version Eye | [![Version Eye](https://www.versioneye.com/ruby/loofah/badge.png)](https://www.versioneye.com/ruby/loofah) |
+
+
+ ## Description
+
+ Loofah is a general library for manipulating and transforming HTML/XML
+ documents and fragments. It's built on top of Nokogiri and libxml2, so
+ it's fast and has a nice API.
+
+ Loofah excels at HTML sanitization (XSS prevention). It includes some
+ nice HTML sanitizers, which are based on HTML5lib's whitelist, so it
+ most likely won't make your codes less secure. (These statements have
+ not been evaluated by Netexperts.)
+
+ ActiveRecord extensions for sanitization are available in the
+ [`loofah-activerecord` gem](https://github.com/flavorjones/loofah-activerecord).
+
+
+ ## Features
+
+ * Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's whitelists).
+ * Common HTML sanitizing tasks are built-in:
+   * _Strip_ unsafe tags, leaving behind only the inner text.
+   * _Prune_ unsafe tags and their subtrees, removing all traces that they ever existed.
+   * _Escape_ unsafe tags and their subtrees, leaving behind lots of <tt>&lt;</tt> and <tt>&gt;</tt> entities.
+   * _Whitewash_ the markup, removing all attributes and namespaced nodes.
+ * Common HTML transformation tasks are built-in:
+   * Add the _nofollow_ attribute to all hyperlinks.
+ * Format markup as plain text, with or without sensible whitespace handling around block elements.
+ * Replace Rails's `strip_tags` and `sanitize` view helper methods.
+
+
+ ## Compare and Contrast
+
+ Loofah is one of two known Ruby XSS/sanitization solutions that
+ guarantee well-formed and valid markup (the other is Sanitize, which
+ also uses Nokogiri).
+
+ Loofah works on XML, XHTML and HTML documents.
+
+ Also, it's pretty fast. Here is a benchmark comparing Loofah to other
+ commonly-used libraries (ActionView, Sanitize, HTML5lib and HTMLfilter):
+
+ * https://gist.github.com/170193
+
+ Lastly, Loofah is extensible. It's super-easy to write your own custom
+ scrubbers for whatever document manipulation you need. You don't like
+ the built-in scrubbers? Build your own, like a boss.
+
+
+ ## The Basics
+
+ Loofah wraps [Nokogiri](http://nokogiri.org) in a loving
+ embrace. Nokogiri is an excellent HTML/XML parser. If you don't know
+ how Nokogiri works, you might want to pause for a moment and go check
+ it out. I'll wait.
+
+ Loofah presents the following classes:
+
+ * `Loofah::HTML::Document` and `Loofah::HTML::DocumentFragment`
+ * `Loofah::XML::Document` and `Loofah::XML::DocumentFragment`
+ * `Loofah::Scrubber`
+
+ The documents and fragments are subclasses of the similar Nokogiri classes.
+
+ The Scrubber represents the document manipulation, either by wrapping
+ a block,
+
+ ``` ruby
+ span2div = Loofah::Scrubber.new do |node|
+   node.name = "div" if node.name == "span"
+ end
+ ```
+
+ or by implementing a method.
+
+
+ ### Side Note: Fragments vs Documents
+
+ Generally speaking, unless you expect to have a DOCTYPE and a single
+ root node, you don't have a *document*, you have a *fragment*. For
+ HTML, another rule of thumb is that *documents* have `html` and `body`
+ tags, and *fragments* usually do not.
+
+ HTML fragments should be parsed with Loofah.fragment. The result won't
+ be wrapped in `html` or `body` tags, won't have a DOCTYPE declaration,
+ `head` elements will be silently ignored, and multiple root nodes are
+ allowed.
+
+ XML fragments should be parsed with Loofah.xml_fragment. The result
+ won't have a DOCTYPE declaration, and multiple root nodes are allowed.
+
+ HTML documents should be parsed with Loofah.document. The result will
+ have a DOCTYPE declaration, along with `html`, `head` and `body` tags.
+
+ XML documents should be parsed with Loofah.xml_document. The result
+ will have a DOCTYPE declaration and a single root node.
+
+
+ ### Loofah::HTML::Document and Loofah::HTML::DocumentFragment
+
+ These classes are subclasses of Nokogiri::HTML::Document and
+ Nokogiri::HTML::DocumentFragment, so you get all the markup
+ fixer-uppery and API goodness of Nokogiri.
+
+ The module methods Loofah.document and Loofah.fragment will parse an
+ HTML document and an HTML fragment, respectively.
+
+ ``` ruby
+ Loofah.document(unsafe_html).is_a?(Nokogiri::HTML::Document) # => true
+ Loofah.fragment(unsafe_html).is_a?(Nokogiri::HTML::DocumentFragment) # => true
+ ```
+
+ Loofah injects a `scrub!` method, which takes either a symbol (for
+ built-in scrubbers) or a Loofah::Scrubber object (for custom
+ scrubbers), and modifies the document in-place.
+
+ Loofah overrides `to_s` to return HTML:
+
+ ``` ruby
+ unsafe_html = "ohai! <div>div is safe</div> <script>but script is not</script>"
+
+ doc = Loofah.fragment(unsafe_html).scrub!(:prune)
+ doc.to_s # => "ohai! <div>div is safe</div> "
+ ```
+
+ and `text` to return plain text:
+
+ ``` ruby
+ doc.text # => "ohai! div is safe "
+ ```
+
+ Also, `to_text` is available, which does the right thing with
+ whitespace around block-level elements.
+
+ ``` ruby
+ doc = Loofah.fragment("<h1>Title</h1><div>Content</div>")
+ doc.text # => "TitleContent" # probably not what you want
+ doc.to_text # => "\nTitle\n\nContent\n" # better
+ ```
+
+ ### Loofah::XML::Document and Loofah::XML::DocumentFragment
+
+ These classes are subclasses of Nokogiri::XML::Document and
+ Nokogiri::XML::DocumentFragment, so you get all the markup
+ fixer-uppery and API goodness of Nokogiri.
+
+ The module methods Loofah.xml_document and Loofah.xml_fragment will
+ parse an XML document and an XML fragment, respectively.
+
+ ``` ruby
+ Loofah.xml_document(bad_xml).is_a?(Nokogiri::XML::Document) # => true
+ Loofah.xml_fragment(bad_xml).is_a?(Nokogiri::XML::DocumentFragment) # => true
+ ```
+
+ ### Nodes and NodeSets
+
+ Nokogiri::XML::Node and Nokogiri::XML::NodeSet also get a `scrub!`
+ method, which makes it easy to scrub subtrees.
+
+ The following code will apply the `employee_scrubber` only to the
+ `employee` nodes (and their subtrees) in the document:
+
+ ``` ruby
+ Loofah.xml_document(bad_xml).xpath("//employee").scrub!(employee_scrubber)
+ ```
+
+ And this code will only scrub the first `employee` node and its subtree:
+
+ ``` ruby
+ Loofah.xml_document(bad_xml).at_xpath("//employee").scrub!(employee_scrubber)
+ ```
+
+ ### Loofah::Scrubber
+
+ A Scrubber wraps up a block (or method) that is run on a document node:
+
+ ``` ruby
+ # change all <span> tags to <div> tags
+ span2div = Loofah::Scrubber.new do |node|
+   node.name = "div" if node.name == "span"
+ end
+ ```
+
+ This can then be run on a document:
+
+ ``` ruby
+ Loofah.fragment("<span>foo</span><p>bar</p>").scrub!(span2div).to_s
+ # => "<div>foo</div><p>bar</p>"
+ ```
+
+ Scrubbers can be run on a document in either a top-down traversal (the
+ default) or bottom-up. Top-down scrubbers can optionally return
+ Scrubber::STOP to terminate the traversal of a subtree. Read below and
+ in the Loofah::Scrubber class for more detailed usage.
+
+ Here's an XML example:
+
+ ``` ruby
+ # remove all <employee> tags that have a "deceased" attribute set to true
+ bring_out_your_dead = Loofah::Scrubber.new do |node|
+   if node.name == "employee" and node["deceased"] == "true"
+     node.remove
+     Loofah::Scrubber::STOP # don't bother with the rest of the subtree
+   end
+ end
+ Loofah.xml_document(File.read('plague.xml')).scrub!(bring_out_your_dead)
+ ```
+
+ ### Built-In HTML Scrubbers
+
+ Loofah comes with a set of sanitizing scrubbers that use HTML5lib's
+ whitelist algorithm:
+
+ ``` ruby
+ doc.scrub!(:strip)     # replaces unknown/unsafe tags with their inner text
+ doc.scrub!(:prune)     # removes unknown/unsafe tags and their children
+ doc.scrub!(:escape)    # escapes unknown/unsafe tags, like this: &lt;script&gt;
+ doc.scrub!(:whitewash) # removes unknown/unsafe/namespaced tags and their children,
+                        # and strips all node attributes
+ ```
+
+ Loofah also comes with some common transformation tasks:
+
+ ``` ruby
+ doc.scrub!(:nofollow)    # adds rel="nofollow" attribute to links
+ doc.scrub!(:unprintable) # removes unprintable characters from text nodes
+ ```
+
+ See Loofah::Scrubbers for more details and example usage.
+
+
+ ### Chaining Scrubbers
+
+ You can chain scrubbers:
+
+ ``` ruby
+ Loofah.fragment("<span>hello</span> <script>alert('OHAI')</script>") \
+   .scrub!(:prune) \
+   .scrub!(span2div).to_s
+ # => "<div>hello</div> "
+ ```
+
+ ### Shorthand
+
+ The class methods Loofah.scrub_fragment and Loofah.scrub_document are
+ shorthand.
+
+ ``` ruby
+ Loofah.scrub_fragment(unsafe_html, :prune)
+ Loofah.scrub_document(unsafe_html, :prune)
+ Loofah.scrub_xml_fragment(bad_xml, custom_scrubber)
+ Loofah.scrub_xml_document(bad_xml, custom_scrubber)
+ ```
+
+ are the same thing as (and arguably semantically clearer than):
+
+ ``` ruby
+ Loofah.fragment(unsafe_html).scrub!(:prune)
+ Loofah.document(unsafe_html).scrub!(:prune)
+ Loofah.xml_fragment(bad_xml).scrub!(custom_scrubber)
+ Loofah.xml_document(bad_xml).scrub!(custom_scrubber)
+ ```
+
+
+ ### View Helpers
+
+ Loofah has two "view helpers": Loofah::Helpers.sanitize and
+ Loofah::Helpers.strip_tags, both of which are drop-in replacements for
+ the Rails ActionView helpers of the same name.
+ These are no longer required automatically. You must require `loofah/helpers`.
+
+
+ ## Requirements
+
+ * Nokogiri >= 1.5.9
+
+
+ ## Installation
+
+ Unsurprisingly:
+
+ * gem install loofah
+
+
+ ## Support
+
+ The bug tracker is available here:
+
+ * https://github.com/flavorjones/loofah/issues
+
+ And the mailing list is on librelist:
+
+ * loofah@librelist.com / http://librelist.com
+
+ And the IRC channel is \#loofah on freenode.
+
+
+ ## Security
+
+ Some tools may incorrectly report loofah is a potential security
+ vulnerability. Loofah depends on Nokogiri, and it's possible to use
+ Nokogiri in a dangerous way (by enabling its DTDLOAD option and
+ disabling its NONET option). This dangerous Nokogiri configuration,
+ which is sometimes used by other components, can create an XML
+ External Entity (XXE) vulnerability if the XML data is not trusted.
+ However, loofah never enables this dangerous Nokogiri configuration;
+ loofah never enables DTDLOAD, and it never disables NONET.
+
+
+ ## Related Links
+
+ * Nokogiri: http://nokogiri.org
+ * libxml2: http://xmlsoft.org
+ * html5lib: https://code.google.com/p/html5lib
+
+
+ ## Authors
+
+ * [Mike Dalessio](http://mike.daless.io) ([@flavorjones](https://twitter.com/flavorjones))
+ * Bryan Helmkamp
+
+ Featuring code contributed by:
+
+ * Aaron Patterson
+ * John Barnette
+ * Josh Owens
+ * Paul Dix
+ * Luke Melia
+
+ And a big shout-out to Corey Innis for the name, and feedback on the API.
+
+
+ ## Thank You
+
+ The following people have generously donated via the [Pledgie](http://pledgie.com) badge on the [Loofah github page](https://github.com/flavorjones/loofah):
+
+ * Bill Harding
+
+
+ ## Historical Note
+
+ This library was formerly known as Dryopteris, which was a very bad
+ name that nobody could spell properly.
+
+
+ ## License
+
+ Distributed under the MIT License. See `MIT-LICENSE.txt` for details.
data/Rakefile CHANGED
@@ -28,7 +28,7 @@ Hoe.spec "loofah" do
  extra_dev_deps << ["hoe-debugging", ">=0"]
  extra_dev_deps << ["hoe-bundler", ">=0"]
  extra_dev_deps << ["hoe-git", ">=0"]
- extra_dev_deps << ["concourse", ">=0.14.0"]
+ extra_dev_deps << ["concourse", ">=0.15.0"]
  end
 
  task :gemspec do
@@ -27,7 +27,7 @@ require 'loofah/html/document_fragment'
  #
  module Loofah
  # The version of Loofah you are using
- VERSION = '2.1.0'
+ VERSION = '2.2.0'
 
  class << self
  # Shortcut for Loofah::HTML::Document.parse