loofah 2.1.0 → 2.2.0

Potentially problematic release.


metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: loofah
  version: !ruby/object:Gem::Version
- version: 2.1.0
+ version: 2.2.0
  platform: ruby
  authors:
  - Mike Dalessio
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2017-09-24 00:00:00.000000000 Z
+ date: 2018-02-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: nokogiri
@@ -157,14 +157,14 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.14.0
+ version: 0.15.0
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 0.14.0
+ version: 0.15.0
  - !ruby/object:Gem::Dependency
  name: rdoc
  requirement: !ruby/object:Gem::Requirement
@@ -193,19 +193,7 @@ dependencies:
  - - "~>"
  - !ruby/object:Gem::Version
  version: '3.16'
- description: |-
- Loofah is a general library for manipulating and transforming HTML/XML
- documents and fragments. It's built on top of Nokogiri and libxml2, so
- it's fast and has a nice API.
-
- Loofah excels at HTML sanitization (XSS prevention). It includes some
- nice HTML sanitizers, which are based on HTML5lib's whitelist, so it
- most likely won't make your codes less secure. (These statements have
- not been evaluated by Netexperts.)
-
- ActiveRecord extensions for sanitization are available in the
- `loofah-activerecord` gem (see
- https://github.com/flavorjones/loofah-activerecord).
+ description: ''
  email:
  - mike.dalessio@gmail.com
  - bryan@brynary.com
@@ -215,14 +203,14 @@ extra_rdoc_files:
  - CHANGELOG.md
  - MIT-LICENSE.txt
  - Manifest.txt
- - README.rdoc
+ - README.md
  files:
  - ".gemtest"
  - CHANGELOG.md
  - Gemfile
  - MIT-LICENSE.txt
  - Manifest.txt
- - README.rdoc
+ - README.md
  - Rakefile
  - benchmark/benchmark.rb
  - benchmark/fragment.html
@@ -254,7 +242,7 @@ files:
  - test/unit/test_helpers.rb
  - test/unit/test_scrubber.rb
  - test/unit/test_scrubbers.rb
- homepage: https://github.com/flavorjones/loofah
+ homepage:
  licenses:
  - MIT
  metadata: {}
@@ -279,6 +267,5 @@ rubyforge_project:
  rubygems_version: 2.6.12
  signing_key:
  specification_version: 4
- summary: Loofah is a general library for manipulating and transforming HTML/XML documents
- and fragments
+ summary: ''
  test_files: []
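
The metadata hunks above bump the version, raise the floor of one development dependency from 0.14.0 to 0.15.0, and leave `description`, `summary` and `homepage` empty in 2.2.0. The following is a minimal Ruby sketch of how those points could be checked with RubyGems' own classes. It rests on assumptions: both specs are rebuilt by hand from values visible in the diff rather than loaded from the published gems, the old description is cut to its first sentence, and `blank_fields` is a hypothetical helper that is not part of RubyGems or loofah.

```ruby
require "rubygems"

# Hand-built stand-ins for the two gemspecs (assumption: only fields shown
# in the diff are filled in; everything else keeps RubyGems defaults).
old_spec = Gem::Specification.new do |s|
  s.name        = "loofah"
  s.version     = "2.1.0"
  s.summary     = "Loofah is a general library for manipulating and " \
                  "transforming HTML/XML documents and fragments"
  # First sentence of the old description only.
  s.description = "Loofah is a general library for manipulating and " \
                  "transforming HTML/XML documents and fragments."
  s.homepage    = "https://github.com/flavorjones/loofah"
end

new_spec = Gem::Specification.new do |s|
  s.name        = "loofah"
  s.version     = "2.2.0"
  s.summary     = ""
  s.description = ""
  s.homepage    = nil
end

# Hypothetical helper: names of descriptive fields that are nil or empty.
def blank_fields(spec)
  %i[summary description homepage].select do |field|
    spec.public_send(field).to_s.strip.empty?
  end
end

# The development dependency floor after the change (its name falls outside
# the hunk, so it is left unnamed here).
dev_floor = Gem::Requirement.new(">= 0.15.0")

puts new_spec.version > old_spec.version                  # true
puts dev_floor.satisfied_by?(Gem::Version.new("0.14.0"))  # false
puts blank_fields(old_spec).inspect                       # []
puts blank_fields(new_spec).inspect                       # [:summary, :description, :homepage]
```

A check along these lines only reports which descriptive fields are empty; it does not say why they were emptied or whether that is the reason the release was flagged.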
@@ -1,314 +0,0 @@
- = Loofah {<img src="https://travis-ci.org/flavorjones/loofah.svg?branch=master" alt="Build Status" />}[https://travis-ci.org/flavorjones/loofah]
-
- * https://github.com/flavorjones/loofah
- * http://rubydoc.info/github/flavorjones/loofah/master/frames
- * http://librelist.com/browser/loofah
-
- == Description
-
- Loofah is a general library for manipulating and transforming HTML/XML
- documents and fragments. It's built on top of Nokogiri and libxml2, so
- it's fast and has a nice API.
-
- Loofah excels at HTML sanitization (XSS prevention). It includes some
- nice HTML sanitizers, which are based on HTML5lib's whitelist, so it
- most likely won't make your codes less secure. (These statements have
- not been evaluated by Netexperts.)
-
- ActiveRecord extensions for sanitization are available in the
- `loofah-activerecord` gem (see
- https://github.com/flavorjones/loofah-activerecord).
-
- == Features
-
- * Easily write custom scrubbers for HTML/XML leveraging the sweetness of Nokogiri (and HTML5lib's whitelists).
- * Common HTML sanitizing tasks are built-in:
- * _Strip_ unsafe tags, leaving behind only the inner text.
- * _Prune_ unsafe tags and their subtrees, removing all traces that they ever existed.
- * _Escape_ unsafe tags and their subtrees, leaving behind lots of <tt>&lt;</tt> and <tt>&gt;</tt> entities.
- * _Whitewash_ the markup, removing all attributes and namespaced nodes.
- * Common HTML transformation tasks are built-in:
- * Add the _nofollow_ attribute to all hyperlinks.
- * Format markup as plain text, with or without sensible whitespace handling around block elements.
- * Replace Rails's +strip_tags+ and +sanitize+ view helper methods.
-
- == Compare and Contrast
-
- Loofah is one of two known Ruby XSS/sanitization solutions that
- guarantees well-formed and valid markup (the other is Sanitize, which
- also uses Nokogiri).
-
- Loofah works on XML, XHTML and HTML documents.
-
- Also, it's pretty fast. Here is a benchmark comparing Loofah to other
- commonly-used libraries (ActionView, Sanitize, HTML5lib and HTMLfilter):
-
- * https://gist.github.com/170193
-
- Lastly, Loofah is extensible. It's super-easy to write your own custom
- scrubbers for whatever document manipulation you need. You don't like
- the built-in scrubbers? Build your own, like a boss.
-
- == The Basics
-
- Loofah wraps Nokogiri[http://nokogiri.org] in a loving
- embrace. Nokogiri[http://nokogiri.org] is an excellent HTML/XML
- parser. If you don't know how Nokogiri[http://nokogiri.org] works, you
- might want to pause for a moment and go check it out. I'll wait.
-
- Loofah presents the following classes:
-
- * Loofah::HTML::Document and Loofah::HTML::DocumentFragment
- * Loofah::XML::Document and Loofah::XML::DocumentFragment
- * Loofah::Scrubber
-
- The documents and fragments are subclasses of the similar Nokogiri classes.
-
- The Scrubber represents the document manipulation, either by wrapping
- a block,
-
- span2div = Loofah::Scrubber.new do |node|
- node.name = "div" if node.name == "span"
- end
-
- or by implementing a method.
-
- === Side Note: Fragments vs Documents
-
- Generally speaking, unless you expect to have a DOCTYPE and a single
- root node, you don't have a *document*, you have a *fragment*. For
- HTML, another rule of thumb is that *documents* have +html+ and +body+
- tags, and *fragments* usually do not.
-
- HTML fragments should be parsed with Loofah.fragment. The result won't
- be wrapped in +html+ or +body+ tags, won't have a DOCTYPE declaration,
- +head+ elements will be silently ignored, and multiple root nodes are
- allowed.
-
- XML fragments should be parsed with Loofah.xml_fragment. The result
- won't have a DOCTYPE declaration, and multiple root nodes are allowed.
-
- HTML documents should be parsed with Loofah.document. The result will
- have a DOCTYPE declaration, along with +html+, +head+ and +body+ tags.
-
- XML documents should be parsed with Loofah.xml_document. The result
- will have a DOCTYPE declaration and a single root node.
-
- === Loofah::HTML::Document and Loofah::HTML::DocumentFragment
-
- These classes are subclasses of Nokogiri::HTML::Document and
- Nokogiri::HTML::DocumentFragment, so you get all the markup
- fixer-uppery and API goodness of Nokogiri.
-
- The module methods Loofah.document and Loofah.fragment will parse an
- HTML document and an HTML fragment, respectively.
-
- Loofah.document(unsafe_html).is_a?(Nokogiri::HTML::Document) # => true
- Loofah.fragment(unsafe_html).is_a?(Nokogiri::HTML::DocumentFragment) # => true
-
- Loofah injects a +scrub!+ method, which takes either a symbol (for
- built-in scrubbers) or a Loofah::Scrubber object (for custom
- scrubbers), and modifies the document in-place.
-
- Loofah overrides +to_s+ to return HTML:
-
- unsafe_html = "ohai! <div>div is safe</div> <script>but script is not</script>"
-
- doc = Loofah.fragment(unsafe_html).scrub!(:strip)
- doc.to_s # => "ohai! <div>div is safe</div> "
-
- and +text+ to return plain text:
-
- doc.text # => "ohai! div is safe "
-
- Also, +to_text+ is available, which does the right thing with
- whitespace around block-level elements.
-
- doc = Loofah.fragment("<h1>Title</h1><div>Content</div>")
- doc.text # => "TitleContent" # probably not what you want
- doc.to_text # => "\nTitle\n\nContent\n" # better
-
- === Loofah::XML::Document and Loofah::XML::DocumentFragment
-
- These classes are subclasses of Nokogiri::XML::Document and
- Nokogiri::XML::DocumentFragment, so you get all the markup
- fixer-uppery and API goodness of Nokogiri.
-
- The module methods Loofah.xml_document and Loofah.xml_fragment will
- parse an XML document and an XML fragment, respectively.
-
- Loofah.xml_document(bad_xml).is_a?(Nokogiri::XML::Document) # => true
- Loofah.xml_fragment(bad_xml).is_a?(Nokogiri::XML::DocumentFragment) # => true
-
- === Nodes and NodeSets
-
- Nokogiri::XML::Node and Nokogiri::XML::NodeSet also get a +scrub!+
- method, which makes it easy to scrub subtrees.
-
- The following code will apply the +employee_scrubber+ only to the
- +employee+ nodes (and their subtrees) in the document:
-
- Loofah.xml_document(bad_xml).xpath("//employee").scrub!(employee_scrubber)
-
- And this code will only scrub the first +employee+ node and its subtree:
-
- Loofah.xml_document(bad_xml).at_xpath("//employee").scrub!(employee_scrubber)
-
- === Loofah::Scrubber
-
- A Scrubber wraps up a block (or method) that is run on a document node:
-
- # change all <span> tags to <div> tags
- span2div = Loofah::Scrubber.new do |node|
- node.name = "div" if node.name == "span"
- end
-
- This can then be run on a document:
-
- Loofah.fragment("<span>foo</span><p>bar</p>").scrub!(span2div).to_s
- # => "<div>foo</div><p>bar</p>"
-
- Scrubbers can be run on a document in either a top-down traversal (the
- default) or bottom-up. Top-down scrubbers can optionally return
- Scrubber::STOP to terminate the traversal of a subtree. Read below and
- in the Loofah::Scrubber class for more detailed usage.
-
- Here's an XML example:
-
- # remove all <employee> tags that have a "deceased" attribute set to true
- bring_out_your_dead = Loofah::Scrubber.new do |node|
- if node.name == "employee" and node["deceased"] == "true"
- node.remove
- Loofah::Scrubber::STOP # don't bother with the rest of the subtree
- end
- end
- Loofah.xml_document(File.read('plague.xml')).scrub!(bring_out_your_dead)
-
- === Built-In HTML Scrubbers
-
- Loofah comes with a set of sanitizing scrubbers that use HTML5lib's
- whitelist algorithm:
-
- doc.scrub!(:strip) # replaces unknown/unsafe tags with their inner text
- doc.scrub!(:prune) # removes unknown/unsafe tags and their children
- doc.scrub!(:escape) # escapes unknown/unsafe tags, like this: &lt;script&gt;
- doc.scrub!(:whitewash) # removes unknown/unsafe/namespaced tags and their children,
- # and strips all node attributes
-
- Loofah also comes with some common transformation tasks:
-
- doc.scrub!(:nofollow) # adds rel="nofollow" attribute to links
- doc.scrub!(:unprintable) # removes unprintable characters from text nodes
-
- See Loofah::Scrubbers for more details and example usage.
-
- === Chaining Scrubbers
-
- You can chain scrubbers:
-
- Loofah.fragment("<span>hello</span> <script>alert('OHAI')</script>") \
- .scrub!(:prune) \
- .scrub!(span2div).to_s
- # => "<div>hello</div> "
-
- === Shorthand
-
- The class methods Loofah.scrub_fragment and Loofah.scrub_document are
- shorthand.
-
- Loofah.scrub_fragment(unsafe_html, :prune)
- Loofah.scrub_document(unsafe_html, :prune)
- Loofah.scrub_xml_fragment(bad_xml, custom_scrubber)
- Loofah.scrub_xml_document(bad_xml, custom_scrubber)
-
- are the same thing as (and arguably semantically clearer than):
-
- Loofah.fragment(unsafe_html).scrub!(:prune)
- Loofah.document(unsafe_html).scrub!(:prune)
- Loofah.xml_fragment(bad_xml).scrub!(custom_scrubber)
- Loofah.xml_document(bad_xml).scrub!(custom_scrubber)
-
- === View Helpers
-
- Loofah has two "view helpers": Loofah::Helpers.sanitize and
- Loofah::Helpers.strip_tags, both of which are drop-in replacements for
- the Rails ActionView helpers of the same name.
- These are no longer required automatically. You must require `loofah/helpers`.
-
- == Requirements
-
- * Nokogiri >= 1.4.4
-
- == Installation
-
- Unsurprisingly:
-
- * gem install loofah
-
- == Support
-
- The bug tracker is available here:
-
- * https://github.com/flavorjones/loofah/issues
-
- And the mailing list is on librelist:
-
- * loofah@librelist.com / http://librelist.com
-
- And the IRC channel is \#loofah on freenode.
-
- == Related Links
-
- * Nokogiri: http://nokogiri.org
- * libxml2: http://xmlsoft.org
- * html5lib: https://code.google.com/p/html5lib
-
- == Authors
-
- * {Mike Dalessio}[http://mike.daless.io] (@flavorjones[https://twitter.com/flavorjones])
- * Bryan Helmkamp
-
- Featuring code contributed by:
-
- * Aaron Patterson
- * John Barnette
- * Josh Owens
- * Paul Dix
- * Luke Melia
-
- And a big shout-out to Corey Innis for the name, and feedback on the API.
-
- == Thank You
-
- The following people have generously donated via the Pledgie[http://pledgie.com] badge on the {Loofah github page}[https://github.com/flavorjones/loofah]:
-
- * Bill Harding
-
- == Historical Note
-
- This library was formerly known as Dryopteris, which was a very bad
- name that nobody could spell properly.
-
- == License
-
- The MIT License
-
- Copyright (c) 2009 -- 2014 by Mike Dalessio, Bryan Helmkamp
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in
- all copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
- THE SOFTWARE.