sanitize 2.0.3 → 2.0.4


checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 80fc1e1df6ce565080a348c635645027839d8fbf
+   data.tar.gz: ae3afc1372f0db565647ab7f46310531a0f8b940
+ SHA512:
+   metadata.gz: fc144309e028fcc172f3e4066dd34789c1a1b7f8abd341009052daa12672710a1e6ca52befd7f0cf39308c56fca70317e11c459f4b086d10c071d744477da70d
+   data.tar.gz: 962f5668cf47a557cc17d3f6eca26bcee0df23f93f76d11014fb53f5650560a5a8bbc171b63ddb00d926fa847154958f27d93c9b2e0a46cd4cc585f27cf4a309
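The new checksums.yaml lists SHA1 and SHA512 digests for the two archives inside the gem. A minimal sketch of how such a file could be checked against archive bytes follows; the helper name `checksums_match?` and the in-memory `files` hash are assumptions made for illustration, not part of RubyGems.

```ruby
require 'digest/sha1'
require 'digest/sha2'
require 'yaml'

# Map the algorithm names used in checksums.yaml to Ruby digest classes.
DIGEST_CLASSES = {
  'SHA1'   => Digest::SHA1,
  'SHA512' => Digest::SHA512
}.freeze

# Hypothetical helper: returns true when every digest listed in a
# checksums.yaml-style document matches the bytes in `files`
# (a Hash of file name => file contents).
def checksums_match?(checksums_yaml, files)
  checksums = YAML.safe_load(checksums_yaml)
  checksums.all? do |algorithm, entries|
    digest_class = DIGEST_CLASSES.fetch(algorithm)
    entries.all? do |name, expected|
      files.key?(name) && digest_class.hexdigest(files[name]) == expected.to_s
    end
  end
end
```

In practice the `files` hash would hold the contents of `metadata.gz` and `data.tar.gz` read in binary mode from the unpacked gem.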
data/HISTORY.md CHANGED
@@ -1,10 +1,21 @@
  Sanitize History
  ================================================================================

+ Version 2.0.4 (2013-06-12)
+ --------------------------
+
+ * Added `Sanitize.clean_document`, which sanitizes a full HTML document rather
+   than just a fragment. [Ben Anderson]
+
+ * Nokogiri dependency bumped to 1.6.x.
+
+ * Dropped support for Ruby versions older than 1.9.2.
+
+
  Version 2.0.3 (2011-07-01)
  --------------------------

- * Loosened the Nokogiri dependency to allow Nokogiri 1.5.x.
+ * Loosened the Nokogiri dependency to allow Nokogiri 1.5.x.


  Version 2.0.2 (2011-05-21)
@@ -90,7 +101,7 @@ Version 1.1.0 (2009-10-11)
  * Added an `:output` config setting to allow the output format to be
    specified. Supported formats are `:xhtml` (the default) and `:html` (which
    outputs HTML4).
- * Changed protocol regex to ensure Sanitize doesn't kill URLs with colons in
+ * Changed protocol regex to ensure Sanitize doesn't kill URLs with colons in
    path segments. [Peter Cooper]


data/LICENSE CHANGED
@@ -1,4 +1,4 @@
- Copyright (c) 2011 Ryan Grove <ryan@wonko.com>
+ Copyright (c) 2013 Ryan Grove <ryan@wonko.com>

  Permission is hereby granted, free of charge, to any person obtaining a copy of
  this software and associated documentation files (the 'Software'), to deal in
data/README.rdoc CHANGED
@@ -14,15 +14,12 @@ of fragile regular expressions, Sanitize has no trouble dealing with malformed
  or maliciously-formed HTML, and will always output valid HTML or XHTML.

  *Author*:: Ryan Grove (mailto:ryan@wonko.com)
- *Version*:: 2.0.3 (2011-07-01)
- *Copyright*:: Copyright (c) 2011 Ryan Grove. All rights reserved.
+ *Version*:: 2.0.4 (2013-06-12)
+ *Copyright*:: Copyright (c) 2013 Ryan Grove. All rights reserved.
  *License*:: MIT License (http://opensource.org/licenses/mit-license.php)
  *Website*:: http://github.com/rgrove/sanitize

- == Requires
-
- * Nokogiri >= 1.4.4
- * libxml2 >= 2.7.2
+ {<img src="https://secure.travis-ci.org/rgrove/sanitize.png?branch=master" alt="Build Status" />}[http://travis-ci.org/rgrove/sanitize]

  == Installation

@@ -47,6 +44,12 @@ behind.

    Sanitize.clean(html) # => 'foo'

+   ...
+
+   # or sanitize an entire HTML document (example assumes _html_ is whitelisted)
+   html = '<!DOCTYPE html><html><b><a href="http://foo.com/">foo</a></b><img src="http://foo.com/bar.jpg"></html>'
+   Sanitize.clean_document(html) # => '<!DOCTYPE html>\n<html>foo</html>\n'
+
  == Configuration

  In addition to the ultra-safe default settings, Sanitize comes with three other
@@ -289,25 +292,25 @@ your own hands.
  The following example demonstrates how to create a depth-first Sanitize
  transformer that will safely whitelist valid YouTube video embeds without having
  to blindly allow other kinds of embedded content, which would be the case if you
- tried to do this by just whitelisting all <iframe> elements:
+ tried to do this by just whitelisting all <code><iframe></code> elements:

    lambda do |env|
      node = env[:node]
      node_name = env[:node_name]
-
+
      # Don't continue if this node is already whitelisted or is not an element.
      return if env[:is_whitelisted] || !node.element?

      # Don't continue unless the node is an iframe.
      return unless node_name == 'iframe'
-
+
      # Verify that the video URL is actually a valid YouTube video URL.
-     return unless node['src'] =~ /\Ahttp:\/\/(?:www\.)?youtube\.com\//
-
+     return unless node['src'] =~ /\Ahttps?:\/\/(?:www\.)?youtube(?:-nocookie)?\.com\//
+
      # We're now certain that this is a YouTube embed, but we still need to run
      # it through a special Sanitize step to ensure that no unwanted elements or
      # attributes that don't belong in a YouTube embed can sneak in.
-     Sanitize.clean_node!(node.parent, {
+     Sanitize.clean_node!(node, {
        :elements => %w[iframe],

        :attributes => {
@@ -327,22 +330,24 @@ Sanitize was created and is maintained by Ryan Grove (ryan@wonko.com).

  The following lovely people have also contributed to Sanitize:

- * Wilson Bilkovich (wilson@supremetyrant.com)
- * Peter Cooper (git@peterc.org)
- * Gabe da Silveira (gabe@websaviour.com)
- * Nicholas Evans (owlmanatt@gmail.com)
- * Adam Hooper (adam@adamhooper.com)
- * Mutwin Kraus (mutle@blogage.de)
- * Eaden McKee (eadz@eadz.co.nz)
- * Dev Purkayastha (dev.purkayastha@gmail.com)
- * David Reese (work@whatcould.com)
- * Ardie Saeidi (ardalan.saeidi@gmail.com)
- * Rafael Souza (me@rafaelss.com)
- * Ben Wanicur (bwanicur@verticalresponse.com)
+ * Ben Anderson
+ * Wilson Bilkovich
+ * Peter Cooper
+ * Gabe da Silveira
+ * Nicholas Evans
+ * Nils Gemeinhardt
+ * Adam Hooper
+ * Mutwin Kraus
+ * Eaden McKee
+ * Dev Purkayastha
+ * David Reese
+ * Ardie Saeidi
+ * Rafael Souza
+ * Ben Wanicur

  == License

- Copyright (c) 2011 Ryan Grove (ryan@wonko.com)
+ Copyright (c) 2013 Ryan Grove (ryan@wonko.com)

  Permission is hereby granted, free of charge, to any person obtaining a copy of
  this software and associated documentation files (the 'Software'), to deal in
  this software and associated documentation files (the 'Software'), to deal in
@@ -1,16 +1,16 @@
  #--
- # Copyright (c) 2011 Ryan Grove <ryan@wonko.com>
- #
+ # Copyright (c) 2013 Ryan Grove <ryan@wonko.com>
+ #
  # Permission is hereby granted, free of charge, to any person obtaining a copy
  # of this software and associated documentation files (the 'Software'), to deal
  # in the Software without restriction, including without limitation the rights
  # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  # copies of the Software, and to permit persons to whom the Software is
  # furnished to do so, subject to the following conditions:
- #
+ #
  # The above copyright notice and this permission notice shall be included in all
  # copies or substantial portions of the Software.
- #
+ #
  # THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
@@ -1,16 +1,16 @@
  #--
- # Copyright (c) 2011 Ryan Grove <ryan@wonko.com>
- #
+ # Copyright (c) 2013 Ryan Grove <ryan@wonko.com>
+ #
  # Permission is hereby granted, free of charge, to any person obtaining a copy
  # of this software and associated documentation files (the 'Software'), to deal
  # in the Software without restriction, including without limitation the rights
  # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  # copies of the Software, and to permit persons to whom the Software is
  # furnished to do so, subject to the following conditions:
- #
+ #
  # The above copyright notice and this permission notice shall be included in all
  # copies or substantial portions of the Software.
- #
+ #
  # THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
@@ -1,16 +1,16 @@
  #--
- # Copyright (c) 2011 Ryan Grove <ryan@wonko.com>
- #
+ # Copyright (c) 2013 Ryan Grove <ryan@wonko.com>
+ #
  # Permission is hereby granted, free of charge, to any person obtaining a copy
  # of this software and associated documentation files (the 'Software'), to deal
  # in the Software without restriction, including without limitation the rights
  # to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  # copies of the Software, and to permit persons to whom the Software is
  # furnished to do so, subject to the following conditions:
- #
+ #
  # The above copyright notice and this permission notice shall be included in all
  # copies or substantial portions of the Software.
- #
+ #
  # THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
  # IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
  # FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
@@ -1,5 +1,5 @@
  #--
- # Copyright (c) 2011 Ryan Grove <ryan@wonko.com>
+ # Copyright (c) 2013 Ryan Grove <ryan@wonko.com>
  #
  # Permission is hereby granted, free of charge, to any person obtaining a copy
  # of this software and associated documentation files (the 'Software'), to deal
@@ -14,7 +14,7 @@ class Sanitize; module Transformers
        @whitespace_elements = Set.new(config[:whitespace_elements])

        if config[:remove_contents].is_a?(Array)
-         @remove_element_contents.merge(config[:remove_contents])
+         @remove_element_contents.merge(config[:remove_contents].map(&:to_s))
        else
          @remove_all_contents = !!config[:remove_contents]
        end
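The hunk above converts each `:remove_contents` entry to a string before merging it into the set. Presumably this is because the set is compared against string element names, so a symbol entry such as `:style` would otherwise never match. A small sketch of that difference, using only the standard `Set` class and a hypothetical helper name:

```ruby
require 'set'

# Hypothetical helper: builds a set of element names from string defaults plus
# caller-supplied extras, optionally normalizing the extras to strings first.
def element_name_set(defaults, extras, normalize)
  names = Set.new(defaults)
  names.merge(normalize ? extras.map(&:to_s) : extras)
end

as_given   = element_name_set(%w[script], [:style], false)
normalized = element_name_set(%w[script], [:style], true)

puts as_given.include?('style')   # false: the set holds the symbol :style
puts normalized.include?('style') # true: the symbol was converted to "style"
```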
@@ -1,3 +1,3 @@
  class Sanitize
-   VERSION = '2.0.3'
+   VERSION = '2.0.4'
  end
data/lib/sanitize.rb CHANGED
@@ -1,6 +1,6 @@
  # encoding: utf-8
  #--
- # Copyright (c) 2011 Ryan Grove <ryan@wonko.com>
+ # Copyright (c) 2013 Ryan Grove <ryan@wonko.com>
  #
  # Permission is hereby granted, free of charge, to any person obtaining a copy
  # of this software and associated documentation files (the 'Software'), to deal
@@ -59,6 +59,19 @@ class Sanitize
      Sanitize.new(config).clean!(html)
    end

+   # Performs a Sanitize#clean using a full-document HTML parser instead of
+   # the default fragment parser. This will add a DOCTYPE and html tag
+   # unless they are already present
+   def self.clean_document(html, config = {})
+     Sanitize.new(config).clean_document(html)
+   end
+
+   # Performs Sanitize#clean_document in place, returning _html_, or +nil+ if no
+   # changes were made.
+   def self.clean_document!(html, config = {})
+     Sanitize.new(config).clean_document!(html)
+   end
+
    # Sanitizes the specified Nokogiri::XML::Node and all its children.
    def self.clean_node!(node, config = {})
      Sanitize.new(config).clean_node!(node)
@@ -96,8 +109,8 @@ class Sanitize

    # Performs clean in place, returning _html_, or +nil+ if no changes were
    # made.
-   def clean!(html)
-     fragment = Nokogiri::HTML::DocumentFragment.parse(html)
+   def clean!(html, parser = Nokogiri::HTML::DocumentFragment)
+     fragment = parser.parse(html)
      clean_node!(fragment)

      output_method_params = {:encoding => @config[:output_encoding], :indent => 0}
@@ -116,6 +129,22 @@ class Sanitize
      return result == html ? nil : html[0, html.length] = result
    end

+   def clean_document(html)
+     unless html.nil?
+       clean_document!(html.dup) || html
+     end
+   end
+
+   def clean_document!(html)
+     if !@config[:elements].include?('html') && !@config[:remove_contents]
+       raise 'You must have the HTML element whitelisted to call #clean_document unless remove_contents is set to true'
+       # otherwise Nokogiri will raise for having multiple root nodes when
+       # it moves its children to the root document context
+     end
+
+     clean!(html, Nokogiri::HTML::Document)
+   end
+
    # Sanitizes the specified Nokogiri::XML::Node and all its children.
    def clean_node!(node)
      raise ArgumentError unless node.is_a?(Nokogiri::XML::Node)
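The new `clean_document` and `clean_document!` methods follow the same bang and non-bang pairing as `clean` and `clean!`: the bang method rewrites the string in place and returns `nil` when nothing changed, and the non-bang method works on a copy. The sketch below reproduces only that control flow; `strip_bold!` and `strip_bold` are hypothetical stand-ins that remove `<b>` tags with a regular expression rather than calling Sanitize or Nokogiri.

```ruby
# Hypothetical in-place cleaner mirroring the return idiom in the diff:
# nil when the result equals the input, otherwise the receiver's contents are
# overwritten and the new value is returned.
def strip_bold!(text)
  result = text.gsub(/<\/?b>/, '')
  return nil if result == text
  text[0, text.length] = result
end

# Hypothetical non-mutating counterpart, shaped like clean_document:
# operate on a duplicate and fall back to the original when nothing changed.
def strip_bold(text)
  return nil if text.nil?
  strip_bold!(text.dup) || text
end

original = '<b>foo</b>'.dup
cleaned  = strip_bold(original)
puts cleaned   # foo
puts original  # <b>foo</b> (left untouched by the non-bang method)
```

Returning `nil` from the bang method lets callers detect "no changes" cheaply, which is why the non-bang method needs the `|| text` fallback.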
metadata CHANGED
@@ -1,62 +1,63 @@
- --- !ruby/object:Gem::Specification
+ --- !ruby/object:Gem::Specification
  name: sanitize
- version: !ruby/object:Gem::Version
-   prerelease:
-   version: 2.0.3
+ version: !ruby/object:Gem::Version
+   version: 2.0.4
  platform: ruby
- authors:
+ authors:
  - Ryan Grove
  autorequire:
  bindir: bin
  cert_chain: []
-
- date: 2011-07-02 00:00:00 Z
- dependencies:
- - !ruby/object:Gem::Dependency
+ date: 2013-06-12 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
    name: nokogiri
-   prerelease: false
-   requirement: &id001 !ruby/object:Gem::Requirement
-     none: false
-     requirements:
-     - - ">="
-       - !ruby/object:Gem::Version
-         version: 1.4.4
-     - - <
-       - !ruby/object:Gem::Version
-         version: "1.6"
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - ~>
+       - !ruby/object:Gem::Version
+         version: 1.6.0
    type: :runtime
-   version_requirements: *id001
- - !ruby/object:Gem::Dependency
-   name: minitest
    prerelease: false
-   requirement: &id002 !ruby/object:Gem::Requirement
-     none: false
-     requirements:
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
      - - ~>
-       - !ruby/object:Gem::Version
+       - !ruby/object:Gem::Version
+         version: 1.6.0
+ - !ruby/object:Gem::Dependency
+   name: minitest
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
          version: 2.0.0
    type: :development
-   version_requirements: *id002
- - !ruby/object:Gem::Dependency
-   name: rake
    prerelease: false
-   requirement: &id003 !ruby/object:Gem::Requirement
-     none: false
-     requirements:
-     - - ~>
-       - !ruby/object:Gem::Version
-         version: 0.8.0
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: 2.0.0
+ - !ruby/object:Gem::Dependency
+   name: rake
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '0.9'
    type: :development
-   version_requirements: *id003
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - '>='
+       - !ruby/object:Gem::Version
+         version: '0.9'
  description:
  email: ryan@wonko.com
  executables: []
-
  extensions: []
-
  extra_rdoc_files: []
-
- files:
+ files:
  - HISTORY.md
  - LICENSE
  - README.rdoc
@@ -71,30 +72,25 @@ files:
  - lib/sanitize.rb
  homepage: https://github.com/rgrove/sanitize/
  licenses: []
-
+ metadata: {}
  post_install_message:
  rdoc_options: []
-
- require_paths:
+ require_paths:
  - lib
- required_ruby_version: !ruby/object:Gem::Requirement
-   none: false
-   requirements:
-   - - ">="
-     - !ruby/object:Gem::Version
-       version: 1.8.7
- required_rubygems_version: !ruby/object:Gem::Requirement
-   none: false
-   requirements:
-   - - ">="
-     - !ruby/object:Gem::Version
-       version: "0"
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: 1.9.2
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - '>='
+     - !ruby/object:Gem::Version
+       version: 1.2.0
  requirements: []
-
- rubyforge_project: riposte
- rubygems_version: 1.8.5
+ rubyforge_project:
+ rubygems_version: 2.0.0
  signing_key:
- specification_version: 3
+ specification_version: 4
  summary: Whitelist-based HTML sanitizer.
  test_files: []
-
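The metadata diff narrows the Nokogiri requirement from `>= 1.4.4, < 1.6` to `~> 1.6.0` and raises the minimum Ruby version from 1.8.7 to 1.9.2. As a rough illustration of what those constraints accept, the sketch below evaluates them with RubyGems' own `Gem::Requirement` and `Gem::Version` classes; the helper name `satisfied?` and the sample version numbers are chosen here for illustration.

```ruby
require 'rubygems'

# Constraints copied from the old and new gem metadata.
OLD_NOKOGIRI = ['>= 1.4.4', '< 1.6'].freeze
NEW_NOKOGIRI = ['~> 1.6.0'].freeze
NEW_RUBY     = ['>= 1.9.2'].freeze

# Hypothetical helper: true when `version` satisfies every constraint given.
def satisfied?(constraints, version)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

puts satisfied?(OLD_NOKOGIRI, '1.5.11') # => true
puts satisfied?(NEW_NOKOGIRI, '1.5.11') # => false
puts satisfied?(NEW_NOKOGIRI, '1.6.8')  # => true
puts satisfied?(NEW_NOKOGIRI, '1.7.0')  # => false ("~> 1.6.0" means >= 1.6.0 and < 1.7)
puts satisfied?(NEW_RUBY, '1.8.7')      # => false
```

Because the old and new Nokogiri ranges do not overlap, an application pinned to Nokogiri 1.5.x cannot resolve sanitize 2.0.4 without also upgrading Nokogiri.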