sanitize 1.2.0.dev.20091104 → 1.2.0
- data/HISTORY +7 -5
- data/LICENSE +1 -1
- data/README.rdoc +55 -5
- data/lib/sanitize.rb +4 -5
- data/lib/sanitize/config.rb +3 -1
- data/lib/sanitize/config/basic.rb +1 -1
- data/lib/sanitize/config/relaxed.rb +1 -3
- data/lib/sanitize/config/restricted.rb +1 -1
- data/lib/sanitize/version.rb +1 -1
- metadata +5 -5
data/HISTORY
CHANGED
@@ -1,13 +1,15 @@
 Sanitize History
 ================================================================================
 
-Version 1.2.0
+Version 1.2.0 (2010-01-17)
+* Requires Nokogiri ~> 1.4.1.
 * Added support for transformers, which allow you to filter and alter nodes
 using your own custom logic, on top of (or instead of) Sanitize's core
-filter. See the README for details.
-*
-
-
+filter. See the README for details and examples.
+* Added Sanitize.clean_node!, which sanitizes a Nokogiri::XML::Node and all
+its children.
+* Added elements <h1> through <h6> to the Relaxed whitelist. [Suggested by
+David Reese]
 
 Version 1.1.0 (2009-10-11)
 * Migrated from Hpricot to Nokogiri. Requires libxml2 >= 2.7.2 [Adam Hooper]
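The `~> 1.4.1` requirement in the changelog is RubyGems' pessimistic version constraint. A minimal sketch of what it accepts, using the standard `Gem::Requirement` and `Gem::Version` classes; the candidate version numbers are hypothetical and chosen only to show the boundaries:

```ruby
require 'rubygems'

# Constraint string copied from the changelog entry.
requirement = Gem::Requirement.new('~> 1.4.1')

# Hypothetical candidate versions (not taken from the diff).
candidates = %w[1.4.0 1.4.1 1.4.9 1.5.0]

# "~> 1.4.1" means ">= 1.4.1" and "< 1.5".
accepted = candidates.select do |version|
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts accepted.inspect # prints ["1.4.1", "1.4.9"]
```

In other words, patch releases in the 1.4 series from 1.4.1 onward satisfy the requirement, while 1.4.0 and 1.5.0 do not.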
data/LICENSE
CHANGED
data/README.rdoc
CHANGED
@@ -15,14 +15,14 @@ or maliciously-formed HTML. When in doubt, Sanitize always errs on the side of
 caution.
 
 *Author*:: Ryan Grove (mailto:ryan@wonko.com)
-*Version*:: 1.2.0
-*Copyright*:: Copyright (c)
+*Version*:: 1.2.0 (2010-01-17)
+*Copyright*:: Copyright (c) 2010 Ryan Grove. All rights reserved.
 *License*:: MIT License (http://opensource.org/licenses/mit-license.php)
 *Website*:: http://github.com/rgrove/sanitize
 
 == Requires
 
-* Nokogiri
+* Nokogiri ~> 1.4.1
 * libxml2 >= 2.7.2
 
 == Installation
@@ -33,7 +33,7 @@ Latest stable release:
 
 Latest development version:
 
-gem install sanitize
+gem install sanitize --prerelease
 
 == Usage
 
@@ -213,6 +213,56 @@ way. A returned Hash may contain the following items, all of which are optional:
 Array of specific Nokogiri::XML::Node objects to whitelist, anywhere in the
 document, regardless of the current Sanitize config.
 
+==== Example: Transformer to whitelist YouTube video embeds
+
+The following example demonstrates how to create a Sanitize transformer that
+will safely whitelist valid YouTube video embeds without having to blindly allow
+other kinds of embedded content, which would be the case if you tried to do this
+by just whitelisting all <code><object></code>, <code><embed></code>, and
+<code><param></code> elements:
+
+  lambda do |env|
+    node = env[:node]
+    node_name = node.name.to_s.downcase
+    parent = node.parent
+
+    # Since the transformer receives the deepest nodes first, we look for a
+    # <param> element or an <embed> element whose parent is an <object>.
+    return nil unless (node_name == 'param' || node_name == 'embed') &&
+        parent.name.to_s.downcase == 'object'
+
+    if node_name == 'param'
+      # Quick XPath search to find the <param> node that contains the video URL.
+      return nil unless movie_node = parent.search('param[@name="movie"]')[0]
+      url = movie_node['value']
+    else
+      # Since this is an <embed>, the video URL is in the "src" attribute. No
+      # extra work needed.
+      url = node['src']
+    end
+
+    # Verify that the video URL is actually a valid YouTube video URL.
+    return nil unless url =~ /^http:\/\/(?:www\.)?youtube\.com\/v\//
+
+    # We're now certain that this is a YouTube embed, but we still need to run
+    # it through a special Sanitize step to ensure that no unwanted elements or
+    # attributes that don't belong in a YouTube embed can sneak in.
+    Sanitize.clean_node!(parent, {
+      :elements => ['embed', 'object', 'param'],
+      :attributes => {
+        'embed' => ['allowfullscreen', 'allowscriptaccess', 'height', 'src', 'type', 'width'],
+        'object' => ['height', 'width'],
+        'param' => ['name', 'value']
+      }
+    })
+
+    # Now that we're sure that this is a valid YouTube embed and that there are
+    # no unwanted elements or attributes hidden inside it, we can tell Sanitize
+    # to whitelist the current node (<param> or <embed>) and its parent
+    # (<object>).
+    {:whitelist_nodes => [node, parent]}
+  end
+
 == Contributors
 
 The following lovely people have contributed to Sanitize in the form of patches
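One detail of the URL check in the README example is worth illustrating: in Ruby, `^` anchors at the start of any line, whereas `\A` anchors at the start of the whole string. The sketch below compares the pattern from the example with a `\A` variant. The `\A` variant and both sample values are assumptions made for illustration and are not part of this release; whether a multi-line attribute value can actually reach the transformer is not something this diff settles.

```ruby
# Pattern as written in the README example: ^ matches at the start of any line.
LINE_ANCHORED = /^http:\/\/(?:www\.)?youtube\.com\/v\//

# Stricter variant (an assumption, not in the release): \A matches only at the
# start of the whole string.
STRING_ANCHORED = /\Ahttp:\/\/(?:www\.)?youtube\.com\/v\//

# Hypothetical sample values.
plain_url  = 'http://www.youtube.com/v/abc123'
multi_line = "javascript:alert(1)\nhttp://youtube.com/v/abc123"

line_results   = [plain_url, multi_line].map { |url| !!(url =~ LINE_ANCHORED) }
string_results = [plain_url, multi_line].map { |url| !!(url =~ STRING_ANCHORED) }

puts line_results.inspect   # prints [true, true]
puts string_results.inspect # prints [true, false]
```

The line-anchored pattern accepts the second value because its second line starts with a YouTube URL; the string-anchored variant rejects it.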
@@ -229,7 +279,7 @@ or ideas that later became code:
 
 == License
 
-Copyright (c)
+Copyright (c) 2010 Ryan Grove <ryan@wonko.com>
 
 Permission is hereby granted, free of charge, to any person obtaining a copy of
 this software and associated documentation files (the 'Software'), to deal in
data/lib/sanitize.rb
CHANGED
@@ -1,6 +1,6 @@
 # encoding: utf-8
 #--
-# Copyright (c)
+# Copyright (c) 2010 Ryan Grove <ryan@wonko.com>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the 'Software'), to deal
@@ -105,9 +105,8 @@ class Sanitize
 
 result = output_method.call(output_method_params)
 
-#
-#
-# now we have to hack around it to prevent errors.
+# Ensure that the result is always a UTF-8 string in Ruby 1.9, no matter
+# what. Nokogiri seems to return empty strings as ASCII for some reason.
 result.force_encoding('utf-8') if RUBY_VERSION >= '1.9'
 
 return result == html ? nil : html[0, html.length] = result
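The last two lines of this hunk combine an encoding fix with an in-place string replacement. The sketch below is a simplified stand-in, not the library's method: `replace_in_place!` and the sample strings are hypothetical names and values introduced only to show the behavior in isolation.

```ruby
# Hypothetical helper mirroring the tail of the method in the hunk above.
# `html` is the caller's string; `result` stands in for the serializer output.
def replace_in_place!(html, result)
  result = result.dup
  # Same guard as the diff: relabel the bytes as UTF-8 on Ruby 1.9 and later.
  result.force_encoding('utf-8') if RUBY_VERSION >= '1.9'

  # nil signals "nothing changed"; otherwise overwrite the caller's string and
  # return the new content.
  result == html ? nil : (html[0, html.length] = result)
end

# Stand-in for an empty serializer result that is labeled as ASCII.
ascii_empty = ''.dup.force_encoding('US-ASCII')
# Hypothetical input assumed to sanitize down to nothing.
input = '<script>x</script>'.dup

returned  = replace_in_place!(input, ascii_empty)
unchanged = replace_in_place!('plain'.dup, 'plain')
```

After this runs, `input` has been emptied in place, `returned` is an empty string labeled UTF-8, and `unchanged` is `nil` because the content did not differ.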
@@ -209,7 +208,7 @@ class Sanitize
 transformer_node
 elsif transform.is_a?(Hash)
 if transform[:whitelist_nodes].is_a?(Array)
-@whitelist_nodes += transform[:whitelist_nodes]
+@whitelist_nodes += transform[:whitelist_nodes]
 @whitelist_nodes.uniq!
 end
 
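The hunk above shows how a Hash returned by a transformer feeds the whitelist. As a rough illustration of that flow in plain Ruby, the sketch below uses a hypothetical `Node` struct and `collect_whitelist` helper in place of Nokogiri nodes and the library's internals; it is not the library's implementation.

```ruby
# Hypothetical stand-in for a parsed node; the library passes Nokogiri nodes.
Node = Struct.new(:name)

# Runs every transformer against every node and gathers whitelisted nodes,
# following the Hash / :whitelist_nodes checks visible in the hunk.
def collect_whitelist(nodes, transformers)
  whitelist = []
  nodes.each do |node|
    transformers.each do |transformer|
      transform = transformer.call(:node => node)
      next unless transform.is_a?(Hash)

      if transform[:whitelist_nodes].is_a?(Array)
        whitelist += transform[:whitelist_nodes]
        whitelist.uniq!
      end
    end
  end
  whitelist
end

# Hypothetical transformer: whitelist <embed> nodes, ignore everything else.
allow_embeds = lambda do |env|
  return nil unless env[:node].name == 'embed'
  { :whitelist_nodes => [env[:node]] }
end

embed  = Node.new('embed')
script = Node.new('script')

# The same transformer is listed twice to show that uniq! drops duplicates.
whitelisted = collect_whitelist([embed, script], [allow_embeds, allow_embeds])
```

Here `whitelisted` ends up containing only the `embed` node, once.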
data/lib/sanitize/config.rb
CHANGED
@@ -1,5 +1,5 @@
 #--
-# Copyright (c)
+# Copyright (c) 2010 Ryan Grove <ryan@wonko.com>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the 'Software'), to deal
@@ -49,6 +49,8 @@ class Sanitize
 # to allow relative URLs sans protocol.
 :protocols => {},
 
+# Transformers allow you to filter or alter nodes using custom logic. See
+# README.rdoc for details and examples.
 :transformers => []
 }
 end
data/lib/sanitize/config/basic.rb
CHANGED
@@ -1,5 +1,5 @@
 #--
-# Copyright (c)
+# Copyright (c) 2010 Ryan Grove <ryan@wonko.com>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the 'Software'), to deal
data/lib/sanitize/config/relaxed.rb
CHANGED
@@ -1,5 +1,5 @@
 #--
-# Copyright (c)
+# Copyright (c) 2010 Ryan Grove <ryan@wonko.com>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the 'Software'), to deal
@@ -20,8 +20,6 @@
 # SOFTWARE.
 #++
 
-
-
 class Sanitize
 module Config
 RELAXED = {
data/lib/sanitize/config/restricted.rb
CHANGED
@@ -1,5 +1,5 @@
 #--
-# Copyright (c)
+# Copyright (c) 2010 Ryan Grove <ryan@wonko.com>
 #
 # Permission is hereby granted, free of charge, to any person obtaining a copy
 # of this software and associated documentation files (the 'Software'), to deal
data/lib/sanitize/version.rb
CHANGED
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: sanitize
 version: !ruby/object:Gem::Version
-version: 1.2.0
+version: 1.2.0
 platform: ruby
 authors:
 - Ryan Grove
@@ -9,7 +9,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date:
+date: 2010-01-17 00:00:00 -08:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
@@ -20,7 +20,7 @@ dependencies:
 requirements:
 - - ~>
 - !ruby/object:Gem::Version
-version: 1.4.
+version: 1.4.1
 version:
 - !ruby/object:Gem::Dependency
 name: bacon
@@ -77,9 +77,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
 version:
 required_rubygems_version: !ruby/object:Gem::Requirement
 requirements:
-- - "
+- - ">="
 - !ruby/object:Gem::Version
-version:
+version: "0"
 version:
 requirements: []
 