sanitize 1.2.0 → 1.2.1.dev.20100122

data/HISTORY CHANGED
@@ -1,6 +1,15 @@
  Sanitize History
  ================================================================================
 
+ Version 1.2.1 (git)
+ * Added a :remove_contents config setting. If set to true, Sanitize will
+ remove the contents of filtered nodes in addition to the nodes themselves.
+ * The environment hash passed into transformers now includes a :node_name item
+ containing the lowercase name of the current HTML node (e.g. "div").
+ * Returning anything other than a Hash or nil from a transformer will now
+ raise a meaningful Sanitize::Error exception rather than an unintended
+ NameError.
+
  Version 1.2.0 (2010-01-17)
  * Requires Nokogiri ~> 1.4.1.
  * Added support for transformers, which allow you to filter and alter nodes
data/README.rdoc CHANGED
@@ -15,7 +15,7 @@ or maliciously-formed HTML. When in doubt, Sanitize always errs on the side of
  caution.
 
  *Author*:: Ryan Grove (mailto:ryan@wonko.com)
- *Version*:: 1.2.0 (2010-01-17)
+ *Version*:: 1.2.1.dev (git)
  *Copyright*:: Copyright (c) 2010 Ryan Grove. All rights reserved.
  *License*:: MIT License (http://opensource.org/licenses/mit-license.php)
  *Website*:: http://github.com/rgrove/sanitize
@@ -33,7 +33,7 @@ Latest stable release:
 
  Latest development version:
 
- gem install sanitize --prerelease
+ gem install sanitize --pre
 
  == Usage
 
@@ -89,17 +89,17 @@ configuration:
  :attributes => {'a' => ['href', 'title'], 'span' => ['class']},
  :protocols => {'a' => {'href' => ['http', 'https', 'mailto']}})
 
- ==== :elements
+ ==== :add_attributes (Hash)
 
- Array of element names to allow. Specify all names in lowercase.
+ Attributes to add to specific elements. If the attribute already exists, it will
+ be replaced with the value specified here. Specify all element names and
+ attributes in lowercase.
 
- :elements => [
- 'a', 'b', 'blockquote', 'br', 'cite', 'code', 'dd', 'dl', 'dt', 'em',
- 'i', 'li', 'ol', 'p', 'pre', 'q', 'small', 'strike', 'strong', 'sub',
- 'sup', 'u', 'ul'
- ]
+ :add_attributes => {
+ 'a' => {'rel' => 'nofollow'}
+ }
 
- ==== :attributes
+ ==== :attributes (Hash)
 
  Attributes to allow for specific elements. Specify all element names and
  attributes in lowercase.
@@ -118,17 +118,28 @@ If you'd like to allow certain attributes on all elements, use the symbol
  'a' => ['href', 'title']
  }
 
- ==== :add_attributes
+ ==== :allow_comments (boolean)
 
- Attributes to add to specific elements. If the attribute already exists, it will
- be replaced with the value specified here. Specify all element names and
- attributes in lowercase.
+ Whether or not to allow HTML comments. Allowing comments is strongly
+ discouraged, since IE allows script execution within conditional comments. The
+ default value is <code>false</code>.
 
- :add_attributes => {
- 'a' => {'rel' => 'nofollow'}
- }
+ ==== :elements (Array)
+
+ Array of element names to allow. Specify all names in lowercase.
+
+ :elements => [
+ 'a', 'b', 'blockquote', 'br', 'cite', 'code', 'dd', 'dl', 'dt', 'em',
+ 'i', 'li', 'ol', 'p', 'pre', 'q', 'small', 'strike', 'strong', 'sub',
+ 'sup', 'u', 'ul'
+ ]
+
+ ==== :output (Symbol)
 
- ==== :protocols
+ Output format. Supported formats are <code>:html</code> and <code>:xhtml</code>,
+ defaulting to <code>:xhtml</code>.
+
+ ==== :protocols (Hash)
 
  URL protocols to allow in specific attributes. If an attribute is listed here
  and contains a protocol other than those specified (or if it contains no
@@ -146,6 +157,16 @@ include the symbol <code>:relative</code> in the protocol array:
  'a' => {'href' => ['http', 'https', :relative]}
  }
 
+ ==== :remove_contents (boolean)
+
+ If set to <code>true</code>, Sanitize will remove the contents of any filtered
+ nodes in addition to the nodes themselves. By default, Sanitize leaves the safe
+ parts of a node's contents behind when the node is removed.
+
+ ==== :transformers
+
+ See below.
+
  === Transformers
 
  Transformers allow you to filter and alter nodes using your own custom logic, on
@@ -170,6 +191,9 @@ Hash that contains the following items:
  [<code>:node</code>]
  A Nokogiri::XML::Node object representing an HTML element.
 
+ [<code>:node_name</code>]
+ The name of the current HTML node, always lowercase (e.g. "div" or "span").
+
  ==== Processing
 
  Each transformer has full access to the Nokogiri::XML::Node that's passed into
@@ -223,7 +247,7 @@ by just whitelisting all <code><object></code>, <code><embed></code>, and
 
  lambda do |env|
  node = env[:node]
- node_name = node.name.to_s.downcase
+ node_name = env[:node_name]
  parent = node.parent
 
  # Since the transformer receives the deepest nodes first, we look for a
data/lib/sanitize.rb CHANGED
@@ -144,7 +144,10 @@ class Sanitize
 
  # Delete any element that isn't in the whitelist.
  unless transform[:whitelist] || @config[:elements].include?(name)
- node.children.each { |n| node.add_previous_sibling(n) }
+ unless @config[:remove_contents]
+ node.children.each { |n| node.add_previous_sibling(n) }
+ end
+
  node.unlink
  return
  end
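
Note: the hunk above makes the re-parenting of a filtered node's children conditional on :remove_contents. The effect can be sketched without Nokogiri, as a rough illustration only. The `Node` struct and the `filter` and `render` helpers below are hypothetical stand-ins invented for this sketch; they are not part of Sanitize, and the real traversal order differs.

```ruby
# Toy stand-in for a parsed HTML node; not Nokogiri and not part of Sanitize.
# A text node stores its text in `name` and has `children` set to nil.
Node = Struct.new(:name, :children) do
  def text?
    children.nil?
  end
end

# Hypothetical helper: returns the list of nodes that replace `node`.
# Text nodes are always kept. A disallowed element is dropped; its filtered
# children are kept in its place unless remove_contents is true.
def filter(node, allowed, remove_contents)
  return [node] if node.text?

  kept = node.children.flat_map { |child| filter(child, allowed, remove_contents) }
  return [Node.new(node.name, kept)] if allowed.include?(node.name)

  remove_contents ? [] : kept
end

# Serialize the toy tree back to a string for inspection.
def render(nodes)
  nodes.map { |n| n.text? ? n.name : "<#{n.name}>#{render(n.children)}</#{n.name}>" }.join
end

tree = Node.new('p', [
  Node.new('hello ', nil),
  Node.new('script', [Node.new('alert(1)', nil)]),
  Node.new('b', [Node.new('world', nil)])
])

puts render(filter(tree, %w[p b], false)) # <p>hello alert(1)<b>world</b></p>
puts render(filter(tree, %w[p b], true))  # <p>hello <b>world</b></p>
```

With the setting off, the text inside the disallowed element survives; with it on, the element and everything inside it are removed together.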
@@ -200,8 +203,9 @@ class Sanitize
 
  @config[:transformers].inject(node) do |transformer_node, transformer|
  transform = transformer.call({
- :config => @config,
- :node => transformer_node
+ :config => @config,
+ :node => transformer_node,
+ :node_name => transformer_node.name.downcase
  })
 
  if transform.nil?
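
Note: with this hunk, a transformer can read the lowercase element name from env[:node_name] instead of downcasing node.name itself. A minimal sketch of such a transformer follows. `FakeNode` and `allow_images` are hypothetical names used only to make the example runnable without the gem; the env hash is built by hand in the same shape as the call above.

```ruby
# Stand-in for Nokogiri::XML::Node, used only to make this sketch runnable.
FakeNode = Struct.new(:name)

# Example transformer: whitelists <img> nodes and ignores everything else.
# It returns a Hash to whitelist the node, or nil to leave it alone.
allow_images = lambda do |env|
  return nil unless env[:node_name] == 'img'

  { :whitelist => true }
end

# Build the environment hash the same way the hunk above does.
node = FakeNode.new('IMG')
env  = { :config => {}, :node => node, :node_name => node.name.downcase }

result = allow_images.call(env)
puts result[:whitelist] # true
```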
@@ -224,4 +228,6 @@ class Sanitize
 
  return output
  end
+
+ class Error < StandardError; end
  end
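
Note: the HISTORY entry says a transformer that returns anything other than a Hash or nil now raises Sanitize::Error. The check itself is not shown in this diff, so the following is only a sketch of that contract under assumed names: `SanitizeSketch` and `check_transformer_result` are hypothetical and do not exist in the gem.

```ruby
# Local stand-in so the sketch runs without the gem installed.
module SanitizeSketch
  class Error < StandardError; end

  # Hypothetical helper mirroring the documented contract: a transformer
  # must return a Hash or nil; anything else raises Error.
  def self.check_transformer_result(result)
    return result if result.nil? || result.is_a?(Hash)

    raise Error, "transformer output must be a Hash or nil, got #{result.class}"
  end
end

begin
  SanitizeSketch.check_transformer_result('oops')
rescue SanitizeSketch::Error => e
  puts e.message # transformer output must be a Hash or nil, got String
end
```

Because the error class derives from StandardError, a plain `rescue` in calling code will catch it as well.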
@@ -49,6 +49,11 @@ class Sanitize
  # to allow relative URLs sans protocol.
  :protocols => {},
 
+ # If this is true, Sanitize will remove the contents of any filtered nodes
+ # in addition to the nodes themselves. By default, Sanitize leaves the
+ # safe parts of a node's contents behind when the node is removed.
+ :remove_contents => false,
+
  # Transformers allow you to filter or alter nodes using custom logic. See
  # README.rdoc for details and examples.
  :transformers => []
@@ -1,3 +1,3 @@
  class Sanitize
- VERSION = '1.2.0'
+ VERSION = '1.2.1.dev.20100122'
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: sanitize
  version: !ruby/object:Gem::Version
- version: 1.2.0
+ version: 1.2.1.dev.20100122
  platform: ruby
  authors:
  - Ryan Grove
@@ -9,7 +9,7 @@ autorequire:
  bindir: bin
  cert_chain: []
 
- date: 2010-01-17 00:00:00 -08:00
+ date: 2010-01-22 00:00:00 -08:00
  default_executable:
  dependencies:
  - !ruby/object:Gem::Dependency
@@ -77,9 +77,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
  version:
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - ">"
  - !ruby/object:Gem::Version
- version: "0"
+ version: 1.3.1
  version:
  requirements: []
 
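
Note: the metadata change from ">= 0" to "> 1.3.1" for required_rubygems_version appears alongside the new version string, which contains letters. That is consistent with RubyGems treating such versions as prereleases, though the diff itself does not state the reason. The sketch below only uses the standard Gem::Version and Gem::Requirement classes to show how the two version strings and the new requirement behave.

```ruby
require 'rubygems'

dev    = Gem::Version.new('1.2.1.dev.20100122')
stable = Gem::Version.new('1.2.0')

# A version containing letters counts as a prerelease.
puts dev.prerelease?    # true
puts stable.prerelease? # false

# The dev build sorts after 1.2.0 but before the eventual 1.2.1 release.
puts dev > stable                    # true
puts dev < Gem::Version.new('1.2.1') # true

# The new requirement from the metadata hunk excludes RubyGems 1.3.1 itself.
req = Gem::Requirement.new('> 1.3.1')
puts req.satisfied_by?(Gem::Version.new('1.3.1')) # false
puts req.satisfied_by?(Gem::Version.new('1.3.5')) # true
```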