sanitize 4.0.1 → 4.1.0


checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: 91001a237cb7416e59ff83e47bac6087fde5fcb5
-   data.tar.gz: 8e93cb61a245815af2ff13a1e42815e794df3a0b
+   metadata.gz: db5b47018757eca02968cca236fae793558f28a3
+   data.tar.gz: 2556f1ebfb26b038190f334a39aff61315cd57b0
  SHA512:
-   metadata.gz: 466d73fcaf8c6e0481297451267db12a6e9729560faadc6a2be2f95dcb032a647fbe70068e4907c8b34a3314631fa0b8b90ecd61f0aea82b78e19111739b8bf4
-   data.tar.gz: faea3da32aaa91b4792d205eb59f74fcfd999d66db7a35e51c48e14017f3bdfc873461fc9208752243df23f945e50602cb6a5478c424ac914a6f6972a72ee7e3
+   metadata.gz: 2ddad668c07d8440a7e3bd4ccf4a066f6c1f0a7d9aeb8a0f43bd04569c8b5dd7959130d50c0725f902fa6f1bf8ed930b4d00ee4fa977f3364702d53387463b39
+   data.tar.gz: 677279ef0d92a8e1d96e1ac28cfa12a5e4934887952a75dc891a5f37014352b8369f5effc1e480a10ae10363724840dbb1a9b5ec7b60b317a501ec370811d5b4
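
The SHA512 entries above can be compared against a locally extracted `metadata.gz` or `data.tar.gz`. The following is a minimal sketch using Ruby's standard `Digest` library; the helper name `sha512_matches?` and the file path in the comment are hypothetical and not part of the gem.

```ruby
require "digest/sha2"

# Hypothetical helper: true when the file's SHA-512 hex digest equals the
# expected value (for example, an entry copied from checksums.yaml).
def sha512_matches?(path, expected_hex)
  Digest::SHA512.file(path).hexdigest == expected_hex.strip.downcase
end

# Example call (the path is an assumption about where the gem was unpacked):
#   sha512_matches?("sanitize-4.1.0/data.tar.gz", "677279ef0d92...")
```

A mismatch only shows that the local file differs from the published one; it does not say which of the two is correct.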
data/HISTORY.md CHANGED
@@ -1,16 +1,22 @@
- Sanitize History
- ================================================================================
+ # Sanitize History
 
- Version 4.0.1 (2015-12-09)
- --------------------------
+ ## 4.1.0 (2016-06-17)
+
+ * Added a new CSS config setting, `:import_url_validator`. This is a Proc or
+ other callable object that will be called with each `@import` URL, and should
+ return `true` to allow the URL or `false` to remove it. [@nikz - #153][153]
+
+ [153]:https://github.com/rgrove/sanitize/pull/153/
+
+
+ ## 4.0.1 (2015-12-09)
 
  * Unpinned the Nokogumbo dependency. [@rubys - #141][141]
 
  [141]:https://github.com/rgrove/sanitize/pull/141
 
 
- Version 4.0.0 (2015-04-20)
- --------------------------
+ ## 4.0.0 (2015-04-20)
 
  ### Potentially breaking changes
 
@@ -50,8 +56,7 @@ Version 4.0.0 (2015-04-20)
  [111]:https://github.com/rgrove/sanitize/issues/111
 
 
- Version 3.1.2 (2015-02-22)
- --------------------------
+ ## 3.1.2 (2015-02-22)
 
  * Fixed: Deleting a node in a custom transformer could trigger a memory leak
  in Nokogiri if that node's children were later reparented, which the built-in
@@ -61,8 +66,7 @@ Version 3.1.2 (2015-02-22)
  [129]:https://github.com/rgrove/sanitize/issues/129
 
 
- Version 3.1.1 (2015-02-04)
- --------------------------
+ ## 3.1.1 (2015-02-04)
 
  * Fixed: `#document` and `#fragment` failed on frozen strings, and could
  unintentionally modify unfrozen strings if they used an encoding other than
@@ -72,8 +76,7 @@ Version 3.1.1 (2015-02-04)
  [128]:https://github.com/rgrove/sanitize/pull/128
 
 
- Version 3.1.0 (2014-12-22)
- --------------------------
+ ## 3.1.0 (2014-12-22)
 
  * Added the following CSS properties to the relaxed config. [@ehudc - #120][120]
 
@@ -90,8 +93,7 @@ Version 3.1.0 (2014-12-22)
  [120]:https://github.com/rgrove/sanitize/pull/120
 
 
- Version 3.0.4 (2014-12-12)
- --------------------------
+ ## 3.0.4 (2014-12-12)
 
  * Fixed: Harmless whitespace preceding a URL protocol (such as " http://")
  caused the URL to be removed even when the protocol was whitelisted.
@@ -100,8 +102,7 @@ Version 3.0.4 (2014-12-12)
  [126]:https://github.com/rgrove/sanitize/pull/126
 
 
- Version 3.0.3 (2014-10-29)
- --------------------------
+ ## 3.0.3 (2014-10-29)
 
  * Fixed: Some CSS selectors weren't parsed correctly inside the body of a
  `@media` block, causing them to be removed even when whitelist rules should
@@ -110,16 +111,14 @@ Version 3.0.3 (2014-10-29)
  [121]:https://github.com/rgrove/sanitize/issues/121
 
 
- Version 3.0.2 (2014-09-02)
- --------------------------
+ ## 3.0.2 (2014-09-02)
 
  * Updated Nokogumbo to 1.1.12, because 1.1.11 silently reverted the change we
  were trying to pick up in the last release. Now issue [#114][114] is
  _actually_ fixed.
 
 
- Version 3.0.1 (2014-09-02)
- --------------------------
+ ## 3.0.1 (2014-09-02)
 
  * Updated Nokogumbo to 1.1.11 to pick up a fix for a Gumbo bug in which certain
  HTML character entities, such as `Ö`, were parsed incorrectly, leaving
@@ -128,8 +127,7 @@ Version 3.0.1 (2014-09-02)
  [114]:https://github.com/rgrove/sanitize/issues/114
 
 
- Version 3.0.0 (2014-06-21)
- --------------------------
+ ## 3.0.0 (2014-06-21)
 
  As of this version, Sanitize adheres strictly to the [SemVer 2.0.0][semver]
  versioning standard. This release contains API and output changes that are
@@ -228,8 +226,7 @@ Sanitize.fragment(html, Sanitize::Config.merge(Sanitize::Config::BASIC,
  [n1008]:https://github.com/sparklemotion/nokogiri/issues/1008
 
 
- Version 2.1.0 (2014-01-13)
- --------------------------
+ ## 2.1.0 (2014-01-13)
 
  * Added support for whitelisting arbitrary HTML5 `data-*` attributes. Use the
  symbol `:data` instead of an attribute name in the `:attributes` config to
@@ -244,16 +241,14 @@ Version 2.1.0 (2014-01-13)
  [87]:https://github.com/rgrove/sanitize/pull/87
 
 
- Version 2.0.6 (2013-07-10)
- --------------------------
+ ## 2.0.6 (2013-07-10)
 
  * Fixed: Version 2.0.5 inadvertently included some work-in-progress changes that
  shouldn't have made their way into the master branch. This is what happens
  when I release before coffee instead of after.
 
 
- Version 2.0.5 (2013-07-10)
- --------------------------
+ ## 2.0.5 (2013-07-10)
 
  * Loosened the Nokogiri dependency back to >= 1.4.4 to allow Sanitize to coexist
  in newer Rubies with other libraries that restrict Nokogiri to 1.5.x for 1.8.7
@@ -261,8 +256,7 @@ Version 2.0.5 (2013-07-10)
  life easier for people who need those other libs.
 
 
- Version 2.0.4 (2013-06-12)
- --------------------------
+ ## 2.0.4 (2013-06-12)
 
  * Added `Sanitize.clean_document`, which sanitizes a full HTML document rather
  than just a fragment. [Ben Anderson]
@@ -272,14 +266,12 @@ Version 2.0.4 (2013-06-12)
  * Dropped support for Ruby versions older than 1.9.2.
 
 
- Version 2.0.3 (2011-07-01)
- --------------------------
+ ## 2.0.3 (2011-07-01)
 
  * Loosened the Nokogiri dependency to allow Nokogiri 1.5.x.
 
 
- Version 2.0.2 (2011-05-21)
- --------------------------
+ ## 2.0.2 (2011-05-21)
 
  * Fixed a bug in which a protocol like "java\script:" would be translated to
  "java%5Cscript:" and allowed through the filter when relative URLs were
@@ -287,44 +279,50 @@ Version 2.0.2 (2011-05-21)
  undesired behavior.
 
 
- Version 2.0.1 (2011-03-16)
- --------------------------
+ ## 2.0.1 (2011-03-16)
 
  * Updated the protocol regex to anchor at the beginning of the string rather
  than the beginning of a line. [Eaden McKee]
 
 
- Version 2.0.0 (2011-01-15)
- --------------------------
+ ## 2.0.0 (2011-01-15)
 
  * The environment data passed into transformers and the return values expected
  from transformers have changed. Old transformers will need to be updated.
  See the README for details.
+
  * Transformers now receive nodes of all types, not just element nodes.
+
  * Sanitize's own core filtering logic is now implemented as a set of always-on
  transformers.
+
  * The default value for the `:output` config is now `:html`. Previously it was
  `:xhtml`.
+
  * Added a `:whitespace_elements` config, which specifies elements (such as
  `<br>` and `<p>`) that should be replaced with whitespace when removed in
  order to preserve readability. See the README for the default list of
  elements that will be replaced with whitespace when removed.
+
  * Added a `:transformers_breadth` config, which may be used to specify
  transformers that should traverse nodes in a breadth-first mode rather than
  the default depth-first mode.
+
  * Added the `abbr`, `dfn`, `kbd`, `mark`, `s`, `samp`, `time`, and `var`
  elements to the whitelists for the basic and relaxed configs.
+
  * Added the `bdo`, `del`, `figcaption`, `figure`, `hgroup`, `ins`, `rp`, `rt`,
  `ruby`, and `wbr` elements to the whitelist for the relaxed config.
+
  * The `dir`, `lang`, and `title` attributes are now whitelisted for all
  elements in the relaxed config.
+
  * Bumped minimum Nokogiri version to 1.4.4 to avoid a bug in 1.4.2+
  (issue #315) that caused `</body></html>` to be appended to the CDATA inside
  unterminated script and style elements.
 
 
- Version 1.2.1 (2010-04-20)
- --------------------------
+ ## 1.2.1 (2010-04-20)
 
  * Added a `:remove_contents` config setting. If set to `true`, Sanitize will
  remove the contents of all non-whitelisted elements in addition to the
@@ -332,41 +330,46 @@ Version 1.2.1 (2010-04-20)
  remove the contents of only those elements (when filtered), and leave the
  contents of other filtered elements. [Thanks to Rafael Souza for the array
  option]
+
  * Added an `:output_encoding` config setting to allow the character encoding
  for HTML output to be specified. The default is utf-8.
+
  * The environment hash passed into transformers now includes a `:node_name`
  item containing the lowercase name of the current HTML node (e.g. "div").
+
  * Returning anything other than a Hash or nil from a transformer will now
  raise a meaningful `Sanitize::Error` exception rather than an unintended
  `NameError`.
 
 
- Version 1.2.0 (2010-01-17)
- --------------------------
+ ## 1.2.0 (2010-01-17)
 
  * Requires Nokogiri ~> 1.4.1.
+
  * Added support for transformers, which allow you to filter and alter nodes
  using your own custom logic, on top of (or instead of) Sanitize's core
  filter. See the README for details and examples.
+
  * Added `Sanitize.clean_node!`, which sanitizes a `Nokogiri::XML::Node` and
  all its children.
+
  * Added elements `<h1>` through `<h6>` to the Relaxed whitelist. [Suggested by
  David Reese]
 
 
- Version 1.1.0 (2009-10-11)
- --------------------------
+ ## 1.1.0 (2009-10-11)
 
  * Migrated from Hpricot to Nokogiri. Requires libxml2 >= 2.7.2 [Adam Hooper]
+
  * Added an `:output` config setting to allow the output format to be
  specified. Supported formats are `:xhtml` (the default) and `:html` (which
  outputs HTML4).
+
  * Changed protocol regex to ensure Sanitize doesn't kill URLs with colons in
  path segments. [Peter Cooper]
 
 
- Version 1.0.8 (2009-04-23)
- --------------------------
+ ## 1.0.8 (2009-04-23)
 
  * Added a workaround for an Hpricot bug that prevents attribute names from
  being downcased in recent versions of Hpricot. This was exploitable to
@@ -374,48 +377,48 @@ Version 1.0.8 (2009-04-23)
  Wanicur]
 
 
- Version 1.0.7 (2009-04-11)
- --------------------------
+ ## 1.0.7 (2009-04-11)
 
  * Requires Hpricot 0.8.1+, which is finally compatible with Ruby 1.9.1.
+
  * Fixed a bug that caused named character entities containing digits (like
  `&sup2;`) to be escaped when they shouldn't have been. [Reported by
  Sebastian Steinmetz]
 
 
- Version 1.0.6 (2009-02-23)
- --------------------------
+ ## 1.0.6 (2009-02-23)
 
  * Removed htmlentities gem dependency.
+
  * Existing well-formed character entity references in the input string are now
  preserved rather than being decoded and re-encoded.
+
  * The `'` character is now encoded as `&#39;` instead of `&apos;` to prevent
  problems in IE6.
+
  * You can now specify the symbol `:all` in place of an element name in the
  attributes config hash to allow certain attributes on all elements. [Thanks
  to Mutwin Kraus]
 
 
- Version 1.0.5 (2009-02-05)
- --------------------------
+ ## 1.0.5 (2009-02-05)
 
  * Fixed a bug introduced in version 1.0.3 that prevented non-whitelisted
  protocols from being cleaned when relative URLs were allowed. [Reported by
  Dev Purkayastha]
+
  * Fixed "undefined method `parent='" exceptions caused by parser changes in
  edge Hpricot.
 
 
- Version 1.0.4 (2009-01-16)
- --------------------------
+ ## 1.0.4 (2009-01-16)
 
  * Fixed a bug that made it possible to sneak a non-whitelisted element through
  by repeating it several times in a row. All versions of Sanitize prior to
  1.0.4 are vulnerable. [Reported by Cristobal]
 
 
- Version 1.0.3 (2009-01-15)
- --------------------------
+ ## 1.0.3 (2009-01-15)
 
  * Fixed a bug whereby incomplete Unicode or hex entities could be used to
  prevent non-whitelisted protocols from being cleaned. Since IE6 and Opera
@@ -424,25 +427,23 @@ Version 1.0.3 (2009-01-15)
  Sanitize prior to 1.0.3.
 
 
- Version 1.0.2 (2009-01-04)
- --------------------------
+ ## 1.0.2 (2009-01-04)
 
  * Fixed a bug that caused an exception to be thrown when parsing a valueless
  attribute that's expected to contain a URL.
 
 
- Version 1.0.1 (2009-01-01)
- --------------------------
+ ## 1.0.1 (2009-01-01)
 
  * You can now specify `:relative` in a protocol config array to allow
  attributes containing relative URLs with no protocol. The Basic and Relaxed
  configs have been updated to allow relative URLs.
+
  * Added a workaround for an Hpricot bug that causes HTML entities for
  non-ASCII characters to be replaced by question marks, and all other
  entities to be destructively decoded.
 
 
- Version 1.0.0 (2008-12-25)
- --------------------------
+ ## 1.0.0 (2008-12-25)
 
  * First release.
data/README.md CHANGED
@@ -381,6 +381,18 @@ Names of CSS [at-rules][at-rules] to allow that may have associated blocks
  containing style rules. At-rules like `media` and `keyframes` fall into this
  category. Names should be specified in lowercase.
 
+ ##### :css => :import_url_validator
+
+ This is a `Proc` (or other callable object) that will be called and passed
+ the URL specified for any `@import` [at-rules][at-rules].
+
+ You can use this to limit what can be imported, for example something
+ like the following to limit `@import` to Google Fonts URLs:
+
+ ```ruby
+ Proc.new { |url| url.start_with?("https://fonts.googleapis.com") }
+ ```
+
  ##### :css => :properties (Array or Set)
 
  Whitelist of CSS property names to allow. Names should be specified in
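
The README example above is a prefix match, so it would also accept a look-alike host such as `https://fonts.googleapis.com.evil.example/`. Because the setting accepts any callable object, a stricter check is possible. The sketch below parses the URL with Ruby's standard `URI` library and compares scheme and host exactly; the class name `HostAllowlistValidator` is hypothetical and not part of Sanitize.

```ruby
require "uri"

# Hypothetical callable validator: allows only https URLs whose host is
# exactly one of the configured hosts.
class HostAllowlistValidator
  def initialize(*hosts)
    @hosts = hosts.map(&:downcase)
  end

  # Returns true to keep the @import URL, false to drop it.
  def call(url)
    uri = URI.parse(url.to_s.strip)
    uri.scheme == "https" && @hosts.include?(uri.host.to_s.downcase)
  rescue URI::InvalidURIError
    false
  end
end

validator = HostAllowlistValidator.new("fonts.googleapis.com")
puts validator.call("https://fonts.googleapis.com/css?family=Indie+Flower")
puts validator.call("https://fonts.googleapis.com.evil.example/x.css")
```

Going by the tests in this diff, such an object would presumably be passed as `:import_url_validator => validator` in the CSS config, next to `:at_rules => ['import']`.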
@@ -80,6 +80,7 @@ class Sanitize; class CSS
      @at_rules = Set.new(@config[:at_rules])
      @at_rules_with_properties = Set.new(@config[:at_rules_with_properties])
      @at_rules_with_styles = Set.new(@config[:at_rules_with_styles])
+     @import_url_validator = @config[:import_url_validator]
    end
 
    # Sanitizes inline CSS style properties.
@@ -219,6 +220,7 @@ class Sanitize; class CSS
        rule[:block] = tree!(props)
 
      elsif @at_rules.include?(name)
+       return nil if name == "import" && !import_url_allowed?(rule)
        return nil if rule.has_key?(:block)
      else
        return nil
@@ -227,6 +229,19 @@ class Sanitize; class CSS
      rule
    end
 
+   # Passes the URL value of an @import rule to a block to ensure
+   # it's an allowed URL
+   def import_url_allowed?(rule)
+     return true unless @import_url_validator
+
+     url_token = rule[:tokens].detect { |t| t[:node] == :url || t[:node] == :string }
+
+     # don't allow @imports with no URL value
+     return false unless url_token && (import_url = url_token[:value])
+
+     @import_url_validator.call(import_url)
+   end
+
    # Sanitizes a CSS property node. Returns the sanitized node, or `nil` if the
    # current config doesn't allow this property.
    def property!(prop)
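
The decision logic of `import_url_allowed?` can be tried outside the gem. The sketch below copies the method body into a standalone function that takes the validator as an argument (an assumption made so it runs without Sanitize). The token hashes are hand-written stand-ins for parser output and carry only the `:node` and `:value` keys the method reads.

```ruby
# Standalone copy of the check from the hunk above; `validator` replaces the
# @import_url_validator instance variable.
def import_url_allowed?(rule, validator)
  return true unless validator

  url_token = rule[:tokens].detect { |t| t[:node] == :url || t[:node] == :string }

  # No :url or :string token (or a nil value) means the rule is rejected.
  return false unless url_token && (import_url = url_token[:value])

  validator.call(import_url)
end

validator = proc { |url| url.start_with?("https://fonts.googleapis.com") }

# Hand-written stand-ins for parsed @import rules (hypothetical data).
allowed = { tokens: [{ node: :whitespace },
                     { node: :string, value: "https://fonts.googleapis.com/css?family=Indie+Flower" }] }
blocked = { tokens: [{ node: :url, value: "https://nastysite.com/nasty_hax0r.css" }] }
no_url  = { tokens: [{ node: :whitespace }] }

puts import_url_allowed?(allowed, validator) # true
puts import_url_allowed?(blocked, validator) # false
puts import_url_allowed?(no_url, validator)  # false
puts import_url_allowed?(blocked, nil)       # true (no validator configured)
```

An empty string is truthy in Ruby, so a token whose value is `""` still reaches the validator. The "blank url" test in this diff passes because the Google Fonts proc rejects `""`; a more permissive validator would need to reject empty values itself.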
@@ -1,5 +1,5 @@
  # encoding: utf-8
 
  class Sanitize
-   VERSION = '4.0.1'
+   VERSION = '4.1.0'
  end
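
The move from 4.0.1 to 4.1.0 is a minor-version bump. Whether an application picks it up depends on its version constraint. The sketch below uses RubyGems' own `Gem::Version` and `Gem::Requirement` classes; the two constraints are illustrative assumptions and are not taken from any particular Gemfile.

```ruby
require "rubygems"

old_version = Gem::Version.new("4.0.1")
new_version = Gem::Version.new("4.1.0")

puts new_version > old_version # true

# "~> 4.0" allows any 4.x release, so 4.1.0 satisfies it.
minor_ok = Gem::Requirement.new("~> 4.0").satisfied_by?(new_version)
puts minor_ok # true

# "~> 4.0.1" allows only 4.0.x patch releases, so 4.1.0 does not.
patch_only = Gem::Requirement.new("~> 4.0.1").satisfied_by?(new_version)
puts patch_only # false
```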
@@ -325,6 +325,73 @@ describe 'Sanitize::CSS' do
            ].strip
          end
        end
+
+       describe "when validating @import rules" do
+
+         describe "with no validation proc specified" do
+           before do
+             @scss = Sanitize::CSS.new(Sanitize::Config.merge(Sanitize::Config::RELAXED[:css], {
+               :at_rules => ['import']
+             }))
+           end
+
+           it "should allow any URL value" do
+             css = %[
+               @import url('https://somesite.com/something.css');
+             ].strip
+
+             @scss.stylesheet(css).strip.must_equal %[
+               @import url('https://somesite.com/something.css');
+             ].strip
+           end
+         end
+
+         describe "with a validation proc specified" do
+           before do
+             google_font_validator = Proc.new { |url| url.start_with?("https://fonts.googleapis.com") }
+
+             @scss = Sanitize::CSS.new(Sanitize::Config.merge(Sanitize::Config::RELAXED[:css], {
+               :at_rules => ['import'], :import_url_validator => google_font_validator
+             }))
+           end
+
+           it "should allow a google fonts url" do
+             css = %[
+               @import 'https://fonts.googleapis.com/css?family=Indie+Flower';
+               @import url('https://fonts.googleapis.com/css?family=Indie+Flower');
+             ].strip
+
+             @scss.stylesheet(css).strip.must_equal %[
+               @import 'https://fonts.googleapis.com/css?family=Indie+Flower';
+               @import url('https://fonts.googleapis.com/css?family=Indie+Flower');
+             ].strip
+           end
+
+           it "should not allow a nasty url" do
+             css = %[
+               @import 'https://fonts.googleapis.com/css?family=Indie+Flower';
+               @import 'https://nastysite.com/nasty_hax0r.css';
+               @import url('https://nastysite.com/nasty_hax0r.css');
+             ].strip
+
+             @scss.stylesheet(css).strip.must_equal %[
+               @import 'https://fonts.googleapis.com/css?family=Indie+Flower';
+             ].strip
+           end
+
+           it "should not allow a blank url" do
+             css = %[
+               @import 'https://fonts.googleapis.com/css?family=Indie+Flower';
+               @import '';
+               @import url('');
+             ].strip
+
+             @scss.stylesheet(css).strip.must_equal %[
+               @import 'https://fonts.googleapis.com/css?family=Indie+Flower';
+             ].strip
+           end
+         end
+       end
      end
    end
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: sanitize
  version: !ruby/object:Gem::Version
-   version: 4.0.1
+   version: 4.1.0
  platform: ruby
  authors:
  - Ryan Grove
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2015-12-09 00:00:00.000000000 Z
+ date: 2016-07-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: crass
@@ -165,7 +165,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
        version: 1.2.0
  requirements: []
  rubyforge_project:
- rubygems_version: 2.4.8
+ rubygems_version: 2.5.1
  signing_key:
  specification_version: 4
  summary: Whitelist-based HTML and CSS sanitizer.