sanitize 1.0.2 → 1.0.3

Files changed (4)
  1. data/HISTORY +7 -0
  2. data/README.rdoc +1 -1
  3. data/lib/sanitize.rb +11 -3
  4. metadata +2 -2
data/HISTORY CHANGED
@@ -1,6 +1,13 @@
  Sanitize History
  ================================================================================
 
+ Version 1.0.3 (2009-01-15)
+ * Fixed a bug whereby incomplete Unicode or hex entities could be used to
+ prevent non-whitelisted protocols from being cleaned. Since IE6 and Opera
+ still decode the incomplete entities, users of those browsers may be
+ vulnerable to malicious script injection on websites using versions of
+ Sanitize prior to 1.0.3.
+
  Version 1.0.2 (2009-01-04)
  * Fixed a bug that caused an exception to be thrown when parsing a valueless
  attribute that's expected to contain a URL.
data/README.rdoc CHANGED
@@ -15,7 +15,7 @@ or maliciously-formed HTML. When in doubt, Sanitize always errs on the side of
  caution.
 
  *Author*:: Ryan Grove (mailto:ryan@wonko.com)
- *Version*:: 1.0.2 (2009-01-04)
+ *Version*:: 1.0.3 (2009-01-15)
  *Copyright*:: Copyright (c) 2009 Ryan Grove. All rights reserved.
  *License*:: MIT License (http://opensource.org/licenses/mit-license.php)
  *Website*:: http://github.com/rgrove/sanitize
data/lib/sanitize.rb CHANGED
@@ -38,6 +38,14 @@ require 'sanitize/config/relaxed'
  require 'sanitize/monkeypatch/hpricot'
 
  class Sanitize
+
+ # Matches an attribute value that could be treated by a browser as a URL
+ # with a protocol prefix, such as "http:" or "javascript:". Any string of one
+ # or more characters followed by a colon is considered a match, even if the
+ # colon is encoded as an entity and even if it's an incomplete entity (which
+ # IE6 and Opera will still parse).
+ REGEX_PROTOCOL = /^([^:]+)(?:\:|&#0*58|&#x0*3a)(?:[^0-9a-f]|$)/i
+
  #--
  # Class Methods
  #++
@@ -50,7 +58,7 @@ class Sanitize
  end
 
  # Performs Sanitize#clean in place, returning _html_, or +nil+ if no changes
- # were necessary.
+ # were made.
  def self.clean!(html, config = {})
  sanitize = Sanitize.new(config)
  sanitize.clean!(html)
@@ -72,7 +80,7 @@ class Sanitize
  end
 
  # Performs clean in place, returning _html_, or +nil+ if no changes were
- # necessary.
+ # made.
  def clean!(html)
  fragment = Hpricot(html)
 
@@ -107,7 +115,7 @@ class Sanitize
  next false unless protocol.has_key?(key)
  next true if value.nil?
 
- if value.to_s.downcase =~ /^([^:]+)(?:\:|&#0*58;|&#x0*3a;)/
+ if value.to_s.downcase =~ REGEX_PROTOCOL
  !protocol[key].include?($1.downcase)
  else
  !protocol[key].include?(:relative)
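
The effect of swapping the inline pattern for REGEX_PROTOCOL can be checked outside the library. The following is a minimal sketch, not code from Sanitize: the two patterns are copied from the removed and added lines above, while the constant names OLD_PROTOCOL and NEW_PROTOCOL, the helper detected_protocol, and the sample values are hypothetical and exist only for illustration.

```ruby
# Pattern from the removed line (1.0.2): entity-encoded colons must end in ";".
OLD_PROTOCOL = /^([^:]+)(?:\:|&#0*58;|&#x0*3a;)/

# Pattern from the added constant (1.0.3): the ";" is no longer required.
NEW_PROTOCOL = /^([^:]+)(?:\:|&#0*58|&#x0*3a)(?:[^0-9a-f]|$)/i

# Hypothetical helper, not part of Sanitize. Returns the protocol prefix the
# pattern detects, or nil when the value would fall through to the
# relative-URL branch of the check shown in the diff.
def detected_protocol(value, pattern)
  match = pattern.match(value.to_s.downcase)
  match && match[1]
end

# Illustrative attribute values (assumed, not taken from the Sanitize tests).
SAMPLES = [
  'http://example.com/',             # plain colon
  'javascript&#58;window.alert(1)',  # complete decimal entity
  'javascript&#58window.alert(1)',   # incomplete decimal entity
  'javascript&#x3Awindow.alert(1)',  # incomplete hex entity
  '/relative/path'                   # no protocol prefix at all
]

SAMPLES.each do |value|
  old = detected_protocol(value, OLD_PROTOCOL)
  new = detected_protocol(value, NEW_PROTOCOL)
  puts "#{value.inspect}: old=#{old.inspect} new=#{new.inspect}"
end
```

For the two incomplete-entity samples the old pattern finds no prefix, so the value is handled as a relative URL, whereas the new pattern reports "javascript". One thing worth checking when reviewing the new pattern: its trailing group `(?:[^0-9a-f]|$)` also follows the literal-colon branch, so a value whose colon is immediately followed by a digit or a letter a-f (for example `javascript:alert(1)`) does not match it.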
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: sanitize
  version: !ruby/object:Gem::Version
- version: 1.0.2
+ version: 1.0.3
  platform: ruby
  authors:
  - Ryan Grove
@@ -9,7 +9,7 @@ autorequire:
  bindir: bin
  cert_chain: []
 
- date: 2009-01-04 00:00:00 -08:00
+ date: 2009-01-15 00:00:00 -08:00
  default_executable:
  dependencies:
  - !ruby/object:Gem::Dependency