owasp-esapi-ruby 0.30.0

Files changed (56)
  1. data/.document +5 -0
  2. data/AUTHORS +5 -0
  3. data/ChangeLog +69 -0
  4. data/ISSUES +0 -0
  5. data/LICENSE +24 -0
  6. data/README +51 -0
  7. data/Rakefile +63 -0
  8. data/VERSION +1 -0
  9. data/lib/codec/base_codec.rb +99 -0
  10. data/lib/codec/css_codec.rb +101 -0
  11. data/lib/codec/encoder.rb +330 -0
  12. data/lib/codec/html_codec.rb +424 -0
  13. data/lib/codec/javascript_codec.rb +119 -0
  14. data/lib/codec/mysql_codec.rb +131 -0
  15. data/lib/codec/oracle_codec.rb +46 -0
  16. data/lib/codec/os_codec.rb +78 -0
  17. data/lib/codec/percent_codec.rb +53 -0
  18. data/lib/codec/pushable_string.rb +114 -0
  19. data/lib/codec/vbscript_codec.rb +64 -0
  20. data/lib/codec/xml_codec.rb +173 -0
  21. data/lib/esapi.rb +68 -0
  22. data/lib/exceptions.rb +37 -0
  23. data/lib/executor.rb +20 -0
  24. data/lib/owasp-esapi-ruby.rb +13 -0
  25. data/lib/sanitizer/xss.rb +59 -0
  26. data/lib/validator/base_rule.rb +90 -0
  27. data/lib/validator/date_rule.rb +92 -0
  28. data/lib/validator/email.rb +29 -0
  29. data/lib/validator/float_rule.rb +76 -0
  30. data/lib/validator/generic_validator.rb +26 -0
  31. data/lib/validator/integer_rule.rb +61 -0
  32. data/lib/validator/string_rule.rb +146 -0
  33. data/lib/validator/validator_error_list.rb +48 -0
  34. data/lib/validator/zipcode.rb +27 -0
  35. data/spec/codec/css_codec_spec.rb +61 -0
  36. data/spec/codec/html_codec_spec.rb +87 -0
  37. data/spec/codec/javascript_codec_spec.rb +45 -0
  38. data/spec/codec/mysql_codec_spec.rb +44 -0
  39. data/spec/codec/oracle_codec_spec.rb +23 -0
  40. data/spec/codec/os_codec_spec.rb +51 -0
  41. data/spec/codec/percent_codec_spec.rb +34 -0
  42. data/spec/codec/vbcript_codec_spec.rb +23 -0
  43. data/spec/codec/xml_codec_spec.rb +83 -0
  44. data/spec/owasp_esapi_encoder_spec.rb +226 -0
  45. data/spec/owasp_esapi_executor_spec.rb +9 -0
  46. data/spec/owasp_esapi_ruby_email_validator_spec.rb +39 -0
  47. data/spec/owasp_esapi_ruby_xss_sanitizer_spec.rb +66 -0
  48. data/spec/owasp_esapi_ruby_zipcode_validator_spec.rb +42 -0
  49. data/spec/spec_helper.rb +10 -0
  50. data/spec/validator/base_rule_spec.rb +29 -0
  51. data/spec/validator/date_rule_spec.rb +40 -0
  52. data/spec/validator/float_rule_spec.rb +31 -0
  53. data/spec/validator/integer_rule_spec.rb +51 -0
  54. data/spec/validator/string_rule_spec.rb +103 -0
  55. data/spec/validator_skeleton.rb +150 -0
  56. metadata +235 -0
data/.document ADDED
@@ -0,0 +1,5 @@
1
+ README.rdoc
2
+ lib/**/*.rb
3
+ bin/*
4
+ features/**/*.feature
5
+ LICENSE
data/AUTHORS ADDED
@@ -0,0 +1,5 @@
1
+ Owasp Esapi Ruby core
2
+ ---------------------
3
+
4
+ * Paolo Perego <thesp0nge@owasp.org>
5
+ * Sal Scotto <sal.scotto@gmail.com>
data/ChangeLog ADDED
@@ -0,0 +1,69 @@
1
+ 2011-03-02 12:47:57 -0500 Sal Scotto Renamed validators to rule, the container class Validator will be the delegate ot those classes. Also fixed rake file
2
+ 2011-03-02 12:37:56 -0500 Sal Scotto Added nokogiri dependency. Nokogiri will be used for HTML/CSS scanning
3
+ 2011-02-28 20:35:29 -0500 Sal Scotto Added an int and float validators.
4
+ 2011-02-28 17:20:51 -0500 Sal Scotto Remove old date validator code, that is now superceeded by new DateValidator object
5
+ 2011-02-28 17:19:49 -0500 Sal Scotto Added date validator. you pass it a dateformat string and it will return a valid Time object.
6
+ 2011-02-28 16:08:45 -0500 Sal Scotto Remove old validator spec file
7
+ 2011-02-28 11:24:34 +0100 Paolo Perego Merge remote branch 'washu/master'
8
+ 2011-02-13 09:54:46 -0500 Paolo Perego Added a baseline validator spec
9
+ 2011-02-27 12:18:20 -0500 Sal Scotto Added base validator rule and string validator rule
10
+ 2011-02-26 13:51:27 -0500 Sal Scotto Fixed up a funny looking doc entry
11
+ 2011-02-26 13:42:25 -0500 Sal Scotto Added in last of the codecs. Ive also gone back and updated the rdoc for all the codecs and the encoder. Formatting and whitespace clean was also performed as well asn upper level formatting and rodc inclusions. I have cpied a good bit of the java esapi docs for class headers, methods since I implmented them to give the same results as it would be in the java world
12
+ 2011-02-26 09:44:38 -0500 Sal Scotto Added mysql and oracle codecs
13
+ 2011-02-26 09:30:34 -0500 Sal Scotto moved percent codec
14
+ 2011-02-26 09:28:58 -0500 Sal Scotto moved some codecs around
15
+ 2011-02-26 09:27:36 -0500 Sal Scotto update percent codec
16
+ 2011-02-24 23:55:24 -0500 Sal Scotto Stubbing in the executor class
17
+ 2011-02-24 23:54:38 -0500 Sal Scotto Added a vbscript codec
18
+ 2011-02-24 17:52:52 -0500 Sal Scotto Stubbed in vbscript_codec
19
+ 2011-02-24 17:50:54 -0500 Sal Scotto Fixed up more codec to more ruby stylish
20
+ 2011-02-23 22:56:05 -0500 Sal Scotto added in more test examples
21
+ 2011-02-23 22:08:28 -0500 Sal Scotto more encoder tests
22
+ 2011-02-23 20:00:27 -0500 Sal Scotto Changed the overally convuluted tests into dynamic tests do each sequence makes a dyanimc test now
23
+ 2011-02-21 10:38:39 -0500 Sal Scotto added os and javascript codecs. Added in spec file for thos codecs and updated encoder spec. TODO: add in some convience methods for encode_for_os and encode_for_js. Refactored some things inside pushable string to be more ruby like in method names. Will keep going over code and refactoing as time permits. Still need a vbscript, oracle, and mysql codecs
24
+ 2011-02-20 11:19:04 -0500 Sal Scotto Updated codecs for whitespace
25
+ 2011-02-20 11:18:23 -0500 Sal Scotto Renamed url_codec to percent_codec
26
+ 2011-02-20 10:54:04 -0500 Sal Scotto Added URL codec and test cases
27
+ 2011-02-19 23:22:36 -0500 Sal Scotto Added a HTML entity codec. Added a spec file to test the encoder Added a spec fiel for the codec Cleaned up encoder code and added mroe docs
28
+ 2011-02-19 16:17:40 -0500 Sal Scotto Finished cleaning up encoding stuff, strings should be pushed to UTF_8 as they are scanned for processing
29
+ 2011-02-19 11:00:58 -0500 Sal Scotto Fixed css codec to properly add a space after encoding a value to terminate properly
30
+ 2011-02-19 10:44:42 -0500 Sal Scotto Added some more documentation to teh code
31
+ 2011-02-19 09:53:31 -0500 Sal Scotto Added the Encoder Added a top level ESPI module definition that will be used to get references to the currecntly configured esapi setup Added an encoder spec, currently it has enough setup to test css as the only codec available Added an exceptions module, will house the various exception classes that can be raised
32
+ 2011-02-19 08:22:55 -0500 Sal Scotto Merge branch 'master' of https://github.com/thesp0nge/owasp-esapi-ruby
33
+ 2011-02-18 09:51:58 +0100 Paolo Perego Working on validating EU date formatted
34
+ 2011-02-18 00:16:22 -0500 Sal Scotto Added a CSS codec. Flow should go from Validator --> execute all relevant codecs to decode/encode the inputs BEFORE Applying all other rules. More codecs to come i.e. Base64, HTMLEntity, Hex, JavaScript, XMLEntity, Os specific i.e. Windows,Unix and Database level codecs to force escapes
35
+ 2011-02-17 19:27:50 -0500 Sal Scotto Merge branch 'master' of https://github.com/thesp0nge/owasp-esapi-ruby
36
+ 2011-02-17 18:02:55 +0100 Paolo Perego Now also dates written in US long format are recognized
37
+ 2011-02-17 09:14:39 +0100 Paolo Perego Now date validates MMM DD, YYY Added an ISSUE file to track remotely issues
38
+ 2011-02-17 08:05:01 +0100 Paolo Perego Added a ChangeLog and written some more stuff into README Zipcode had a wrong optional argument check that caused a null pointer exception. Date now validates good 'MM/DD/YYYY'
39
+ 2011-02-16 21:32:07 -0500 Sal Scotto Merge branch 'master' of https://github.com/thesp0nge/owasp-esapi-ruby
40
+ 2011-02-16 19:18:12 +0100 Paolo Perego Work over validators
41
+ 2011-02-16 09:47:00 +0100 Paolo Perego Fixed boolean operators
42
+ 2011-02-16 09:21:23 +0100 Paolo Perego Changed validator method from validate to valid? Added basic date validator
43
+ 2011-02-15 14:18:29 +0100 Paolo Perego Fixed typo
44
+ 2011-02-15 13:06:13 +0100 Paolo Perego Owasp Esapi Ruby will require at least 1.9.2 ruby version due to the usage of regex patterns only available with the new regex engine
45
+ 2011-02-15 12:59:06 +0100 Paolo Perego Now generic_validator handles validation method and both email than zipcode validators are run against it
46
+ 2011-02-15 11:56:08 +0100 Paolo Perego Removed a redundant method since matcher is an attr_accessor
47
+ 2011-02-15 01:53:09 -0800 Paolo Perego Added Daniele and Sal email addresses
48
+ 2011-02-15 09:08:07 +0100 Paolo Perego Added a generic validator class with a validate method. All specific validator will inehrit code from this class.
49
+ 2011-02-15 08:28:32 +0100 Paolo Perego Added a generic validator class with a validate method. All specific validator will inehrit code from this class.
50
+ 2011-02-15 08:25:39 +0100 Paolo Perego Modified boolean validation test
51
+ 2011-02-15 08:23:18 +0100 Paolo Perego Version bumped to 0.5.0. It means approx 5% of the work done.
52
+ 2011-02-15 08:21:42 +0100 Paolo Perego Renamed Sal Scotto rspec file with a filename that does not include it into running tasks (I want to see true failing tests). Let's use this good rspec as skeleton. Added an email address pattern rspec file. Implemented email address pattern validation.
53
+ 2011-02-14 18:41:49 -0500 Sal Scotto Merge branch 'master' of https://github.com/thesp0nge/owasp-esapi-ruby
54
+ 2011-02-14 18:26:36 +0100 Paolo Perego Fixed an initialization issue in XSS Added some Zip code spec Renamed Sal's validator skeleton not to be included in rake spec task
55
+ 2011-02-14 18:23:26 +0100 Paolo Perego Fixed (C) statement. Added a private filtering routine called by the public API
56
+ 2011-02-14 17:05:59 +0100 Paolo Perego (C) must be given to Owasp foundation
57
+ 2011-02-13 09:54:46 -0500 Paolo Perego Added a baseline validator spec
58
+ 2011-02-14 16:47:14 +0100 Paolo Perego Modified namespace. Now it's Owasp::Esapi
59
+ 2011-02-14 07:30:46 -0500 Sal Scotto Merge branch 'master' of https://github.com/thesp0nge/owasp-esapi-ruby
60
+ 2011-02-14 09:20:02 +0100 Paolo Perego Zipcode validator now works with Italian regular expression, must fix the US one
61
+ 2011-02-14 09:19:01 +0100 Paolo Perego Added AUTHORS file. Zipcode validator now works with Italian regular expression. Not the US one right now
62
+ 2011-02-13 16:56:26 +0100 Paolo Perego Renamed XSS sanitizer in a proper namespace. Added more test cases and created a basic (and not working right now) zip code validator.
63
+ 2011-02-13 09:54:46 -0500 Sal Scotto Added a baseline validator spec
64
+ 2011-02-12 17:35:01 +0100 Paolo Perego First real commit with 2 xss rspec and first xss sanitizing implementation. This is *just the beginning*
65
+ 2011-01-18 12:47:01 +0100 Paolo Perego Added _site and pixelmator file
66
+ 2011-01-14 14:58:47 +0100 Paolo Perego Added kickstarting info for Owasp Summit
67
+ 2010-06-01 13:25:38 +0200 Paolo Perego Some Typos
68
+ 2010-05-31 12:29:52 +0200 Paolo Perego Licensed as "new BSD" project with a starting README information
69
+ 2010-05-31 12:21:17 +0200 Paolo Perego Initial commit to owasp-esapi-ruby.
data/ISSUES ADDED
File without changes
data/LICENSE ADDED
@@ -0,0 +1,24 @@
1
+ Copyright (c) 2010-2011, The OWASP Foundation
2
+ All rights reserved.
3
+
4
+ Redistribution and use in source and binary forms, with or without
5
+ modification, are permitted provided that the following conditions are met:
6
+ * Redistributions of source code must retain the above copyright
7
+ notice, this list of conditions and the following disclaimer.
8
+ * Redistributions in binary form must reproduce the above copyright
9
+ notice, this list of conditions and the following disclaimer in the
10
+ documentation and/or other materials provided with the distribution.
11
+ * Neither the name of the <organization> nor the
12
+ names of its contributors may be used to endorse or promote products
13
+ derived from this software without specific prior written permission.
14
+
15
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
16
+ ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
17
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
18
+ DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
19
+ DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
20
+ (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
21
+ LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
22
+ ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
23
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
24
+ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README ADDED
@@ -0,0 +1,51 @@
1
+ = The Owasp ESAPI Ruby project
2
+
3
+ == Introduction
4
+
5
+ Owasp ESAPI Ruby is a port of the outstanding, release-quality Owasp ESAPI
6
+ project to the Ruby programming language.
7
+
8
+ Ruby is now a well-known programming language thanks to the Rails framework, developed by David Heinemeier Hansson (http://twitter.com/dhh), which simplifies building web applications through a convention-over-configuration approach.
9
+
10
+ Despite the popularity of Rails, there are many other web frameworks that let people write web apps in Ruby (merb, sinatra, vintage) [http://accidentaltechnologist.com/ruby/10-alternative-ruby-web-frameworks/]. Owasp Esapi Ruby aims to give all Ruby developers a gem full of secure APIs they can use whatever framework they choose.
11
+
12
+ == Why support only Ruby 1.9.2 and beyond?
13
+
14
+ The OWASP Esapi Ruby gem requires at least version 1.9.2 of the Ruby interpreter in order to take full advantage of the newer language APIs.
15
+
16
+ In particular version 1.9.2 introduces radical changes in the following areas:
17
+
18
+ === Regular expression engine
19
+ (to be written)
20
+
21
+ === UTF-8 support
22
+ Unicode support in 1.9.2 is much improved, with better handling of character set encoding and decoding:
23
+ * All strings have an additional chunk of info attached: Encoding
24
+ * String#size takes encoding into account – returns the encoded character count
25
+ * You can get the raw data size in bytes (String#bytesize)
26
+ * Indexed access is by encoded data – characters, not bytes
27
+ * You can change encoding by force but it doesn’t convert the data
28
+
29
+ === Dates and Time
30
+ From "Programming Ruby 1.9"
31
+
32
+ "As of Ruby 1.9.2, the range of dates that can be represented is no longer limited by the underlying operating system’s time representation (so there’s no year 2038 problem). As a result, the year passed to the methods gm, local, new, mktime, and utc must now include the century—a year of 90 now represents 90 and not 1990."
33
+
34
+ == Roadmap
35
+
36
+ Please see ChangeLog file.
37
+
38
+ == Note on Patches/Pull Requests
39
+
40
+ * Fork the project.
41
+ * Create documentation with rake yard task
42
+ * Make your feature addition or bug fix.
43
+ * Add tests for it. This is important so I don't break it in a
44
+ future version unintentionally.
45
+ * Commit, do not mess with rakefile, version, or history.
46
+ (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
47
+ * Send me a pull request. Bonus points for topic branches.
48
+
49
+ == Copyright
50
+
51
+ Copyright (c) 2011 the OWASP Foundation. See LICENSE for details.
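Reviewer note on the README's "UTF-8 support" and "Dates and Time" sections: the behaviours listed there can be checked with a few lines of plain Ruby. This is a minimal sketch using only core String and Time methods; the sample string and variable names are illustrative and do not come from the gem.

```ruby
# "caf\u00e9" holds 4 characters; the accented letter needs 2 bytes in UTF-8.
text = "caf\u00e9"

char_count = text.size       # counts characters, honouring the encoding
byte_count = text.bytesize   # counts raw bytes
last_char  = text[3]         # indexed access is by character, not by byte

# force_encoding only relabels the bytes; the data itself is untouched.
relabeled = text.dup.force_encoding(Encoding::ISO_8859_1)

# encode actually converts the data to the target encoding.
converted = text.encode(Encoding::ISO_8859_1)

# Since 1.9.2 a two-digit year is taken literally (year 90, not 1990).
literal_year = Time.utc(90).year
```

Here `relabeled` reports five characters because each of the five original bytes is now read as one ISO-8859-1 character, while `converted` holds four bytes because the accented letter was re-encoded as a single byte.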
data/Rakefile ADDED
@@ -0,0 +1,63 @@
1
+ require 'rubygems'
2
+ require 'rake'
3
+
4
+ begin
5
+ require 'jeweler'
6
+ Jeweler::Tasks.new do |gem|
7
+ gem.name = "owasp-esapi-ruby"
8
+ gem.summary = %Q{Owasp Enterprise Security APIs for Ruby language}
9
+ gem.description = File.read(File.join(File.dirname(__FILE__), 'README'))
10
+ gem.email = "thesp0nge@owasp.org"
11
+ gem.version = File.read(File.join(File.dirname(__FILE__), 'VERSION'))
12
+ gem.homepage = "http://github.com/thesp0nge/owasp-esapi-ruby"
13
+ gem.authors = File.read(File.join(File.dirname(__FILE__), 'AUTHORS'))
14
+ gem.required_ruby_version = '>= 1.9.2'
15
+ gem.add_development_dependency "rspec", ">= 1.2.9"
16
+ gem.add_development_dependency "yard", ">= 0"
17
+ gem.add_development_dependency "nokogiri",">= 1.4.4"
18
+ gem.add_dependency "nokogiri",">= 1.4.4"
19
+
20
+ # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
21
+ end
22
+ Jeweler::GemcutterTasks.new
23
+ rescue LoadError
24
+ puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
25
+ end
26
+
27
+ require 'rspec/core/rake_task'
28
+ RSpec::Core::RakeTask.new(:spec) do |t|
29
+ t.pattern = "./spec/**/*_spec.rb"
30
+ # Put spec opts in a file named .rspec in root
31
+ end
32
+
33
+ # require 'spec/rake/spectask'
34
+ # Spec::Rake::SpecTask.new(:spec) do |spec|
35
+ # spec.libs << 'lib' << 'spec'
36
+ # spec.spec_files = FileList['spec/**/*_spec.rb']
37
+ # end
38
+
39
+ # Spec::Rake::SpecTask.new(:rcov) do |spec|
40
+ # spec.libs << 'lib' << 'spec'
41
+ # spec.pattern = 'spec/**/*_spec.rb'
42
+ # spec.rcov = true
43
+ # end
44
+
45
+ task :spec => :check_dependencies
46
+
47
+ task :default => :spec
48
+
49
+ begin
50
+ require 'yard'
51
+ YARD::Rake::YardocTask.new
52
+ rescue LoadError
53
+ task :yardoc do
54
+ abort "YARD is not available. In order to run yardoc, you must: sudo gem install yard"
55
+ end
56
+ end
57
+
58
+ namespace :prepare do
59
+ desc 'Generate ChangeLog'
60
+ task :changelog do
61
+ system('git log --format="%ai %cn %s" > ChangeLog')
62
+ end
63
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.30.0
data/lib/codec/base_codec.rb ADDED
@@ -0,0 +1,99 @@
1
+ # The Codec interface defines a set of methods for encoding and decoding application level encoding schemes,
2
+ # such as HTML entity encoding and percent encoding (aka URL encoding). Codecs are used in output encoding
3
+ # and canonicalization. The design of these codecs allows for character-by-character decoding, which is
4
+ # necessary to detect double-encoding and the use of multiple encoding schemes, both of which are techniques
5
+ # used by attackers to bypass validation and bury encoded attacks in data.
6
+
7
+ class Fixnum
8
+ def to_h
9
+ to_s(16)
10
+ end
11
+ end
12
+ class Bignum
13
+ def to_h
14
+ to_s(16)
15
+ end
16
+ end
17
+
18
+ module Owasp
19
+ module Esapi
20
+ # The Codec module, houses Codec implementations
21
+ module Codec
22
+ class BaseCodec
23
+ # start range of valid code points
24
+ START_CODE_POINT = 0x000
25
+ # ending range of valid code points
26
+ END_CODE_POINT = 0x10ffff
27
+
28
+ @@hex_codes = [] #:nodoc:
29
+ for c in (0..255) do
30
+ if (c >= 0x30 and c <= 0x39) or (c >= 0x41 and c <= 0x5A) or (c >= 0x61 and c <= 0x7A)
31
+ @@hex_codes[c] = nil
32
+ else
33
+ @@hex_codes[c] = c.to_h
34
+ end
35
+ end
36
+
37
+ # Encode a String so that it can be safely used in a specific context.
38
+ # immune is an array or string containing the characters to be left unencoded
39
+ def encode(immune, input)
40
+ return nil if input.nil?
41
+ encoded_string = ''
42
+ encoded_string.encode!(Encoding::UTF_8)
43
+ input.encode(Encoding::UTF_8).chars do |c|
44
+ encoded_string << encode_char(immune,c)
45
+ end
46
+ encoded_string
47
+ end
48
+
49
+ # Default implementation that should be overridden in specific codecs.
50
+ def encode_char(immune, input)
51
+ input
52
+ end
53
+
54
+ # Helper method for codecs to get the hex value of a character
55
+ def hex(c)
56
+ return nil if c.nil?
57
+ b = c[0].ord
58
+ if b < 0xff
59
+ @@hex_codes[b]
60
+ else
61
+ b.to_h
62
+ end
63
+ end
64
+
65
+ # Decode a String that was encoded using the encode method in this Class
66
+ def decode(input)
67
+ decoded_string = ''
68
+ seekable = PushableString.new(input.dup)
69
+ while seekable.next?
70
+ t = decode_char(seekable)
71
+ if t.nil?
72
+ decoded_string << seekable.next
73
+ else
74
+ decoded_string << t
75
+ end
76
+ end
77
+ decoded_string
78
+ end
79
+
80
+ # Returns the decoded version of the next character from the input string and advances the
81
+ # current character in the PushableString. If the current character is not encoded, this
82
+ # method MUST reset the PushableString.
83
+ def decode_char(input)
84
+ input
85
+ end
86
+
87
+ # Basic min method
88
+ def min(a,b) #:nodoc:
89
+ if a > b
90
+ return b
91
+ else
92
+ return a
93
+ end
94
+ end
95
+
96
+ end
97
+ end
98
+ end
99
+ end
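Reviewer note on base_codec.rb: the class above is a template. `encode` walks the input character by character and delegates each decision to `encode_char`, which concrete codecs override, while `hex` returns nil for ASCII alphanumerics so they are never escaped. The sketch below reproduces that pattern in a self-contained form. `SketchCodec`, `BackslashXSketchCodec`, and the `\xHH` output format are hypothetical names and choices made for illustration, not part of the gem.

```ruby
# Hypothetical stand-in for BaseCodec (illustrative, not the gem's API).
class SketchCodec
  # Lookup table for code points 0..255: nil for ASCII alphanumerics,
  # the hex string for everything else.
  HEX_CODES = (0..255).map do |c|
    alnum = (0x30..0x39).cover?(c) || (0x41..0x5A).cover?(c) || (0x61..0x7A).cover?(c)
    alnum ? nil : c.to_s(16)
  end

  # Walk the input one character at a time and let encode_char decide.
  def encode(immune, input)
    return nil if input.nil?
    input.encode(Encoding::UTF_8).each_char.map { |c| encode_char(immune, c) }.join
  end

  # Default behaviour: pass the character through unchanged.
  def encode_char(immune, char)
    char
  end

  # nil means "safe alphanumeric, no escaping needed".
  def hex(char)
    code = char.ord
    code <= 0xff ? HEX_CODES[code] : code.to_s(16)
  end
end

# Hypothetical concrete codec: escapes every character that is neither
# immune nor alphanumeric using an invented \xHH notation.
class BackslashXSketchCodec < SketchCodec
  def encode_char(immune, char)
    return char if immune.include?(char)
    code = hex(char)
    return char if code.nil?
    "\\x#{code.upcase.rjust(2, '0')}"
  end
end

codec = BackslashXSketchCodec.new
encoded = codec.encode(['.'], "a b.c<")
```

With the immune list `['.']`, the space and the `<` are escaped while letters and the dot pass through. This is the whitelist model the codecs follow: everything is encoded unless it is known to be safe.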
data/lib/codec/css_codec.rb ADDED
@@ -0,0 +1,101 @@
1
+ #
2
+ # Implementation of the Codec interface for backslash encoding used in CSS.
3
+
4
+ module Owasp
5
+ module Esapi
6
+ module Codec
7
+ class CssCodec < BaseCodec
8
+
9
+ # Returns backslash encoded character.
10
+ def encode_char(immune, input)
11
+ # check immune
12
+ return input if immune.include?(input)
13
+ # check for alpha numeric
14
+ hex = hex(input)
15
+ # add a space at end to terminate under css
16
+ return "\\#{hex} " unless hex.nil? or hex.empty?
17
+ input
18
+ end
19
+
20
+ # decode a character from the PushableString
21
+ # We follow the rules defined for CSS by w3
22
+ # http://www.w3.org/TR/CSS21/syndata.html#characters
23
+ # All CSS syntax is case-insensitive within the ASCII range (i.e., [a-z] and [A-Z] are equivalent), except for parts that are not under the control of CSS. For example,
24
+ # the case-sensitivity of values of the HTML attributes "id" and "class", of font names, and of URIs lies outside the scope of this specification.
25
+ # Note in particular that element names are case-insensitive in HTML, but case-sensitive in XML. In CSS, identifiers (including element names, classes, and IDs in selectors)
26
+ # can contain only the characters [a-zA-Z0-9] and ISO 10646 characters U+00A0 and higher, plus the hyphen (-) and the underscore (_); they cannot start with a digit,
27
+ # two hyphens, or a hyphen followed by a digit. Identifiers can also contain escaped characters and any ISO 10646 character as a numeric code (see next item). For instance,
28
+ # the identifier "B&W?" may be written as "B\&W\?" or "B\26 W\3F". Note that Unicode is code-by-code equivalent to ISO 10646 (see [UNICODE] and [ISO10646]).
29
+ # In CSS 2.1, a backslash (\) character can indicate one of three types of character escape. Inside a CSS comment, a backslash stands for itself, and if a backslash is
30
+ # immediately followed by the end of the style sheet, it also stands for itself (i.e., a DELIM token).
31
+ #
32
+ # First, inside a string, a backslash followed by a newline is ignored (i.e., the string is deemed not to contain either the backslash or the newline).
33
+ # Outside a string, a backslash followed by a newline stands for itself (i.e., a DELIM followed by a newline).
34
+ #
35
+ # Second, it cancels the meaning of special CSS characters. Any character (except a hexadecimal digit, linefeed, carriage return, or form feed) can be escaped
36
+ # with a backslash to remove its special meaning. For example, "\"" is a string consisting of one double quote. Style sheet preprocessors must not remove these backslashes
37
+ # from a style sheet since that would change the style sheet's meaning.
38
+ #
39
+ # Third, backslash escapes allow authors to refer to characters they cannot easily put in a document. In this case, the backslash is followed by at most six
40
+ # hexadecimal digits (0..9A..F), which stand for the ISO 10646 ([ISO10646]) character with that number, which must not be zero. (It is undefined in CSS 2.1 what happens
41
+ # if a style sheet does contain a character with Unicode codepoint zero.) If a character in the range [0-9a-fA-F] follows the hexadecimal number, the end of the number
42
+ # needs to be made clear. There are two ways to do that:
43
+ # 1. with a space (or other white space character): "\26 B" ("&B"). In this case, user agents should treat a "CR/LF" pair (U+000D/U+000A) as a single white space character.
44
+ # 2. by providing exactly 6 hexadecimal digits: "\000026B" ("&B")
45
+ # In fact, these two methods may be combined. Only one white space character is ignored after a hexadecimal escape. Note that this means that a "real" space after the
46
+ # escape sequence must be doubled. If the number is outside the range allowed by Unicode (e.g., "\110000" is above the maximum 10FFFF allowed in current Unicode), the UA
47
+ # may replace the escape with the "replacement character" (U+FFFD). If the character is to be displayed, the UA should show a visible symbol, such as a
48
+ # "missing character" glyph (cf. 15.2, point 5). Note: Backslash escapes are always considered to be part of an identifier or a string (i.e., "\7B" is not punctuation,
49
+ # even though "{" is, and "\32" is allowed at the start of a class name, even though "2" is not). The identifier "te\st" is exactly the same identifier as "test".
50
+ def decode_char(input)
51
+
52
+ input.mark
53
+ first = input.next
54
+ if first.nil? or !first.eql?('\\')
55
+ input.reset
56
+ return nil
57
+ end
58
+ second = input.next
59
+ if second.nil?
60
+ input.reset
61
+ return nil
62
+ end
63
+ # rule execution
64
+ fallthrough = false
65
+ if second == "\r"
66
+ # special whitespace cases
67
+ if input.peek?("\n")
68
+ input.next
69
+ fallthrough = true
70
+ end
71
+ end
72
+ # Handle the skip ahead. Ruby's case statement doesn't allow fall-through, so the small setup is inlined.
73
+ return decode_char(input) if second == "\n" || second == "\f" || second == "\u0000" || fallthrough
74
+ # non hex test
75
+ return second if !input.hex?(second)
76
+ # check for 6 hex digits for rule 3
77
+ tmp = second
78
+ for i in 1..5 do
79
+ c = input.next
80
+ if c.nil? or c =~ /\s/
81
+ break
82
+ end
83
+ if input.hex?(c)
84
+ tmp << c
85
+ else
86
+ input.push(c)
87
+ end
88
+ end
89
+ # check the code point and, if it is outside the valid range, return the replacement character
90
+ begin
91
+ i = tmp.hex
92
+ return i.chr(Encoding::UTF_8) if i >= START_CODE_POINT and i <= END_CODE_POINT
93
+ return "\ufffd"
94
+ rescue Exception => e
95
+ raise EncodingError.new("Received an exception while parsing a string verified to be hex")
96
+ end
97
+ end
98
+ end
99
+ end
100
+ end
101
+ end
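Reviewer note on css_codec.rb: the long comment above `decode_char` quotes the CSS 2.1 escaping rules, and several of its examples can be verified directly. The following is a rough, self-contained sketch of those rules using a single regular expression instead of the gem's PushableString. `css_unescape` and `MAX_CODE_POINT` are hypothetical names. Mapping code point zero and surrogates to U+FFFD is an assumption of this sketch (CSS 2.1 leaves zero undefined), and a backslash followed by a newline is simply left untouched here.

```ruby
# Hypothetical helper, not part of the gem.
MAX_CODE_POINT = 0x10FFFF

def css_unescape(input)
  # Either: backslash + 1..6 hex digits + at most one optional whitespace
  # character (CR/LF counts as one), or: backslash + any character that is
  # not a hex digit or a newline.
  pattern = /\\(?:([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?|([^\r\n\f0-9a-fA-F]))/
  input.gsub(pattern) do
    hex_digits, literal = $1, $2
    next literal if literal          # "\s" is just "s"
    code = hex_digits.hex
    if code.zero? || code > MAX_CODE_POINT || (0xD800..0xDFFF).cover?(code)
      "\uFFFD"                       # replacement character (sketch assumption for 0)
    else
      [code].pack('U')               # code point to UTF-8 character
    end
  end
end

plain      = css_unescape('B\26 W\3F')
six_digits = css_unescape('\000026B')
identifier = css_unescape('te\st')
too_large  = css_unescape('\110000')
real_space = css_unescape('\26  B')
```

The last call shows why a "real" space after an escape must be doubled: only the first whitespace character is consumed as the terminator of the hex escape.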
data/lib/codec/encoder.rb ADDED
@@ -0,0 +1,330 @@
1
+ # The Encoder interface contains a number of methods for decoding input and encoding output
2
+ # so that it will be safe for a variety of interpreters. To prevent
3
+ # double-encoding, callers should make sure input does not already contain encoded characters
4
+ # by calling canonicalize. Validator implementations should call canonicalize on user input
5
+ # <b>before</b> validating to prevent encoded attacks.
6
+ # All of the methods must use a "whitelist" or "positive" security model.
7
+ # For the encoding methods, this means that all characters should be encoded, except for a specific list of
8
+ # "immune" characters that are known to be safe.
9
+ # The Encoder performs two key functions, encoding and decoding. These functions rely
10
+ # on a set of codecs that can be found in the Owasp::Esapi::Codec module. These include:
11
+ # * CSS Escaping
12
+ # * HTMLEntity Encoding
13
+ # * JavaScript Escaping
14
+ # * MySQL Escaping
15
+ # * Oracle Escaping
16
+ # * Percent Encoding (aka URL Encoding)
17
+ # * Unix Escaping
18
+ # * VBScript Escaping
19
+ # * Windows Encoding
20
+
21
+ require 'cgi'
22
+ require 'base64'
23
+ require 'codec/base_codec'
24
+ require 'codec/pushable_string'
25
+ require 'codec/base_codec'
26
+ require 'codec/css_codec'
27
+ require 'codec/html_codec'
28
+ require 'codec/percent_codec'
29
+ require 'codec/javascript_codec'
30
+ require 'codec/os_codec'
31
+ require 'codec/vbscript_codec'
32
+ require 'codec/oracle_codec'
33
+ require 'codec/mysql_codec'
34
+ require 'codec/xml_codec'
35
+
36
+ module Owasp
37
+ module Esapi
38
+ class Encoder
39
+ #
40
+ # == Immune Character fields
41
+ #
42
+ IMMUNE_CSS = [ ]
43
+ IMMUNE_HTMLATTR = [ ',', '.', '-', '_' ]
44
+ IMMUNE_HTML = [ ',', '.', '-', '_', ' ' ]
45
+ IMMUNE_JAVASCRIPT = [ ',', '.', '_' ]
46
+ IMMUNE_VBSCRIPT = [ ',', '.', '_' ]
47
+ IMMUNE_XML = [ ',', '.', '-', '_', ' ' ]
48
+ IMMUNE_SQL = [ ' ' ]
49
+ IMMUNE_OS = [ '-' ]
50
+ IMMUNE_XMLATTR = [ ',', '.', '-', '_' ]
51
+ IMMUNE_XPATH = [ ',', '.', '-', '_', ' ' ]
52
+ PASSWORD_SPECIALS = "!$*-.=?@_"
53
+ # == Standard Character Sets
54
+ CHAR_LCASE = "abcdefghijklmnopqrstuvwxyz"
55
+ CHAR_UCASE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
56
+ CHAR_DIGITS = "0123456789"
57
+ CHAR_SPECIALS = "!$*+-.=?@^_|~"
58
+ CHAR_LETTERS = "#{CHAR_LCASE}#{CHAR_UCASE}"
59
+ CHAR_ALPHANUMERIC = "#{CHAR_LETTERS}#{CHAR_DIGITS}"
60
+
61
+ # Create the encoder, optionally pass in a list of codecs to use
62
+ def initialize(configured_codecs = nil)
63
+ # codec list
64
+ @codecs = []
65
+ # default codecs
66
+ @html_codec = Owasp::Esapi::Codec::HtmlCodec.new
67
+ @percent_codec = Owasp::Esapi::Codec::PercentCodec.new
68
+ @js_codec = Owasp::Esapi::Codec::JavascriptCodec.new
69
+ @vb_codec = Owasp::Esapi::Codec::VbScriptCodec.new
70
+ @css_codec = Owasp::Esapi::Codec::CssCodec.new
71
+ @xml_codec = Owasp::Esapi::Codec::XmlCodec.new
72
+ unless configured_codecs.nil?
73
+ configured_codecs.each do |c|
74
+ @codecs << c
75
+ end
76
+ else
77
+ # setup some defaults codecs
78
+ @codecs << @html_codec
79
+ @codecs << @percent_codec
80
+ @codecs << @js_codec
81
+ end
82
+ end
83
+
84
+ # This method is equivalent to calling sanitize(input, strict), where strict is read from the ESAPI security configuration (Owasp::Esapi.security_config.ids?)
85
+ def canonicalize(input)
86
+ # if the input is nil, just return nil
87
+ return nil if input.nil?
88
+
89
+ # check the ESAPI config and figure out if we want strict encoding
90
+ sanitize(input,Owasp::Esapi.security_config.ids?)
91
+ end
92
+
93
+ # Sanitization is simply the operation of reducing a possibly encoded
94
+ # string down to its simplest form. This is important, because attackers
95
+ # frequently use encoding to change their input in a way that will bypass
96
+ # validation filters, but still be interpreted properly by the target of
97
+ # the attack. Note that data encoded more than once is not something that a
98
+ # normal user would generate and should be regarded as an attack.
99
+ # Everyone says[http://cwe.mitre.org/data/definitions/180.html] you shouldn't do validation
100
+ # without canonicalizing the data first. This is easier said than done. The canonicalize method can
101
+ # be used to simplify just about any input down to its most basic form. Note that sanitization doesn't
102
+ # handle Unicode issues, it focuses on higher level encoding and escaping schemes. In addition to simple
103
+ # decoding, sanitize also handles:
104
+ # * Perverse but legal variants of escaping schemes
105
+ # * Multiple escaping (%2526 or &#x26;lt;)
106
+ # * Mixed escaping (%26lt;)
107
+ # * Nested escaping (%%316 or &%6ct;)
108
+ # * All combinations of multiple, mixed, and nested encoding/escaping (%2&#x35;3c or &#x2526gt;)
109
+ #
110
+ # Although ESAPI is able to canonicalize multiple, mixed, or nested encoding, it's safer to not accept
111
+ # this stuff in the first place. In ESAPI, the default is "strict" mode that throws an IntrusionException
112
+ # if it receives anything not single-encoded with a single scheme. Even if you disable "strict" mode,
113
+ # you'll still get warning messages in the log about each multiple encoding and mixed encoding received.
114
+ #
115
+ def sanitize(input, strict)
116
+ # check input again, as someone may call sanitize directly
117
+ return nil if input.nil?
118
+ working = input
119
+ found_codec = nil
120
+ mixed_count = 1
121
+ found_count = 0
122
+ clean = false
123
+ while !clean
124
+ clean = true
125
+ @codecs.each do |codec|
126
+ old = working
127
+ working = codec.decode(working)
128
+ if !old.eql?(working)
129
+ if !found_codec.nil? and found_codec != codec
130
+ mixed_count += 1
131
+ end
132
+ found_codec = codec
133
+ if clean
134
+ found_count += 1
135
+ end
136
+ clean = false
137
+ end
138
+ end
139
+ end
140
+ # test for strict encoding, and indicate mixed and multiple errors
141
+ if found_count >= 2 and mixed_count > 1
142
+ if strict
143
+ raise Owasp::Esapi::IntrustionException.new("Input validation failure", "Multiple (#{found_count}x) and mixed encoding (#{mixed_count}x) detected in #{input}")
144
+ else
145
+ Owasp::Esapi.logger.warn("Multiple (#{found_count}x) and mixed encoding (#{mixed_count}x) detected in #{input}")
146
+ end
147
+ elsif found_count >= 2
148
+ if strict
149
+ raise Owasp::Esapi::IntrustionException.new("Input validation failure", "Multiple (#{found_count}x) detected in #{input}")
150
+ else
151
+ Owasp::Esapi.logger.warn("Multiple (#{found_count}x) detected in #{input}")
152
+ end
153
+ elsif mixed_count > 1
154
+ if strict
155
+ raise Owasp::Esapi::IntrustionException.new("Input validation failure", "Mixed encoding (#{mixed_count}x) detected in #{input}")
156
+ else
157
+ Owasp::Esapi.logger.warn("Mixed encoding (#{mixed_count}x) detected in #{input}")
158
+ end
159
+ end
160
+ working
161
+ end
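The counting logic in sanitize can be tried in isolation. The following is a minimal sketch, assuming two hypothetical, ASCII-only decoders in place of the gem's codec classes; `DECODERS` and `canonicalize_sketch` are illustrative names, not part of the gem.

```ruby
# Hypothetical stand-ins for the gem's codecs (ASCII-only, for illustration).
PERCENT_DECODER = ->(s) { s.gsub(/%([0-9A-Fa-f]{2})/) { $1.hex.chr } }
HTML_ENTITIES = { 'lt' => '<', 'gt' => '>', 'amp' => '&', 'quot' => '"' }.freeze
HTML_DECODER = lambda do |s|
  s.gsub(/&(?:#x([0-9A-Fa-f]+)|(lt|gt|amp|quot));/) do
    hex, name = $1, $2
    hex ? hex.hex.chr : HTML_ENTITIES[name]
  end
end
DECODERS = { percent: PERCENT_DECODER, html: HTML_DECODER }.freeze

# Decode repeatedly until a full pass changes nothing. `passes` counts the
# passes that changed the input (multiple encoding); `schemes` counts how many
# times the decoding scheme switched, starting at 1 (mixed encoding).
def canonicalize_sketch(input)
  working = input
  found_scheme = nil
  mixed_count = 1
  found_count = 0
  clean = false
  until clean
    clean = true
    DECODERS.each do |name, decoder|
      old = working
      working = decoder.call(working)
      next if old == working
      mixed_count += 1 if found_scheme && found_scheme != name
      found_scheme = name
      found_count += 1 if clean
      clean = false
    end
  end
  { value: working, passes: found_count, schemes: mixed_count }
end

p canonicalize_sketch('%2526')   # double percent-encoding
p canonicalize_sketch('%26lt;')  # percent-encoding wrapped around an HTML entity
```

In this sketch, a `passes` value of 2 or more corresponds to the "multiple" branch above, and a `schemes` value above 1 corresponds to the "mixed" branch.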
+
+ # Encode for Base64 using the URL-safe alphabet.
+ def encode_for_base64(input)
+   return nil if input.nil?
+   Base64.urlsafe_encode64(input)
+ end
+
+ # Decode data encoded with Base64 encoding.
+ # It assumes the URL-safe alphabet.
+ def decode_for_base64(input)
+   return nil if input.nil?
+   Base64.urlsafe_decode64(input)
+ end
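As a small illustration of what the URL-safe alphabet changes, the sketch below encodes two bytes that produce `+` and `/` in standard Base64. It uses only Ruby's `base64` library; the variable names are illustrative.

```ruby
require 'base64'

raw = [0xFB, 0xFF].pack('C*')            # bytes chosen to hit the last two alphabet slots

standard = Base64.strict_encode64(raw)   # standard alphabet: '+' and '/'
url_safe = Base64.urlsafe_encode64(raw)  # URL-safe alphabet: '-' and '_'
decoded  = Base64.urlsafe_decode64(url_safe)

puts standard
puts url_safe
puts decoded.bytes.inspect
```

The URL-safe form can be placed in a query string or path segment without further percent-encoding of `+` and `/`.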
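+
+ # Encode data for use in an LDAP query. Not yet implemented; currently returns nil.
+ def encode_for_ldap(input)
+ end
+
+ # Encode data for use in an LDAP distinguished name. Not yet implemented; currently returns nil.
+ def encode_for_dn(input)
+ end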
+
+ # Encode for use in a URL. This method performs URL encoding[http://en.wikipedia.org/wiki/Percent-encoding]
+ # on the entire string.
+ def encode_for_url(input)
+   return nil if input.nil?
+   CGI::escape(input)
+ end
+
+ # Decode from URL. First canonicalize and detect any double-encoding.
+ # If this check passes, then the data is decoded using URL decoding.
+ def decode_for_url(input)
+   return nil if input.nil?
+   # Pass the strict flag explicitly so that multiple or mixed encoding raises
+   clean = sanitize(input, true)
+   CGI::unescape(clean, Owasp::Esapi.security_config.encoding)
+ end
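To see why double-encoding is checked before URL decoding, here is a short sketch using only `CGI.escape` and `CGI.unescape` from the standard library. The sample strings are illustrative.

```ruby
require 'cgi'

encoded = CGI.escape('a b&c=d')      # space becomes '+', '&' and '=' are percent-encoded
decoded = CGI.unescape(encoded)      # back to the original text

# Encoding twice turns the '%' of the first pass into '%25'.
twice     = CGI.escape(CGI.escape('a&b'))
once_back = CGI.unescape(twice)      # still percent-encoded after one decoding pass

puts encoded
puts decoded
puts twice
puts once_back
```

A filter that decodes only once would see `once_back` as harmless text, while a later decoding step would turn it into `a&b`.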
+
+ # Encode data for use in Cascading Style Sheets (CSS) content.
+ # CSS Syntax[http://www.w3.org/TR/CSS21/syndata.html#escaped-characters] (w3.org)
+ def encode_for_css(input)
+   return nil if input.nil?
+   @css_codec.encode(IMMUNE_CSS,input)
+ end
+
+ # Encode data for insertion inside a data value or function argument in JavaScript. Including user data
+ # directly inside a script is quite dangerous. Great care must be taken to prevent including user data
+ # directly into script code itself, as no amount of encoding will prevent attacks there.
+ #
+ # Please note there are some JavaScript functions that can never safely receive untrusted data
+ # as input – even if the user input is encoded.
+ #
+ # For example:
+ #
+ #   <script>
+ #     window.setInterval('<%= EVEN IF YOU ENCODE UNTRUSTED DATA YOU ARE XSSED HERE %>');
+ #   </script>
+ #
+ def encode_for_javascript(input)
+   return nil if input.nil?
+   @js_codec.encode(IMMUNE_JAVASCRIPT,input)
+ end
+
+ # Encode data for use in HTML using HTML entity encoding.
+ #
+ # Note that the following characters:
+ # 00-08, 0B-0C, 0E-1F, and 7F-9F
+ # cannot be used in HTML.
+ #
+ # * HTML Encodings[http://en.wikipedia.org/wiki/Character_encodings_in_HTML] (wikipedia.org)
+ # * SGML Specification[http://www.w3.org/TR/html4/sgml/sgmldecl.html] (w3.org)
+ # * XML Specification[http://www.w3.org/TR/REC-xml/#charsets] (w3.org)
+ def encode_for_html(input)
+   return nil if input.nil?
+   @html_codec.encode(IMMUNE_HTML,input)
+ end
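The encode methods all pass an "immune" character list to a codec. The sketch below shows the general idea with a hypothetical immune set and hex character references only. The gem's HTML codec lives in html_codec.rb and may behave differently, for example by emitting named entities; `SKETCH_IMMUNE` and `hex_entity_encode` are illustrative names.

```ruby
# Hypothetical immune set; the gem's IMMUNE_HTML constant may differ.
SKETCH_IMMUNE = [',', '.', '-', '_', ' '].freeze

# Leave ASCII alphanumerics and immune characters alone; replace everything
# else with a hexadecimal character reference.
def hex_entity_encode(input, immune = SKETCH_IMMUNE)
  input.each_char.map do |ch|
    if ch.match?(/\A[A-Za-z0-9]\z/) || immune.include?(ch)
      ch
    else
      format('&#x%x;', ch.ord)
    end
  end.join
end

puts hex_entity_encode('<b>hi</b>')
puts hex_entity_encode('a, b.')
```

Encoding everything outside a small allow-list is more conservative than escaping a handful of known-dangerous characters, at the cost of longer output.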
+
+ # Decodes HTML entities.
+ def decode_for_html(input)
+   return nil if input.nil?
+   @html_codec.decode(input)
+ end
+ # Keep the original, misspelled name available for existing callers.
+ alias dencode_for_html decode_for_html
+
+ # Encode data for use in HTML attributes.
+ def encode_for_html_attr(input)
+   return nil if input.nil?
+   @html_codec.encode(IMMUNE_HTMLATTR,input)
+ end
+
+ # Encode for an operating system command shell according to the given OS codec.
+ #
+ # Please note the following recommendations before choosing to use this method:
+ #
+ # 1. It is strongly recommended that applications avoid making direct OS system calls if possible, as such calls are not portable and are potentially unsafe. Wherever possible, use language-provided features rather than native OS calls to implement the desired feature.
+ # 2. If an OS call cannot be avoided, invoke the program directly (e.g., Kernel.system("nameofcommand", "parameterstocommand")), as this avoids the use of the command shell. The "parameterstocommand" should of course be validated before they are passed to the OS command.
+ # 3. If you must use this method, we recommend also validating all user-supplied input passed to the command shell, in addition to using this method, in order to make the command shell invocation safe.
+ #
+ # An example use of this method would be: Kernel.system("dir", encode_for_os(WindowsCodec, "parameter(s)tocommandwithuserinput"))
+ def encode_for_os(codec,input)
+   return nil if input.nil?
+   codec.encode(IMMUNE_OS,input)
+ end
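Recommendation 2 above can be illustrated without any codec. This is a minimal sketch that runs the current Ruby interpreter as the child process so that it is portable. `Open3.capture2` is used to collect the output, and `Kernel.system` called with separate arguments bypasses the shell in the same way. The sample input is illustrative.

```ruby
require 'open3'
require 'rbconfig'

user_input = 'report.txt; echo injected'

# Each argument reaches the child process as-is. No shell parses the string,
# so the ';' is ordinary data rather than a command separator.
output, status = Open3.capture2(RbConfig.ruby, '-e', 'print ARGV[0]', user_input)

puts output
puts status.success?
```

Passing the same text inside a single command string would hand it to the shell, which is the case encode_for_os is meant to cover.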
+
+ # Encode data for insertion inside a data value in a Visual Basic script. Putting user data directly
+ # inside a script is quite dangerous. Great care must be taken to prevent putting user data
+ # directly into script code itself, as no amount of encoding will prevent attacks there.
+ #
+ # This method is not recommended, as VBScript is only supported by Internet Explorer.
+ def encode_for_vbscript(input)
+   return nil if input.nil?
+   @vb_codec.encode(IMMUNE_VBSCRIPT,input)
+ end
+
+ # Encode input for use in a SQL query, according to the selected codec
+ # (appropriate codecs include the MySQLCodec and OracleCodec).
+ #
+ # This method is not recommended. The use of prepared statements
+ # (parameterized queries) is the preferred approach. However, if for some
+ # reason this is impossible, then this method is provided as a weaker
+ # alternative.
+ #
+ # When encoding, the best approach is to make sure any single quotes are doubled (' becomes '').
+ def encode_for_sql(codec,input)
+   return nil if input.nil?
+   codec.encode(IMMUNE_SQL,input)
+ end
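The quote-doubling rule mentioned above can be shown in a few lines. This is only an illustration with a hypothetical helper, `double_single_quotes`. It is not the gem's MySQLCodec or OracleCodec, and it remains weaker than a parameterized query.

```ruby
# Hypothetical helper: double every single quote so it cannot close a string literal.
def double_single_quotes(input)
  input.gsub("'", "''")
end

name  = "O'Reilly"
query = "SELECT * FROM authors WHERE name = '#{double_single_quotes(name)}'"

puts query
```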
+
+ # Encode data for use in an XPath query.
+ #
+ # NB: The reference implementation encodes almost everything and may over-encode.
+ #
+ # The difficulty with XPath encoding is that XPath has no built-in mechanism for escaping
+ # characters. It is possible to use XQuery in a parameterized way to
+ # prevent injection.
+ #
+ # For more information, refer to this article[http://www.ibm.com/developerworks/xml/library/x-xpathinjection.html]
+ # which specifies the following list of characters as the most dangerous: ^&"*';<>().
+ #
+ # This paper[http://www.packetstormsecurity.org/papers/bypass/Blind_XPath_Injection_20040518.pdf] suggests disallowing ' and " in queries.
+ #
+ # * XPath Injection[http://www.ibm.com/developerworks/xml/library/x-xpathinjection.html] (ibm.com)
+ # * Blind XPath Injection[http://www.packetstormsecurity.org/papers/bypass/Blind_XPath_Injection_20040518.pdf] (packetstormsecurity.org)
+ def encode_for_xpath(input)
+   return nil if input.nil?
+   @xml_codec.encode(IMMUNE_XPATH,input)
+ end
+
+ # Encode data for use in an XML element. The implementation should follow the
+ # XML Encoding Standard[http://www.w3schools.com/xml/xml_encoding.asp] (w3schools.com).
+ #
+ # The use of a real XML parser is strongly encouraged. However, in the
+ # hopefully rare case that you need to make sure that data is safe for
+ # inclusion in an XML document and cannot use a parser, this method provides
+ # a safe mechanism to do so.
+ def encode_for_xml(input)
+   return nil if input.nil?
+   @xml_codec.encode(IMMUNE_XML,input)
+ end
+
+ # Encode data for use in an XML attribute. The implementation should follow
+ # the XML Encoding Standard[http://www.w3schools.com/xml/xml_encoding.asp] (w3schools.com).
+ #
+ # The use of a real XML parser is strongly encouraged. However, in the
+ # hopefully rare case that you need to make sure that data is safe for
+ # inclusion in an XML document and cannot use a parser, this method provides
+ # a safe mechanism to do so.
+ def encode_for_xml_attr(input)
+   return nil if input.nil?
+   @xml_codec.encode(IMMUNE_XMLATTR,input)
+ end
+
+ end
+ end
+ end