httpbl 0.1.3

Files changed (8)
  1. data/CHANGELOG +1 -0
  2. data/LICENSE +21 -0
  3. data/Manifest +7 -0
  4. data/README +147 -0
  5. data/Rakefile +7 -0
  6. data/httpbl.gemspec +34 -0
  7. data/lib/httpbl.rb +59 -0
  8. metadata +76 -0
data/CHANGELOG ADDED
@@ -0,0 +1 @@
1
+ v0.1.3. First public test release, not ready for production
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License
2
+
3
+ Copyright (c) 2009 Brandon Palmen
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/Manifest ADDED
@@ -0,0 +1,7 @@
1
+ README
2
+ lib/httpbl.rb
3
+ Rakefile
4
+ httpbl.gemspec
5
+ CHANGELOG
6
+ LICENSE
7
+ Manifest
data/README ADDED
@@ -0,0 +1,147 @@
1
+ HttpBL
2
+ ===========
3
+
4
+ HttpBL is drop-in IP-filtering middleware for Rails 2.3+ and other Rack-based
5
+ applications. It resolves information about each request's source IP address
6
+ from the Http:BL service at http://projecthoneypot.org, and denies access to
7
+ clients whose IP addresses are associated with suspicious behavior like impolite
8
+ crawling, comment-spamming, dictionary attacks, and email-harvesting.
9
+
10
+ * Deny access to IP addresses that are associated with suspicious
11
+ behavior which exceeds a customizable threshold.
12
+ * Expire blocked IPs that have not been associated with suspicious
13
+ behavior after a customizable period of days.
14
+ * Identify common search engines by IP address (not User-Agent), and
15
+ disallow access to a specific subset.
16
+
17
+ Installation
18
+ ------------
19
+
20
+ gem install bpalmen-httpbl
21
+
22
+ Basic Usage
23
+ ------------
24
+
25
+ HttpBL is Rack middleware, and can be used with any Rack-based application. First,
26
+ you must obtain an API key for the Http:BL service at http://projecthoneypot.org
27
+
28
+ To add HttpBL to your middleware stack, simply add the following to config.ru:
29
+
30
+ require 'httpbl'
31
+
32
+ use HttpBL, :api_key => "YOUR API KEY"
33
+
34
+ For Rails 2.3+ add the following to environment.rb:
35
+
36
+ require 'httpbl'
37
+
38
+ config.middleware.use HttpBL, :api_key => "YOUR API KEY"
39
+
40
+ Advanced Usage
41
+ -------------
42
+
43
+ To insert HttpBL at the top of the Rails Rack stack:
44
+ (use 'rake middleware' to confirm that Rack::Lock is at the top of the stack)
45
+
46
+ config.middleware.insert_before(Rack::Lock, HttpBL, :api_key => "YOUR API KEY")
47
+
48
+ To customize HttpBL's filtering behavior, use the available options:
49
+
50
+ use HttpBL, :api_key => "YOUR API KEY",
51
+ :deny_types => [1, 2, 4],
52
+ :threat_level_threshold => 0,
53
+ :age_threshold => 5,
54
+ :blocked_search_engines => [0]
55
+
56
+ Available Options:
57
+
58
+ The following options (shown with default values) are available to
59
+ customize the particular types of suspicious activity you wish to thwart:
60
+
61
+ :deny_types => [1, 2, 4, 8, 16, 32, 64, 128]
62
+
63
+ Project Honeypot classifies suspicious behavior as belonging to
64
+ certain types, which are identified in the API's response to
65
+ each IP lookup. You can tell HttpBL to only deny certain kinds
66
+ of behavior by changing this to a subset of those possible.
67
+
68
+ As of March 2009, only types 1, 2, and 4 have been specified,
69
+ but additional types are reserved for the future and HttpBL checks
70
+ against all of the anticipated type codes by default. Thus,
71
+ there may be a very small performance advantage to setting
72
+ :deny_types => [1, 2, 4] simply to exclude checks for codes
73
+ that aren't (yet) being used; however, this will have to be
74
+ updated if more codes come into use, whereas the default
75
+ requires no further attention.
76
+
77
+ The current types are:
78
+ 1: Suspicious
79
+ 2: Harvester
80
+ 4: Comment Spammer
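
The type value is a bitmask, so a single response can carry several types at once. A minimal sketch of the check, assuming a hypothetical `matched_types` helper (not part of the gem) and the default `:deny_types` shown above:

```ruby
# Hypothetical helper for illustration; not part of the httpbl gem.
# Names for the type flags that were documented as of March 2009.
TYPE_NAMES = { 1 => "Suspicious", 2 => "Harvester", 4 => "Comment Spammer" }.freeze

# Returns the flags from deny_types that are set in the type octet.
def matched_types(type_octet, deny_types = [1, 2, 4, 8, 16, 32, 64, 128])
  deny_types.select { |flag| type_octet & flag == flag }
end

# A type octet of 5 has bits 1 and 4 set.
flags = matched_types(5)
puts flags.map { |f| TYPE_NAMES.fetch(f, "Reserved (#{f})") }.join(", ")
```

Passing a shorter list such as `[1, 2, 4]` only narrows which bits are tested; the comparison itself stays the same.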
81
+
82
+ :threat_level_threshold => 2
83
+
84
+ The threat level reported by Project Honeypot is based on a
85
+ logarithmic scale, approximated by:
86
+ 1: 1 spam
87
+ 25: 100 spam
88
+ 50: 10,000 spam
89
+ 100: 1,000,000 spam.
90
+ in which spam is pronounced spam even in the plural.
91
+
92
+ Choosing a threat level threshold can be tricky business if
93
+ one isn't sure how accurate the measure of threat is, since it
94
+ would be improper to block legitimate traffic by mistake. Because
95
+ the email addresses that Project Honeypot uses as spam-bait are unique,
96
+ artificial, and well-hidden, NO email should be sent to those addresses
97
+ at all, and it is fair to assume that even the low threat level
98
+ associated with just a few spam is still significant.
99
+
100
+ With that in mind, the default threshold is 2; if you want to
101
+ filter more aggressively, set :threat_level_threshold => 0
102
+
103
+ :age_threshold => 10
104
+
105
+ This sets the number of days that IP addresses that have been
106
+ associated with suspicious activity must wait to regain access after
107
+ the suspicious activity has ceased. Keeping this at a sane value will
108
+ allow IPs that are reassigned or cleaned up to expire from the blacklist.
109
+
110
+ If you want to be more aggressive (require a longer cool-off-period),
111
+ set :age_threshold => 30; if you want to let IPs back in after just a
112
+ few days, set :age_threshold => 5
113
+
114
+ :blocked_search_engines => []
115
+
116
+ Because Project Honeypot identifies search engine traffic by IP
117
+ address, this filter may be used to exclude certain robots from your
118
+ site. If one presumes that request-IPs are at least marginally more
119
+ difficult to spoof than User-Agent strings, this filter may be marginally
120
+ more effective than some other robot detection systems.
121
+
122
+ If there are particular search engines that you would like to exclude
123
+ from your site, set :blocked_search_engines => [0, ... ] where the codes
124
+ defined by http://projecthoneypot.org/httpbl_api are:
125
+
126
+ 0: Misc
127
+ 1: AltaVista
128
+ 2: Ask
129
+ 3: Baidu
130
+ 4: Excite
131
+ 5: Google
132
+ 6: Looksmart
133
+ 7: Lycos
134
+ 8: MSN
135
+ 9: Yahoo
136
+ 10: Cuil
137
+ 11: InfoSeek
138
+
139
+ :dns_timeout => 0.5
140
+
141
+ DNS requests to the Http:BL service should NEVER take this long, but if
142
+ they do, you can modify this setting to prevent the application from
143
+ hanging until a system default timeout. Of course, setting this timeout
144
+ too low will essentially disable the filter, because a request is permitted
145
+ by default when the API does not respond in time. Do not set it to 0.
146
+ Best not to mess with it unless you know what you're doing - it's a safety
147
+ mechanism.
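
The fail-open behavior described above can be sketched with Ruby's standard `Timeout` module. The `lookup_with_timeout` helper and the blocks standing in for the DNS query are assumptions made for illustration:

```ruby
require 'timeout'

# Hypothetical helper; the block stands in for the real DNS query.
# Returns the block's value, or nil if it does not finish in time.
def lookup_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  nil
end

fast = lookup_with_timeout(0.5) { "127.1.50.4" }
slow = lookup_with_timeout(0.05) { sleep 1; "127.1.50.4" }
puts fast.inspect
puts slow.inspect
```

A nil result means the request is let through, which is why a very small timeout effectively turns the filter off. Ruby's `Timeout` treats a value of 0 as "no timeout", so 0 removes the safety mechanism rather than tightening it.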
data/Rakefile ADDED
@@ -0,0 +1,7 @@
1
+ require 'echoe'
2
+ Echoe.new('httpbl') do |p|
3
+ p.author = "Brandon Palmen"
4
+ p.summary = "A Rack middleware IP filter that uses Http:BL to exclude suspicious robots."
5
+ p.url = "http://github.com/bpalmen/httpbl"
6
+ p.runtime_dependencies = ["rack"]
7
+ end
data/httpbl.gemspec ADDED
@@ -0,0 +1,34 @@
1
+ # -*- encoding: utf-8 -*-
2
+
3
+ Gem::Specification.new do |s|
4
+ s.name = %q{httpbl}
5
+ s.version = "0.1.3"
6
+
7
+ s.required_rubygems_version = Gem::Requirement.new(">= 1.2") if s.respond_to? :required_rubygems_version=
8
+ s.authors = ["Brandon Palmen"]
9
+ s.date = %q{2009-03-22}
10
+ s.description = %q{A Rack middleware IP filter that uses Http:BL to exclude suspicious robots.}
11
+ s.email = %q{}
12
+ s.extra_rdoc_files = ["README", "lib/httpbl.rb", "CHANGELOG", "LICENSE"]
13
+ s.files = ["README", "lib/httpbl.rb", "Rakefile", "httpbl.gemspec", "CHANGELOG", "LICENSE", "Manifest"]
14
+ s.has_rdoc = true
15
+ s.homepage = %q{http://github.com/bpalmen/httpbl}
16
+ s.rdoc_options = ["--line-numbers", "--inline-source", "--title", "Httpbl", "--main", "README"]
17
+ s.require_paths = ["lib"]
18
+ s.rubyforge_project = %q{httpbl}
19
+ s.rubygems_version = %q{1.3.1}
20
+ s.summary = %q{A Rack middleware IP filter that uses Http:BL to exclude suspicious robots.}
21
+
22
+ if s.respond_to? :specification_version then
23
+ current_version = Gem::Specification::CURRENT_SPECIFICATION_VERSION
24
+ s.specification_version = 2
25
+
26
+ if Gem::Version.new(Gem::RubyGemsVersion) >= Gem::Version.new('1.2.0') then
27
+ s.add_runtime_dependency(%q<rack>, [">= 0"])
28
+ else
29
+ s.add_dependency(%q<rack>, [">= 0"])
30
+ end
31
+ else
32
+ s.add_dependency(%q<rack>, [">= 0"])
33
+ end
34
+ end
data/lib/httpbl.rb ADDED
@@ -0,0 +1,59 @@
1
+ # The Httpbl middleware
2
+
3
+ class HttpBL
4
+ autoload :Resolv, 'resolv'
5
+
6
+ def initialize(app, options = {})
7
+ @app = app
8
+ @options = {:blocked_search_engines => [],
9
+ :age_threshold => 10,
10
+ :threat_level_threshold => 2,
11
+ # 8..128 aren't used as of 3/2009, but might be used in the future
12
+ :deny_types => [1, 2, 4, 8, 16, 32, 64, 128],
13
+ # DONT set this to 0
14
+ :dns_timeout => 0.5
15
+ }.merge(options)
16
+ raise "Missing :api_key for Http:BL middleware" unless @options[:api_key]
17
+ end
18
+
19
+ def call(env)
20
+ dup._call(env)
21
+ end
22
+
23
+ def _call(env)
24
+ request = Rack::Request.new(env)
25
+ bl_status = resolve(request.ip)
26
+ if bl_status and blocked?(bl_status)
27
+ [403, {"Content-Type" => "text/html"}, ["<h1>403 Forbidden</h1> Request IP is listed as suspicious by <a href='http://projecthoneypot.org/ip_#{request.ip}'>Project Honeypot</a>"]]
28
+ else
29
+ @app.call(env)
30
+ end
31
+
32
+ end
33
+
34
+ def resolve(ip)
35
+ query = @options[:api_key] + '.' + ip.split('.').reverse.join('.') + '.dnsbl.httpbl.org'
36
+ Timeout::timeout(@options[:dns_timeout]) do
37
+ Resolv::DNS.new.getaddress(query).to_s rescue nil
38
+ end
39
+ rescue Timeout::Error, Errno::ECONNREFUSED
40
+ end
41
+
42
+ def blocked?(response)
43
+ response = response.split('.').collect!(&:to_i)
44
+ if response[0] == 127
45
+ if response[3] == 0
46
+ @blocked = true if @options[:blocked_search_engines].include? response[2]
47
+ else
48
+ @age = true if response[1] < @options[:age_threshold]
49
+ @threat = true if response[2] > @options[:threat_level_threshold]
50
+ @options[:deny_types].each do |key|
51
+ @deny = true if response[3] & key == key
52
+ end
53
+ @blocked = true if @deny and @threat and @age
54
+ end
55
+ end
56
+ return @blocked
57
+ end
58
+
59
+ end
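
The lookup name and the blocking decision above can be restated as two side-effect-free functions, which may be easier to test in isolation. This is a sketch rather than the gem's API: `httpbl_query`, `deny?`, and the placeholder API key are hypothetical, and the defaults mirror the options hash in `initialize`.

```ruby
# Hypothetical restatement of resolve/blocked? for illustration only.
DEFAULTS = {
  :blocked_search_engines => [],
  :age_threshold          => 10,
  :threat_level_threshold => 2,
  :deny_types             => [1, 2, 4, 8, 16, 32, 64, 128]
}.freeze

# Builds the DNS name: API key, reversed IPv4 octets, then the Http:BL zone.
def httpbl_query(api_key, ip)
  "#{api_key}.#{ip.split('.').reverse.join('.')}.dnsbl.httpbl.org"
end

# response is a dotted quad of the form "127.<age>.<threat>.<type>".
def deny?(response, options = {})
  opts = DEFAULTS.merge(options)
  first, age, threat, type = response.split('.').map { |octet| octet.to_i }
  return false unless first == 127
  # A type of 0 marks a search engine; the third octet is then its code.
  return opts[:blocked_search_engines].include?(threat) if type == 0

  recent    = age < opts[:age_threshold]
  dangerous = threat > opts[:threat_level_threshold]
  listed    = opts[:deny_types].any? { |flag| type & flag == flag }
  recent && dangerous && listed
end

puts httpbl_query("abcdefghijkl", "10.1.2.3")  # placeholder key, not a real one
puts deny?("127.3.40.4")
```

All three conditions (recent activity, threat above the threshold, and a denied type) must hold before a non-search-engine address is refused, which matches the `@deny and @threat and @age` test in `blocked?`.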
metadata ADDED
@@ -0,0 +1,76 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: httpbl
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.3
5
+ platform: ruby
6
+ authors:
7
+ - Brandon Palmen
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+
12
+ date: 2009-03-22 00:00:00 -04:00
13
+ default_executable:
14
+ dependencies:
15
+ - !ruby/object:Gem::Dependency
16
+ name: rack
17
+ type: :runtime
18
+ version_requirement:
19
+ version_requirements: !ruby/object:Gem::Requirement
20
+ requirements:
21
+ - - ">="
22
+ - !ruby/object:Gem::Version
23
+ version: "0"
24
+ version:
25
+ description: A Rack middleware IP filter that uses Http:BL to exclude suspicious robots.
26
+ email: ""
27
+ executables: []
28
+
29
+ extensions: []
30
+
31
+ extra_rdoc_files:
32
+ - README
33
+ - lib/httpbl.rb
34
+ - CHANGELOG
35
+ - LICENSE
36
+ files:
37
+ - README
38
+ - lib/httpbl.rb
39
+ - Rakefile
40
+ - httpbl.gemspec
41
+ - CHANGELOG
42
+ - LICENSE
43
+ - Manifest
44
+ has_rdoc: true
45
+ homepage: http://github.com/bpalmen/httpbl
46
+ post_install_message:
47
+ rdoc_options:
48
+ - --line-numbers
49
+ - --inline-source
50
+ - --title
51
+ - Httpbl
52
+ - --main
53
+ - README
54
+ require_paths:
55
+ - lib
56
+ required_ruby_version: !ruby/object:Gem::Requirement
57
+ requirements:
58
+ - - ">="
59
+ - !ruby/object:Gem::Version
60
+ version: "0"
61
+ version:
62
+ required_rubygems_version: !ruby/object:Gem::Requirement
63
+ requirements:
64
+ - - ">="
65
+ - !ruby/object:Gem::Version
66
+ version: "1.2"
67
+ version:
68
+ requirements: []
69
+
70
+ rubyforge_project: httpbl
71
+ rubygems_version: 1.3.1
72
+ signing_key:
73
+ specification_version: 2
74
+ summary: A Rack middleware IP filter that uses Http:BL to exclude suspicious robots.
75
+ test_files: []
76
+