ghazel-httpbl 0.1.6.1

@@ -0,0 +1,5 @@
1
+ v0.1.6.1. Syntax error fixes from ghazel.
2
+
3
+ v0.1.6. Adding memcache-client option to enable per-session lookups instead of per-request. General refactoring.
4
+
5
+ v0.1.3. First public test release, not ready for production
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License
2
+
3
+ Copyright (c) 2009 Brandon Palmen
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,7 @@
1
+ README
2
+ lib/httpbl.rb
3
+ Rakefile
4
+ httpbl.gemspec
5
+ CHANGELOG
6
+ LICENSE
7
+ Manifest
data/README ADDED
@@ -0,0 +1,168 @@
1
+ HttpBL
2
+ ===========
3
+
4
+ HttpBL is drop-in IP-filtering middleware for Rails 2.3+ and other Rack-based
5
+ applications. It resolves information about each request's source IP address
6
+ from the Http:BL service at http://projecthoneypot.org, and denies access to
7
+ clients whose IP addresses are associated with suspicious behavior like impolite
8
+ crawling, comment-spamming, dictionary attacks, and email-harvesting.
9
+
10
+ * Deny access to IP addresses that are associated with suspicious
11
+ behavior which exceeds a customizable threshold.
12
+ * Expire blocked IPs that have not been associated with suspicious
13
+ behavior after a customizable period of days.
14
+ * Identify common search engines by IP address (not User-Agent), and
15
+ disallow access to a specific subset.
16
+ * Optionally use memcached to avoid repeated look-ups per client-session
17
+
18
+ Installation
19
+ ------------
20
+
21
+ gem install httpbl
22
+
23
+ Basic Usage
24
+ ------------
25
+
26
+ HttpBL is Rack middleware, and can be used with any Rack-based application. First,
27
+ you must obtain an API key for the Http:BL service at http://projecthoneypot.org
28
+
29
+ To add HttpBL to your middleware stack, simply add the following to config.ru:
30
+
31
+ require 'httpbl'
32
+
33
+ use HttpBL, :api_key => "YOUR API KEY"
34
+
35
+ For Rails 2.3+ add the following to environment.rb:
36
+
37
+ require 'httpbl'
38
+
39
+ config.middleware.use HttpBL, :api_key => "YOUR API KEY"
40
+
41
+ Advanced Usage
42
+ -------------
43
+
44
+ To insert HttpBL at the top of the Rails Rack stack:
45
+ (use 'rake middleware' to confirm that Rack::Lock is at the top of the stack)
46
+
47
+ config.middleware.insert_before(Rack::Lock, HttpBL, :api_key => "YOUR API KEY")
48
+
49
+ To customize HttpBL's filtering behavior, use the available options:
50
+
51
+ use HttpBL, :api_key => "YOUR API KEY",
52
+ :deny_types => [1, 2, 4],
53
+ :threat_level_threshold => 0,
54
+ :age_threshold => 5,
55
+ :blocked_search_engines => [0],
56
+ :memcached_server => "127.0.0.1:11211",
57
+ :memcached_options => {:namespace => "my_app"} # see the memcache-client documentation
58
+
59
+ Available Options:
60
+
61
+ The following options (shown with default values) are available to
62
+ customize the behavior of the httpbl middleware filter:
63
+
64
+ :deny_types => [1, 2, 4, 8, 16, 32, 64, 128]
65
+
66
+ Project Honeypot classifies suspicious behavior as belonging to
67
+ certain types, which are identified in the API's response to
68
+ each IP lookup. You can tell HttpBL to only deny certain kinds
69
+ of behavior by changing this to a subset of those possible.
70
+
71
+ As of March 2009, only types 1, 2, and 4 have been specified,
72
+ but additional types are reserved for the future and HttpBL checks
73
+ against all of the anticipated type codes by default. Thus,
74
+ there may be a very small performance advantage to setting
75
+ :deny_types => [1, 2, 4] simply to exclude checks for codes
76
+ that aren't (yet) being used; however, this will have to be
77
+ updated if more codes come into use, whereas the default
78
+ requires no further attention.
79
+
80
+ The current types are:
81
+ 1: Suspicious
82
+ 2: Harvester
83
+ 4: Comment Spammer
84
+
85
+ :threat_level_threshold => 2
86
+
87
+ The threat level reported by Project Honeypot is based on a
88
+ logarithmic scale, approximated by:
89
+ 1: 1 spam
90
+ 25: 100 spam
91
+ 50: 10,000 spam
92
+ 100: 1,000,000 spam.
93
+ in which spam is pronounced spam even in the plural.
94
+
95
+ Choosing a threat level threshold can be tricky business if
96
+ one isn't sure how accurate the measure of threat is, since it
97
+ would be improper to block legitimate traffic by mistake. Because
98
+ the email addresses that Project Honeypot uses as spam-bait are unique,
99
+ artificial, and well-hidden, NO email should be sent to those addresses
100
+ at all, and it is fair to assume that even the low threat level
101
+ associated with just a few spam is still significant.
102
+
103
+ With that in mind, the default threshold is 2; if you want to
104
+ filter more aggressively, set :threat_level_threshold => 0
105
+
106
+ :age_threshold => 10
107
+
108
+ This sets the number of days that IP addresses that have been
109
+ associated with suspicious activity must wait to regain access after
110
+ the suspicious activity has ceased. Keeping this at a sane value will
111
+ allow IPs that are reassigned or cleaned up to expire from the blacklist.
112
+
113
+ If you want to be more aggressive (require a longer cool-off-period),
114
+ set :age_threshold => 30; if you want to let IPs back in after just a
115
+ few days, set :age_threshold => 5
116
+
117
+ :blocked_search_engines => []
118
+
119
+ Because Project Honeypot identifies search engine traffic by IP
120
+ address, this filter may be used to exclude certain robots from your
121
+ site. If one presumes that request-IPs are at least marginally more
122
+ difficult to spoof than User-Agent strings, this filter may be marginally
123
+ more effective than some other robot detection systems.
124
+
125
+ If there are particular search engines that you would like to exclude
126
+ from your site, set :blocked_search_engines => [0, ... ] where the codes
127
+ defined by http://projecthoneypot.org/httpbl_api.php are:
128
+
129
+ 0: Undocumented
130
+ 1: AltaVista
131
+ 2: Ask
132
+ 3: Baidu
133
+ 4: Excite
134
+ 5: Google
135
+ 6: Looksmart
136
+ 7: Lycos
137
+ 8: MSN
138
+ 9: Yahoo
139
+ 10: Cuil
140
+ 11: InfoSeek
141
+ 12: Miscellaneous
142
+
143
+ :memcached_server => nil
144
+ :memcached_options => {}
145
+
146
+ When using httpbl in a production environment, it is *strongly* recommended
147
+ that you configure httpbl to use memcached to temporarily store the blacklist
148
+ status of client IP addresses. This greatly enhances the efficiency of the
149
+ filter because it need only look up each client IP address once per session,
150
+ instead of once per request. It also reduces the potential burden of a
151
+ popular web application that uses httpbl on Project Honeypot's API services.
152
+
153
+ Simply set :memcached_server and :memcached_options according to the
154
+ conventions of the memcache-client ruby library; for example:
155
+ :memcached_server => '127.0.0.1:11211', :memcached_options => {:namespace => 'my_app'}
156
+
157
+ memcache-client is included in rails by default, but if you're using rack
158
+ without rails, you will need to install and require the memcache-client gem.
159
+
160
+ :dns_timeout => 0.5
161
+
162
+ DNS requests to the Http:BL service shouldn't take this long, but if
163
+ they do, you can modify this setting to prevent the request from
164
+ hanging until a system default timeout. Of course, setting this timeout
165
+ too low will essentially disable the filter, because requests are permitted
166
+ whenever the API cannot respond in time (and 0 is a bad idea).
167
+ Best not to mess with it unless you know what you're doing - it's a safety
168
+ mechanism.
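The options above are compared against the four octets of the DNS answer returned by Http:BL. The sketch below is a hypothetical helper (`decode_httpbl` is not part of the gem) that names those octets, assuming the layout used by `blocked?` in lib/httpbl.rb: 127, days since last activity, threat level (or search engine code), and a type bitmask.

```ruby
# Hypothetical helper, not part of the gem: decodes an Http:BL answer such as
# "127.3.40.5" into named fields, following the octet layout that
# HttpBL#blocked? relies on.
def decode_httpbl(answer)
  marker, days, third, type = answer.split('.').map { |octet| octet.to_i }
  return nil unless marker == 127 # any other first octet is not a listing

  if type == 0
    # A type of 0 identifies a search engine; the third octet is its code.
    { :search_engine => third }
  else
    { :days_since_last_activity => days,
      :threat_level => third,
      # Only types 1, 2 and 4 are documented in the README.
      :types => [1, 2, 4].select { |bit| type & bit == bit } }
  end
end

p decode_httpbl("127.3.40.5")
```

With the default options, an answer like this one would be compared as threat level 40 against :threat_level_threshold and 3 days against :age_threshold.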
@@ -0,0 +1,7 @@
1
+ require 'echoe'
2
+ Echoe.new('ghazel-httpbl') do |p|
3
+ p.author = "Brandon Palmen"
4
+ p.summary = "A Rack middleware IP filter that uses Http:BL to exclude suspicious robots."
5
+ p.url = "http://github.com/bpalmen/httpbl"
6
+ p.runtime_dependencies = ["rack"]
7
+ end
@@ -0,0 +1,33 @@
1
+ # -*- encoding: utf-8 -*-
2
+
3
+ Gem::Specification.new do |s|
4
+ s.name = %q{ghazel-httpbl}
5
+ s.version = "0.1.6.1"
6
+
7
+ s.required_rubygems_version = Gem::Requirement.new(">= 1.2") if s.respond_to? :required_rubygems_version=
8
+ s.authors = ["Brandon Palmen"]
9
+ s.date = %q{2010-10-20}
10
+ s.description = %q{A Rack middleware IP filter that uses Http:BL to exclude suspicious robots.}
11
+ s.email = %q{}
12
+ s.extra_rdoc_files = ["README", "lib/httpbl.rb", "CHANGELOG", "LICENSE"]
13
+ s.files = ["README", "lib/httpbl.rb", "Rakefile", "httpbl.gemspec", "CHANGELOG", "LICENSE", "Manifest", "ghazel-httpbl.gemspec"]
14
+ s.homepage = %q{http://github.com/bpalmen/httpbl}
15
+ s.rdoc_options = ["--line-numbers", "--inline-source", "--title", "Ghazel-httpbl", "--main", "README"]
16
+ s.require_paths = ["lib"]
17
+ s.rubyforge_project = %q{ghazel-httpbl}
18
+ s.rubygems_version = %q{1.3.5}
19
+ s.summary = %q{A Rack middleware IP filter that uses Http:BL to exclude suspicious robots.}
20
+
21
+ if s.respond_to? :specification_version then
22
+ current_version = Gem::Specification::CURRENT_SPECIFICATION_VERSION
23
+ s.specification_version = 3
24
+
25
+ if Gem::Version.new(Gem::RubyGemsVersion) >= Gem::Version.new('1.2.0') then
26
+ s.add_runtime_dependency(%q<rack>, [">= 0"])
27
+ else
28
+ s.add_dependency(%q<rack>, [">= 0"])
29
+ end
30
+ else
31
+ s.add_dependency(%q<rack>, [">= 0"])
32
+ end
33
+ end
@@ -0,0 +1,22 @@
1
+ Gem::Specification.new do |s|
2
+ s.name = 'httpbl'
3
+ s.version = '0.1.6.1'
4
+ s.date = '2009-05-28'
5
+ s.homepage = "http://bpalmen.github.com/httpbl/"
6
+ s.authors = ["Brandon Palmen"]
7
+ s.email = "brandon.palmen@gmail.com"
8
+ s.rubyforge_project = 'httpbl'
9
+ s.summary = "HttpBL is a Rack middleware filter that blocks requests from suspicious IP addresses."
10
+ s.description = "HttpBL is a Rack middleware filter that blocks requests from suspicious IP addresses."
11
+
12
+ s.files = %w[
13
+ README
14
+ CHANGELOG
15
+ LICENSE
16
+ lib/httpbl.rb
17
+ ]
18
+
19
+ s.add_dependency 'rack', '>= 0.9.0'
20
+ s.extra_rdoc_files = %w[README]
21
+ s.require_paths = %w[lib]
22
+ end
@@ -0,0 +1,90 @@
1
+ # The Httpbl middleware
2
+
3
+ class HttpBL
4
+ autoload :Resolv, 'resolv'
5
+
6
+ def self.encourage_safe_timeouts
7
+ if /^1\.8/ =~ RUBY_VERSION
8
+ begin
9
+ require 'system_timer'
10
+ @@DnsTimeout = SystemTimer
11
+ rescue LoadError
12
+ require 'timeout'
13
+ @@DnsTimeout = Timeout
14
+ end
15
+ else
16
+ require 'timeout'
17
+ @@DnsTimeout = Timeout
18
+ end
19
+ end
20
+
21
+ encourage_safe_timeouts
22
+
23
+ def initialize(app, options = {})
24
+ @app = app
25
+ @options = {:blocked_search_engines => [],
26
+ :age_threshold => 10,
27
+ :threat_level_threshold => 2,
28
+ :deny_types => [1, 2, 4, 8, 16, 32, 64, 128], # 8..128 aren't used as of 10/2009, but might be used in the future
29
+ :dns_timeout => 0.5,
30
+ :memcached_server => nil,
31
+ :memcached_options => {}
32
+ }.merge(options)
33
+ raise "Missing :api_key for Http:BL middleware" unless @options[:api_key]
34
+ if @options[:memcached_server]
35
+ require 'memcache'
36
+ @cache = MemCache.new(@options[:memcached_server], @options[:memcached_options])
37
+ end
38
+ end
39
+
40
+ def call(env)
41
+ dup._call(env)
42
+ end
43
+
44
+ def _call(env)
45
+ request = Rack::Request.new(env)
46
+ bl_status = check(request.ip)
47
+ if bl_status and blocked?(bl_status)
48
+ [403, {"Content-Type" => "text/html"}, ["<h1>403 Forbidden</h1> Request IP is listed as suspicious by <a href='http://projecthoneypot.org/ip_#{request.ip}'>Project Honeypot</a>"]]
49
+ else
50
+ @app.call(env)
51
+ end
52
+
53
+ end
54
+
55
+ def check(ip)
56
+ @cache ? cache_check(ip) : resolve(ip)
57
+ end
58
+
59
+ def cache_check(ip)
60
+ cache = @cache.clone if @cache
61
+ unless response = cache.get("httpbl_#{ip}")
62
+ response = resolve(ip)
63
+ cache.set("httpbl_#{ip}", (response || "0.0.0.0"), 3600) # expire after one hour; 1.hour would require ActiveSupport
64
+ end
65
+ return response
66
+ end
67
+
68
+ def resolve(ip)
69
+ query = @options[:api_key] + '.' + ip.split('.').reverse.join('.') + '.dnsbl.httpbl.org'
70
+ @@DnsTimeout::timeout(@options[:dns_timeout]) do
71
+ Resolv::DNS.new.getaddress(query).to_s rescue false
72
+ end
73
+ rescue Timeout::Error, Errno::ECONNREFUSED
74
+ end
75
+
76
+ def blocked?(response)
77
+ response = response.split('.').collect!(&:to_i)
78
+ if response[0] == 127
79
+ if response[3] == 0
80
+ blocked = @options[:blocked_search_engines].include?(response[2])
81
+ else
82
+ blocked = @options[:deny_types].any? { |key| response[3] & key == key } && response[2] > @options[:threat_level_threshold] && response[1] < @options[:age_threshold]
83
+ end
84
+ end
85
+ return blocked
86
+ end
87
+
88
+ private
89
+
90
+ end
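As a minimal sketch of what `resolve` sends to the DNS server, the hypothetical helper below (`httpbl_query` is an illustrative name, and the API key shown is a placeholder) builds the same query string: the API key, the client IP with its octets reversed, and the dnsbl.httpbl.org suffix.

```ruby
# Hypothetical helper mirroring the query string built in HttpBL#resolve.
# The API key used below is a placeholder, not a real key.
def httpbl_query(api_key, ip)
  # Reverse the IPv4 octets, then prefix the key and append the DNSBL zone.
  [api_key, *ip.split('.').reverse, 'dnsbl.httpbl.org'].join('.')
end

puts httpbl_query("abcdefghijkl", "10.20.30.40")
# prints abcdefghijkl.40.30.20.10.dnsbl.httpbl.org
```

This only covers IPv4 addresses in dotted form, which is also what the split on '.' in the middleware assumes.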
metadata ADDED
@@ -0,0 +1,79 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: ghazel-httpbl
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.1.6.1
5
+ platform: ruby
6
+ authors:
7
+ - Brandon Palmen
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+
12
+ date: 2010-10-20 00:00:00 -07:00
13
+ default_executable:
14
+ dependencies:
15
+ - !ruby/object:Gem::Dependency
16
+ name: rack
17
+ type: :runtime
18
+ version_requirement:
19
+ version_requirements: !ruby/object:Gem::Requirement
20
+ requirements:
21
+ - - ">="
22
+ - !ruby/object:Gem::Version
23
+ version: "0"
24
+ version:
25
+ description: A Rack middleware IP filter that uses Http:BL to exclude suspicious robots.
26
+ email: ""
27
+ executables: []
28
+
29
+ extensions: []
30
+
31
+ extra_rdoc_files:
32
+ - README
33
+ - lib/httpbl.rb
34
+ - CHANGELOG
35
+ - LICENSE
36
+ files:
37
+ - README
38
+ - lib/httpbl.rb
39
+ - Rakefile
40
+ - httpbl.gemspec
41
+ - CHANGELOG
42
+ - LICENSE
43
+ - Manifest
44
+ - ghazel-httpbl.gemspec
45
+ has_rdoc: true
46
+ homepage: http://github.com/bpalmen/httpbl
47
+ licenses: []
48
+
49
+ post_install_message:
50
+ rdoc_options:
51
+ - --line-numbers
52
+ - --inline-source
53
+ - --title
54
+ - Ghazel-httpbl
55
+ - --main
56
+ - README
57
+ require_paths:
58
+ - lib
59
+ required_ruby_version: !ruby/object:Gem::Requirement
60
+ requirements:
61
+ - - ">="
62
+ - !ruby/object:Gem::Version
63
+ version: "0"
64
+ version:
65
+ required_rubygems_version: !ruby/object:Gem::Requirement
66
+ requirements:
67
+ - - ">="
68
+ - !ruby/object:Gem::Version
69
+ version: "1.2"
70
+ version:
71
+ requirements: []
72
+
73
+ rubyforge_project: ghazel-httpbl
74
+ rubygems_version: 1.3.5
75
+ signing_key:
76
+ specification_version: 3
77
+ summary: A Rack middleware IP filter that uses Http:BL to exclude suspicious robots.
78
+ test_files: []
79
+