middle_squid 1.0
- checksums.yaml +7 -0
- data/.gitignore +13 -0
- data/.travis.yml +3 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +674 -0
- data/README.md +227 -0
- data/Rakefile +7 -0
- data/bin/middle_squid +7 -0
- data/lib/middle_squid/actions.rb +77 -0
- data/lib/middle_squid/adapter.rb +54 -0
- data/lib/middle_squid/adapters/squid.rb +57 -0
- data/lib/middle_squid/backends/keyboard.rb +31 -0
- data/lib/middle_squid/backends/thin.rb +14 -0
- data/lib/middle_squid/blacklist.rb +67 -0
- data/lib/middle_squid/builder.rb +159 -0
- data/lib/middle_squid/cli.rb +119 -0
- data/lib/middle_squid/core_ext/hash.rb +29 -0
- data/lib/middle_squid/database.rb +47 -0
- data/lib/middle_squid/exceptions.rb +4 -0
- data/lib/middle_squid/helpers.rb +74 -0
- data/lib/middle_squid/indexer.rb +194 -0
- data/lib/middle_squid/runner.rb +37 -0
- data/lib/middle_squid/server.rb +84 -0
- data/lib/middle_squid/uri.rb +31 -0
- data/lib/middle_squid/version.rb +3 -0
- data/lib/middle_squid.rb +46 -0
- data/middle_squid.gemspec +37 -0
- data/middle_squid_wrapper.sh +4 -0
- data/test/helper.rb +26 -0
- data/test/resources/backslash/cat/list +1 -0
- data/test/resources/black/ads/domains +2 -0
- data/test/resources/black/ads/urls +1 -0
- data/test/resources/black/tracker/domains +2 -0
- data/test/resources/black/tracker/urls +2 -0
- data/test/resources/copy_of_duplicates/cat/copy_of_list +2 -0
- data/test/resources/copy_of_duplicates/cat/list +2 -0
- data/test/resources/copy_of_duplicates/copy_of_cat/copy_of_list +2 -0
- data/test/resources/copy_of_duplicates/copy_of_cat/list +2 -0
- data/test/resources/duplicates/cat/copy_of_list +2 -0
- data/test/resources/duplicates/cat/list +2 -0
- data/test/resources/duplicates/copy_of_cat/copy_of_list +2 -0
- data/test/resources/duplicates/copy_of_cat/list +2 -0
- data/test/resources/empty/cat/emptylist +0 -0
- data/test/resources/empty_path/cat/list +1 -0
- data/test/resources/expressions/cat/list +3 -0
- data/test/resources/gray/isp/domains +2 -0
- data/test/resources/gray/isp/urls +1 -0
- data/test/resources/gray/news/domains +2 -0
- data/test/resources/hello.rb +2 -0
- data/test/resources/invalid_byte/cat/list +1 -0
- data/test/resources/mixed/cat/list +2 -0
- data/test/resources/subdirectory/cat/ignore/.gitkeep +0 -0
- data/test/resources/trailing_space/cat/list +2 -0
- data/test/test_actions.rb +76 -0
- data/test/test_adapter.rb +61 -0
- data/test/test_blacklist.rb +189 -0
- data/test/test_builder.rb +89 -0
- data/test/test_cli.rb +105 -0
- data/test/test_database.rb +20 -0
- data/test/test_hash.rb +28 -0
- data/test/test_helper.rb +76 -0
- data/test/test_indexer.rb +457 -0
- data/test/test_keyboard.rb +79 -0
- data/test/test_runner.rb +56 -0
- data/test/test_server.rb +86 -0
- data/test/test_squid.rb +110 -0
- data/test/test_thin.rb +7 -0
- data/test/test_uri.rb +69 -0
- metadata +363 -0
data/README.md
ADDED
@@ -0,0 +1,227 @@
# MiddleSquid

[![Gem Version](https://badge.fury.io/rb/middle_squid.svg)](http://badge.fury.io/rb/middle_squid)
[![Build Status](https://travis-ci.org/cfillion/middle_squid.svg?branch=master)](https://travis-ci.org/cfillion/middle_squid)
[![Dependency Status](https://gemnasium.com/cfillion/middle_squid.svg)](https://gemnasium.com/cfillion/middle_squid)
[![Code Climate](https://codeclimate.com/github/cfillion/middle_squid/badges/gpa.svg)](https://codeclimate.com/github/cfillion/middle_squid)
[![Coverage Status](https://img.shields.io/coveralls/cfillion/middle_squid.svg)](https://coveralls.io/r/cfillion/middle_squid?branch=master)

MiddleSquid is a redirector, URL mangler and webpage interceptor for the Squid HTTP proxy.

**Features**

- Configuration is done by writing a Ruby script
- Supports plain-text domain/URL blacklists
- Can intercept and modify any HTTP request or response
- Works with HTTPS
  if [SslBump](http://wiki.squid-cache.org/Features/SslBump) is enabled.
## Installation & Setup

These steps assume that [Squid](http://www.squid-cache.org/) is installed and running as user 'proxy'.
The instructions were written for [Arch Linux](https://www.archlinux.org/).
Some adaptation to your favorite operating system may be necessary, at your
discretion.

**Dependencies:**

- Squid version 3.4 or newer
- Ruby version 2.1 or newer

### Step 1: Set a home folder for user 'proxy'

```sh
sudo mkdir /home/proxy
sudo chown proxy:proxy /home/proxy
sudo usermod --home /home/proxy proxy
```
### Step 2: Install MiddleSquid

```sh
sudo su - proxy

gem install middle_squid
echo 'run lambda {|uri, extras| }' > middle_squid_config.rb

exit
```
### Step 3: Create a launcher script

Create the file `/usr/local/bin/middle_squid_wrapper.sh`:

```sh
#!/bin/sh

GEM_HOME=$(ruby -e 'puts Gem.user_dir')
exec "$GEM_HOME/bin/middle_squid" "$@"
```
### Step 4: Setup Squid

Add these lines to your `/etc/squid/squid.conf`:

```squidconf
url_rewrite_program /usr/bin/sh /usr/local/bin/middle_squid_wrapper.sh start -C /home/proxy/middle_squid_config.rb

# required to fix HTTPS sites (if SslBump is enabled)
acl fix_ssl_rewrite method GET
acl fix_ssl_rewrite method POST
url_rewrite_access allow fix_ssl_rewrite
url_rewrite_access deny all
```

Finish with `sudo squid -k reconfigure`. Check `/var/log/squid/cache.log` for errors.
## Configuration

MiddleSquid is configured by the Ruby script specified on the command line with the `-C` or `--config-file` argument.

The script must call the `run` method:

```ruby
run lambda {|uri, extras|
  # decide what to do with uri
}
```

The argument must be an object that responds to the `call` method and takes two arguments:
the URI to process and an array of extra data received from Squid
(see url_rewrite_extras in
[squid's documentation](http://www.squid-cache.org/Doc/config/url_rewrite_extras/)).
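
Any object with a two-argument `call` method should qualify, not only a lambda. The following is a minimal sketch that runs outside MiddleSquid: the `HostRecorder` class is hypothetical, and the standard library's `URI` stands in for the URI object MiddleSquid would pass to the handler.

```ruby
require 'uri'

# Hypothetical handler: a plain object exposing #call(uri, extras).
class HostRecorder
  attr_reader :hosts

  def initialize
    @hosts = []
  end

  # Receives the same two arguments a `run` handler is described as taking.
  def call(uri, extras)
    @hosts << uri.host
  end
end

handler = HostRecorder.new
handler.call URI.parse('http://example.com/page'), []
```

In a configuration file, such an object would presumably be passed as `run HostRecorder.new` in place of the lambda.
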

Write this to the file `/home/proxy/middle_squid_config.rb` created earlier:

```ruby
run lambda {|uri, extras|
  redirect_to 'http://duckduckgo.com' if uri.host.end_with? 'google.com'
}
```

Run `sudo squid -k reconfigure` again to restart all MiddleSquid processes.
You should now be redirected to http://duckduckgo.com each time you visit
Google through your Squid proxy.
### Black Lists

While it may be fun to redirect yourself to an alternate search engine,
MiddleSquid is more useful for blocking annoying advertisements and tracking
services that are constantly watching your whereabouts.

MiddleSquid can scan any black list collection that is distributed in plain-text format
and is compatible with SquidGuard or Dansguardian, such as:

- [Shalla's Blacklists](http://www.shallalist.de/) (free for personal use)
- [URLBlackList.com](http://www.urlblacklist.com/) (commercial)

Replace the previous configuration in `/home/proxy/middle_squid_config.rb`
with this one:

```ruby
database '/home/proxy/blacklist.db'

adv = blacklist 'adv'
tracker = blacklist 'tracker'

run lambda {|uri, extras|
  if adv.include? uri
    redirect_to 'http://your.webserver/block_pages/advertising.html'
  end

  if tracker.include? uri
    redirect_to 'http://your.webserver/block_pages/tracker.html'
  end
}
```
Next we have to download a blacklist and ask MiddleSquid to index its content
in the database for fast access:

```sh
sudo su - proxy

# Download Shalla's Blacklists
wget "http://www.shallalist.de/Downloads/shallalist.tar.gz" -O shallalist.tar.gz
tar xzf shallalist.tar.gz
mv BL ShallaBlackList

# Construct the blacklist database
/usr/local/bin/middle_squid_wrapper.sh index ShallaBlackList -C /home/proxy/middle_squid_config.rb

exit
```

The `index` command above may take a while to complete. Once it's done, re-run `sudo squid -k reconfigure` and
enjoy an internet without ads or tracking beacons.
### Content Interception

MiddleSquid can also intercept the client's requests and modify the data sent to the
browser. Let's translate a few click-bait headlines on BuzzFeed
(check out [Downworthy](http://downworthy.snipe.net/) while you are at it):

```ruby
CLICK_BAITS = {
  'Literally' => 'Figuratively',
  'Mind-Blowing' => 'Painfully Ordinary',
  'Will Blow Your Mind' => 'Might Perhaps Mildly Entertain You For a Moment',
  # ...
}.freeze

define_action :translate do |uri|
  intercept {|req, res|
    status, headers, body = download_like req, uri

    content_type = headers['Content-Type'].to_s

    if content_type.include? 'text/html'
      CLICK_BAITS.each {|before, after|
        body.gsub! before, after
      }
    end

    [status, headers, body]
  }
end

run lambda {|uri, extras|
  if uri.host == 'www.buzzfeed.com'
    translate uri
  end
}
```
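
The substitution step can be tried on a plain string, without Squid or MiddleSquid. This is only a sketch: `HEADLINE_FIXES` and `translate_body` are hypothetical names, and the table is a subset of the one in the example above.

```ruby
# Subset of the replacement table from the README example.
HEADLINE_FIXES = {
  'Literally' => 'Figuratively',
  'Will Blow Your Mind' => 'Might Perhaps Mildly Entertain You For a Moment',
}.freeze

# Hypothetical helper: returns a translated copy and leaves the input untouched.
def translate_body(body)
  HEADLINE_FIXES.reduce(body.dup) {|text, (before, after)|
    text.gsub before, after
  }
end

puts translate_body('<h1>This Cat Will Blow Your Mind</h1>')
```

A string pattern passed to `gsub` is matched literally and case-sensitively, so a lowercase "literally" would be left as-is by this table.
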
Don't use this feature unless you have permission from all your users to do so.
This constitutes a man-in-the-middle attack and should be used with
moderation.
## Documentation

MiddleSquid's documentation is hosted at
[http://rubydoc.info/gems/middle_squid/MiddleSquid](http://rubydoc.info/gems/middle_squid/MiddleSquid).

- [Configuration syntax (DSL)](http://rubydoc.info/gems/middle_squid/MiddleSquid/Builder)
- [List of predefined actions](http://rubydoc.info/gems/middle_squid/MiddleSquid/Actions)
- [List of predefined helpers](http://rubydoc.info/gems/middle_squid/MiddleSquid/Helpers)
- [Available adapters](http://rubydoc.info/gems/middle_squid/MiddleSquid/Adapters)

## Changelog

### v1.0 (2014-10-05)

First public release.

## Future Plans

- Find out why HTTPS is not always working under Squid without the ACL hack.
- Write new adapters for other proxies or software.

## Contributing

1. [Fork it](https://github.com/cfillion/middle_squid/fork)
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Test your changes (`rake`)
4. Commit your changes (`git commit -am 'Add some feature'`)
5. Push to the branch (`git push origin my-new-feature`)
6. Create a new Pull Request
data/Rakefile
ADDED
data/bin/middle_squid
ADDED
@@ -0,0 +1,77 @@
module MiddleSquid::Actions
  #
  # @!group Predefined Actions
  #

  # Allow the request to pass through. This is the default action.
  #
  # @example Whitelist a domain
  #   run lambda {|uri, extras|
  #     accept if uri.host == 'github.com'
  #   }
  def accept
    action :accept
  end

  # Redirect the browser to another URL.
  #
  # @example Redirect google.com to duckduckgo.com
  #   run lambda {|uri, extras|
  #     redirect_to "http://duckduckgo.com#{uri.request_uri}" if uri.host == 'google.com'
  #   }
  # @param url [String] the new url
  # @param status [Fixnum] HTTP status code (see http://tools.ietf.org/html/rfc7231#section-6.4)
  def redirect_to(url, status: 301)
    action :redirect, status: status, url: url
  end

  # Serve another page in place of the requested one.
  # Avoid in favor of {#redirect_to} when possible.
  #
  # @example Block Google advertisements
  #   run lambda {|uri, extras|
  #     replace_by 'http://webserver.lan/blocked.html' if uri.host == 'ads.google.com'
  #   }
  # @param url [String] the substitute url
  def replace_by(url)
    action :replace, url: url
  end

  # Hijack the request and generate a dynamic reply.
  # This can be used to skip landing pages,
  # change the behaviour of a website depending on the browser's headers or to
  # generate an entire virtual website using your favorite Rack framework.
  #
  # The block is called inside a fiber.
  # If the return value is a Rack triplet, it will be sent to the browser.
  #
  # @note
  #   With great power comes great responsibility.
  #   Please respect the privacy of your users.
  # @example Hello World
  #   run lambda {|uri, extras|
  #     intercept {|req, res|
  #       [200, {}, 'Hello World']
  #     }
  #   }
  # @yieldparam req [Rack::Request] the browser request
  # @yieldparam res [Thin::AsyncResponse] the response to send back
  # @yieldreturn Rack triplet or anything else
  # @see Helpers#download_like
  def intercept(&block)
    raise ArgumentError, 'no block given' unless block_given?

    token = server.token_for block

    replace_by "http://#{server.host}:#{server.port}/#{token}"
  end

  #
  # @!endgroup
  #

  private
  def action(name, options = {})
    throw :action, [name, options]
  end
end
@@ -0,0 +1,54 @@
module MiddleSquid
  # Base class for MiddleSquid's adapters.
  # Subclasses should call {#handle} when they have received and parsed a request.
  #
  # @abstract Subclass and override {#output} to implement a custom adapter.
  class Adapter
    # Returns whatever was passed to {Builder#run}.
    #
    # @return [#call]
    attr_accessor :handler

    # Returns a new instance of Adapter.
    # Use {Builder#use} instead.
    def initialize(options = {})
      @options = options
    end

    # Executes the user handler (see {#handler}) and calls +#output+.
    #
    # @param url <String> string representation of the url to be processed
    # @param extras <Array> extra data to pass to the user's handler
    def handle(url, extras = [])
      uri = MiddleSquid::URI.parse url
      raise InvalidURIError, "invalid URL received: '#{url}'" if !uri || !uri.host

      action, options = catch :action do
        @handler.call uri, extras
        throw :action, [:accept, {}]
      end

      output action, options
    end

    # Pass an action to the underlying software.
    #
    # accept::
    #   (no options)
    #
    # redirect::
    #   Options:
    #   - +status+ [+Fixnum+]
    #   - +url+ [+String+]
    #
    # replace::
    #   Options:
    #   - +url+ [+String+]
    #
    # @param action [Symbol]
    # @param options [Hash]
    def output(action, options)
      raise NotImplementedError
    end
  end
end
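
The predefined actions and `Adapter#handle` cooperate through Ruby's `catch`/`throw`: an action throws `:action` with its name and options, and the handler's caller catches it, defaulting to `:accept` when nothing was thrown. A stand-alone sketch of that control flow follows; `dispatch` and the two lambdas are hypothetical names used only for illustration.

```ruby
# Stand-alone sketch of the catch/throw flow; not MiddleSquid's own code.
def dispatch(handler, uri)
  catch :action do
    handler.call uri
    # Reached only when the handler did not throw: fall back to :accept.
    throw :action, [:accept, {}]
  end
end

redirect = lambda {|uri| throw :action, [:redirect, { status: 301, url: 'http://example.org' }] }
passive  = lambda {|uri| }

p dispatch(redirect, 'http://example.com')
p dispatch(passive, 'http://example.com')
```

Because `throw` unwinds the stack immediately, the first action reached in a handler wins and any code after it is skipped.
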
@@ -0,0 +1,57 @@
module MiddleSquid
  # Adapter for the {http://www.squid-cache.org Squid HTTP Proxy}.
  #
  # *Options:*
  #
  # concurrency::
  #   Whether to expect a channel ID from Squid.
  #
  #   Enable this option if the concurrency option is set to > 0 in Squid's
  #   {http://www.squid-cache.org/Doc/config/url_rewrite_children/ url_rewrite_children} directive.
  #
  # Extra data is configured in Squid with the {http://www.squid-cache.org/Doc/config/url_rewrite_extras/ url_rewrite_extras} directive.
  #
  # @see http://wiki.squid-cache.org/Features/Redirectors
  class Adapters::Squid < Adapter
    def start
      warn 'WARNING: STDOUT is a terminal. This command should be launched from squid.' if STDOUT.tty?

      EM.open_keyboard Backends::Keyboard, method(:input)
    end

    def input(line)
      parts = line.split

      @chan_id = @options[:concurrency] ? parts.shift : nil
      url, *extras = parts

      extras.map! {|str| URI.unescape str }

      handle url, extras
    end

    def output(action, options)
      case action
      when :accept
        reply 'ERR'
      when :redirect
        reply 'OK', status: options[:status], url: options[:url]
      when :replace
        reply 'OK', :'rewrite-url' => options[:url]
      else
        raise Error, "unsupported action: #{action}"
      end
    end

    private
    def reply(result, **kv_pairs)
      parts = []
      parts << @chan_id if @chan_id
      parts << result
      parts.concat kv_pairs.map {|k,v| "#{k}=#{URI.escape v.to_s}" }

      $stdout.puts parts.join("\x20")
      $stdout.flush
    end
  end
end
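
The reply sent back to Squid is a single line: an optional channel ID, the result code, then escaped `key=value` pairs joined by spaces. The sketch below reproduces only that formatting in isolation. `format_reply` and `ESCAPER` are hypothetical names, and because `URI.escape` was removed in Ruby 3.0, the sketch assumes `URI::RFC2396_Parser#escape` as a stand-in.

```ruby
require 'uri'

# URI.escape is gone in Ruby 3.0+; this parser provides the same escaping.
ESCAPER = URI::RFC2396_Parser.new

# Hypothetical stand-alone version of the reply formatting.
def format_reply(result, chan_id: nil, **kv_pairs)
  parts = []
  parts << chan_id if chan_id          # only present in concurrent mode
  parts << result
  parts.concat kv_pairs.map {|k, v| "#{k}=#{ESCAPER.escape v.to_s}" }
  parts.join ' '
end

puts format_reply('OK', chan_id: '0', status: 301, url: 'http://example.com/a b')
puts format_reply('ERR')
```

Escaping matters because a raw space inside a value would be read as a separator between two pairs.
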
@@ -0,0 +1,31 @@
module MiddleSquid::Backends
  # Receives data from the standard input.
  class Keyboard < EventMachine::Connection
    # @param handler [#call] called when a full line has been received
    def initialize(handler)
      @buffer = []
      @handler = handler
    end

    # @param char [String] single character
    def receive_data(char)
      case char
      when "\x00"
        EM.stop
      when "\n"
        line = @buffer.join
        @buffer.clear

        receive_line line
      else
        @buffer << char
      end
    end

    # @param line [String] full line without the trailing linebreak
    def receive_line(line)
      # EventMachine sends ASCII-8BIT strings, which prevents the database queries from matching
      @handler.call line.force_encoding(Encoding::UTF_8)
    end
  end
end
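
The buffering in `Keyboard#receive_data` can be exercised without EventMachine. The following is a minimal sketch: `LineBuffer` is a hypothetical stand-in that keeps only the newline handling and omits the `"\x00"` stop case and the encoding fix.

```ruby
# Hypothetical stand-in for the Keyboard backend, without EventMachine.
class LineBuffer
  def initialize(handler)
    @buffer = []
    @handler = handler
  end

  # Accepts one character at a time and emits a line on each "\n".
  def receive_data(char)
    if char == "\n"
      line = @buffer.join
      @buffer.clear
      @handler.call line
    else
      @buffer << char
    end
  end
end

lines = []
buffer = LineBuffer.new lines.method(:push)
"http://a.test/ x\nhttp://b.test/\n".each_char {|c| buffer.receive_data c }
p lines
```
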
@@ -0,0 +1,14 @@
module MiddleSquid::Backends
  # Exposes the signature of Thin's TCP socket.
  #
  # @example Extract the current host and port
  #   sockname = EM.get_sockname @thin.backend.signature
  #   @port, @host = Socket.unpack_sockaddr_in sockname
  class Thin < Thin::Backends::TcpServer
    attr_reader :signature

    def initialize(host, port, options)
      super host, port
    end
  end
end
@@ -0,0 +1,67 @@
module MiddleSquid
  # Used to query the blacklist database.
  # URIs can be matched by hostname (see {#include_domain?}), path (see {#include_url?}) or both (see {#include?}).
  #
  # Instances of this class should be created using {Builder#blacklist Builder#blacklist}
  # (otherwise they would not be seen by the "<code>middle_squid index</code>" command unless the "+--full+" flag is enabled).
  class BlackList
    include Database

    # @return [String] the category passed to {#initialize}
    attr_reader :category

    # @return [Array<String>] the aliases passed to {#initialize}
    attr_reader :aliases

    # Returns a new instance of BlackList. Use {Builder#blacklist Builder#blacklist} instead.
    # @param category [String]
    # @param aliases [Array<String>]
    def initialize(category, aliases: [])
      @category = category
      @aliases = aliases
    end

    # Whether the blacklist category contains the URI's hostname or an upper-level domain.
    #
    # Rules for the <code>www</code> subdomain match any subdomain.
    #
    # @example Rule: sub.domain.com
    #   Matches:
    #   - http://sub.domain.com/...
    #   - http://second.sub.domain.com/...
    #   - http://infinite.level.of.sub.domain.com/...
    # @param uri [URI] the URI to search
    def include_domain?(uri)
      !!db.get_first_value(
        "SELECT 1 FROM domains WHERE category = ? AND ? LIKE '%' || host LIMIT 1",
        [@category, uri.cleanhost]
      )
    end

    # Whether the blacklist category contains the URI. Matches by partial domain (like {#include_domain?}) and path. The query string is ignored.
    #
    # Rules for index files (index.html, Default.aspx and friends) match the whole directory.
    #
    # @example Rule: domain.com/path
    #   Matches:
    #   - http://domain.com/path
    #   - http://domain.com/dummy/../path
    #   - http://domain.com/path/extra_path
    #   - http://domain.com/path?query=string
    # @param uri [URI] the URI to search
    def include_url?(uri)
      !!db.get_first_value(
        "SELECT 1 FROM urls WHERE category = ? AND ? LIKE '%' || host AND ? LIKE path || '%' LIMIT 1",
        [@category, uri.cleanhost, uri.cleanpath]
      )
    end

    # Whether this blacklist contains the URI.
    # Matches by domain and/or path.
    #
    # @param uri [URI] the URI to search
    def include?(uri)
      include_domain?(uri) || include_url?(uri)
    end
  end
end
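
The condition `? LIKE '%' || host` asks whether the requested hostname ends with a stored host. A rough plain-Ruby equivalent is sketched below, assuming hosts are compared exactly as stored; any normalization performed by `cleanhost` or by the indexer is not visible here. `RULES` and `domain_listed?` are hypothetical names.

```ruby
# Hypothetical in-memory stand-in for the `domains` table.
RULES = ['sub.domain.com'].freeze

# Rough equivalent of: ? LIKE '%' || host
# (SQLite's LIKE is case-insensitive for ASCII letters by default.)
def domain_listed?(host)
  RULES.any? {|rule| host.downcase.end_with? rule.downcase }
end

p domain_listed?('second.sub.domain.com')
p domain_listed?('domain.com')
```

A plain suffix test like this one also accepts `notsub.domain.com`. Whether MiddleSquid guards against that depends on how hosts are normalized before they are stored and queried.
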