proxy_fetcher 0.2.5 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +88 -39
- data/bin/proxy_fetcher +57 -0
- data/lib/proxy_fetcher.rb +7 -1
- data/lib/proxy_fetcher/configuration.rb +18 -6
- data/lib/proxy_fetcher/manager.rb +8 -5
- data/lib/proxy_fetcher/providers/base.rb +39 -12
- data/lib/proxy_fetcher/providers/free_proxy_list.rb +14 -29
- data/lib/proxy_fetcher/providers/free_proxy_list_ssl.rb +11 -21
- data/lib/proxy_fetcher/providers/hide_my_name.rb +29 -31
- data/lib/proxy_fetcher/providers/proxy_docker.rb +12 -21
- data/lib/proxy_fetcher/providers/proxy_list.rb +35 -0
- data/lib/proxy_fetcher/providers/xroxy.rb +12 -25
- data/lib/proxy_fetcher/proxy.rb +2 -19
- data/lib/proxy_fetcher/utils/html.rb +15 -0
- data/lib/proxy_fetcher/utils/http_client.rb +46 -0
- data/lib/proxy_fetcher/version.rb +2 -2
- data/proxy_fetcher.gemspec +4 -2
- data/spec/proxy_fetcher/configuration_spec.rb +48 -0
- data/spec/proxy_fetcher/providers/base_spec.rb +28 -0
- data/spec/proxy_fetcher/providers/proxy_list_spec.rb +9 -0
- data/spec/proxy_fetcher/proxy_spec.rb +8 -8
- metadata +12 -5
- data/lib/proxy_fetcher/utils/http_fetcher.rb +0 -24
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 2082cc216a388f9014cbdec5daa8a54ac1d93016
|
4
|
+
data.tar.gz: 6d0d817f9b1fecdbc3512440803a4444bcaf3c67
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: f3a33d56dd95b4c3a7755a8f76f780b702af03de5a546b78cf3bfa56fad96b877169d02ca4c4f83dcfdcee8ee7a1efd67983ead21580b0ea4bb94c78db92da3a
|
7
|
+
data.tar.gz: b0019c53bed440256de36853a9b9973f58bd2fd019e90ca1b8178c727ad64572a3a0b3fc4ded49c64d6f4134935b72f124ba4c8780f9ed01b3412bb91c953499
|
data/README.md
CHANGED
@@ -2,6 +2,7 @@
|
|
2
2
|
[](http://badge.fury.io/rb/proxy_fetcher)
|
3
3
|
[](https://travis-ci.org/nbulaj/proxy_fetcher)
|
4
4
|
[](https://coveralls.io/github/nbulaj/proxy_fetcher)
|
5
|
+
[](https://codeclimate.com/github/nbulaj/proxy_fetcher)
|
5
6
|
[](#license)
|
6
7
|
|
7
8
|
This gem can help your Ruby application to make HTTP(S) requests from proxy by fetching and validating actual
|
@@ -10,12 +11,15 @@ proxy lists from the different providers like [HideMyName](https://hidemy.name/e
|
|
10
11
|
It gives you a `Manager` class that can load proxy list, validate it and return random or specific proxy entry. Take a look
|
11
12
|
at the documentation below to find all the gem features.
|
12
13
|
|
14
|
+
Also this gem can be used as standalone solution for downloading and validating proxy lists from the different providers.
|
15
|
+
Checkout examples of usage below.
|
16
|
+
|
13
17
|
## Installation
|
14
18
|
|
15
19
|
If using bundler, first add 'proxy_fetcher' to your Gemfile:
|
16
20
|
|
17
21
|
```ruby
|
18
|
-
gem 'proxy_fetcher', '~> 0.
|
22
|
+
gem 'proxy_fetcher', '~> 0.3'
|
19
23
|
```
|
20
24
|
|
21
25
|
or if you want to use the latest version (from `master` branch), then:
|
@@ -33,12 +37,14 @@ bundle install
|
|
33
37
|
Otherwise simply install the gem:
|
34
38
|
|
35
39
|
```sh
|
36
|
-
gem install proxy_fetcher -v '0.
|
40
|
+
gem install proxy_fetcher -v '0.3'
|
37
41
|
```
|
38
42
|
|
39
43
|
## Example of usage
|
40
44
|
|
41
|
-
|
45
|
+
### In Ruby application
|
46
|
+
|
47
|
+
Get current proxy list without validation:
|
42
48
|
|
43
49
|
```ruby
|
44
50
|
manager = ProxyFetcher::Manager.new # will immediately load proxy list from the server
|
@@ -48,7 +54,7 @@ manager.proxies
|
|
48
54
|
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
|
49
55
|
```
|
50
56
|
|
51
|
-
You can initialize proxy manager without
|
57
|
+
You can initialize proxy manager without immediate load of proxy list from the remote server by passing `refresh: false` on initialization:
|
52
58
|
|
53
59
|
```ruby
|
54
60
|
manager = ProxyFetcher::Manager.new(refresh: false) # just initialize class instance
|
@@ -57,7 +63,13 @@ manager.proxies
|
|
57
63
|
#=> []
|
58
64
|
```
|
59
65
|
|
60
|
-
|
66
|
+
If you wanna clean current proxy list from some dead servers that does not respond to the requests, than you can just call `cleanup!` method:
|
67
|
+
|
68
|
+
```ruby
|
69
|
+
manager.cleanup! # or manager.validate!
|
70
|
+
```
|
71
|
+
|
72
|
+
Get raw proxy URLs as Strings:
|
61
73
|
|
62
74
|
```ruby
|
63
75
|
manager = ProxyFetcher::Manager.new
|
@@ -76,6 +88,58 @@ manager.refresh_list! # or manager.fetch!
|
|
76
88
|
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
|
77
89
|
```
|
78
90
|
|
91
|
+
If you need to filter proxy list, for example, by country or response time and selected provider supports filtering by GET params, then you
|
92
|
+
can pass your filters to the Manager instance like that:
|
93
|
+
|
94
|
+
```ruby
|
95
|
+
ProxyFetcher.config.provider = :hide_my_name
|
96
|
+
|
97
|
+
manager = ProxyFetcher::Manager.new(filters: { country: 'AO', maxtime: '500' })
|
98
|
+
manager.proxies
|
99
|
+
|
100
|
+
# => [...]
|
101
|
+
```
|
102
|
+
|
103
|
+
*NOTE*: not all the providers support filtering. Take a look at the provider class to see if it supports custom filters.
|
104
|
+
|
105
|
+
You can use two methods to get the first proxy from the list:
|
106
|
+
|
107
|
+
* `get` or aliased `pop` (will return first proxy and move it to the end of the list)
|
108
|
+
* `get!` or aliased `pop!` (will return first **connectable** proxy and move it to the end of the list; all the proxies till the working one will be removed)
|
109
|
+
|
110
|
+
Or you can get just random proxy by calling `manager.random_proxy` or it's alias `manager.random`.
|
111
|
+
|
112
|
+
### Standalone
|
113
|
+
|
114
|
+
All you need to use this gem is Ruby >= 2.0 (2.3 is recommended). You can install it in a different ways. If you are using Ubuntu Xenial (16.04LTS)
|
115
|
+
then you already have Ruby 2.3 installed. In other cases you can install it with [RVM](https://rvm.io/) or [rbenv](https://github.com/rbenv/rbenv).
|
116
|
+
|
117
|
+
Just install the gem by running `gem install proxy_fetcher` in your terminal and run it:
|
118
|
+
|
119
|
+
```bash
|
120
|
+
proxy_fetcher >> proxies.txt # Will download proxies, validate them and write to file
|
121
|
+
```
|
122
|
+
|
123
|
+
If you need a list of proxies in JSON then pass `--json` argument to the command:
|
124
|
+
|
125
|
+
```bash
|
126
|
+
proxy_fetcher --json
|
127
|
+
|
128
|
+
# Will print:
|
129
|
+
# {"proxies":["https://120.26.206.178:8888","https://119.61.13.242:1080","https://117.40.213.26:1080","https://92.62.72.242:1080",
|
130
|
+
# "https://58.20.41.172:1080","https://204.116.192.151:35923","https://190.5.96.58:1080","https://170.250.109.97:35923",
|
131
|
+
# "https://121.41.82.99:1080","https://77.53.105.155:35923"]}
|
132
|
+
|
133
|
+
```
|
134
|
+
|
135
|
+
To get all the possible options run:
|
136
|
+
|
137
|
+
```bash
|
138
|
+
proxy_fetcher --help
|
139
|
+
```
|
140
|
+
|
141
|
+
## Proxy object
|
142
|
+
|
79
143
|
Every proxy is a `ProxyFetcher::Proxy` object that has next readers (instance variables):
|
80
144
|
|
81
145
|
* `addr` (IP address)
|
@@ -84,7 +148,7 @@ Every proxy is a `ProxyFetcher::Proxy` object that has next readers (instance va
|
|
84
148
|
* `response_time` (5217 for example)
|
85
149
|
* `speed` (`:slow`, `:medium` or `:fast`. **Note:** depends on the proxy provider and can be `nil`)
|
86
150
|
* `type` (URI schema, HTTP or HTTPS)
|
87
|
-
* `
|
151
|
+
* `anonymity` (`Low`, `Elite proxy` or `High +KA` for example)
|
88
152
|
|
89
153
|
Also you can call next instance methods for every Proxy object:
|
90
154
|
|
@@ -94,18 +158,7 @@ Also you can call next instance methods for every Proxy object:
|
|
94
158
|
* `uri` (returns `URI::Generic` object)
|
95
159
|
* `url` (returns a formatted URL like "_http://IP:PORT_" )
|
96
160
|
|
97
|
-
You can
|
98
|
-
|
99
|
-
* `get` or aliased `pop` (will return first proxy and move it to the end of the list)
|
100
|
-
* `get!` or aliased `pop!` (will return first **connectable** proxy and move it to the end of the list; all the proxies till the working one will be removed)
|
101
|
-
|
102
|
-
If you wanna clear current proxy manager list from dead servers, you can just call `cleanup!` method:
|
103
|
-
|
104
|
-
```ruby
|
105
|
-
manager.cleanup! # or manager.validate!
|
106
|
-
```
|
107
|
-
|
108
|
-
You can sort or find any proxy by speed using next 3 instance methods:
|
161
|
+
You can sort or find any proxy by speed using next 3 instance methods (if it is available for the specific provider):
|
109
162
|
|
110
163
|
* `fast?`
|
111
164
|
* `medium?`
|
@@ -117,26 +170,27 @@ To change open/read timeout for `cleanup!` and `connectable?` methods you need t
|
|
117
170
|
|
118
171
|
```ruby
|
119
172
|
ProxyFetcher.configure do |config|
|
120
|
-
config.
|
121
|
-
config.open_timeout = 1 # default is 3
|
173
|
+
config.connection_timeout = 1 # default is 3
|
122
174
|
end
|
123
175
|
|
124
176
|
manager = ProxyFetcher::Manager.new
|
125
177
|
manager.cleanup!
|
126
178
|
```
|
127
179
|
|
128
|
-
ProxyFetcher uses simple Ruby solution for dealing with HTTP requests - `net/http` library. If you wanna add, for example, your custom provider that
|
129
|
-
was developed as a Single Page Application (SPA) with some JavaScript, then you will need something like [
|
180
|
+
ProxyFetcher uses simple Ruby solution for dealing with HTTP(S) requests - `net/http` library from the stdlib. If you wanna add, for example, your custom provider that
|
181
|
+
was developed as a Single Page Application (SPA) with some JavaScript, then you will need something like [selenium-webdriver](https://github.com/SeleniumHQ/selenium/tree/master/rb)
|
130
182
|
to properly load the content of the website. For those and other cases you can write your own class for fetching HTML content by the URL and setup it
|
131
183
|
in the ProxyFetcher config:
|
132
184
|
|
133
185
|
```ruby
|
134
186
|
class MyHTTPClient
|
135
|
-
|
136
|
-
|
137
|
-
|
138
|
-
|
139
|
-
|
187
|
+
# [IMPORTANT]: below methods are required!
|
188
|
+
def self.fetch(url)
|
189
|
+
# ... some magic to return proper HTML ...
|
190
|
+
end
|
191
|
+
|
192
|
+
def self.connectable?(url)
|
193
|
+
# ... some magic to check if url is connectable ...
|
140
194
|
end
|
141
195
|
end
|
142
196
|
|
@@ -149,14 +203,17 @@ manager.proxies
|
|
149
203
|
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
|
150
204
|
```
|
151
205
|
|
206
|
+
You can take a look at the [lib/proxy_fetcher/utils/http_client.rb](lib/proxy_fetcher/utils/http_client.rb) for an example.
|
207
|
+
|
152
208
|
## Providers
|
153
209
|
|
154
210
|
Currently ProxyFetcher can deal with next proxy providers (services):
|
155
211
|
|
156
212
|
* Hide My Name (default one)
|
157
213
|
* Free Proxy List
|
158
|
-
* SSL Proxies
|
214
|
+
* Free SSL Proxies
|
159
215
|
* Proxy Docker
|
216
|
+
* Proxy List
|
160
217
|
* XRoxy
|
161
218
|
|
162
219
|
If you wanna use one of them just setup required in the config:
|
@@ -176,14 +233,8 @@ Also you can write your own provider. All you need is to create a class, that wo
|
|
176
233
|
ProxyFetcher::Configuration.register_provider(:your_provider, YourProviderClass)
|
177
234
|
```
|
178
235
|
|
179
|
-
Provider class must implement `self.load_proxy_list` and `#
|
180
|
-
provider HTML page with proxy list. Take a look at the
|
181
|
-
|
182
|
-
## TODO
|
183
|
-
|
184
|
-
* Add proxy filters
|
185
|
-
* Code refactoring
|
186
|
-
* Rewrite specs
|
236
|
+
Provider class must implement `self.load_proxy_list` and `#to_proxy(html_element)` methods that will load and parse
|
237
|
+
provider HTML page with proxy list. Take a look at the existing providers in the [lib/proxy_fetcher/providers](lib/proxy_fetcher/providers) directory.
|
187
238
|
|
188
239
|
## Contributing
|
189
240
|
|
@@ -206,8 +257,6 @@ Thanks.
|
|
206
257
|
|
207
258
|
## License
|
208
259
|
|
209
|
-
proxy_fetcher gem is released under the [MIT License](http://www.opensource.org/licenses/MIT).
|
260
|
+
`proxy_fetcher` gem is released under the [MIT License](http://www.opensource.org/licenses/MIT).
|
210
261
|
|
211
262
|
Copyright (c) 2017 Nikita Bulai (bulajnikita@gmail.com).
|
212
|
-
|
213
|
-
Some parser code (c) [pifleo](https://gist.github.com/pifleo/3889803)
|
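The `get`/`pop` and `get!`/`pop!` behaviour described in the README hunk above can be illustrated without any network access. The sketch below is a minimal stand-in, not the gem's `Manager`: `RotatingList` and the `alive` predicate are hypothetical names introduced here, and plain strings play the role of proxies.

```ruby
# Hypothetical stand-in for the rotation semantics described in the README;
# this is not ProxyFetcher::Manager, and plain strings stand in for proxies.
class RotatingList
  attr_reader :items

  def initialize(items)
    @items = items.dup
  end

  # Return the first item and move it to the end of the list.
  def get
    return nil if @items.empty?

    @items.rotate!
    @items.last
  end

  # Return the first item accepted by the block and move it to the end;
  # every item in front of it is dropped from the list.
  def get!
    index = @items.find_index { |item| yield(item) }
    return nil if index.nil?

    found = @items[index]
    @items = @items[(index + 1)..-1] << found
    found
  end
end

list = RotatingList.new(%w[a b c d])
first = list.get                              # "a" moves to the end

alive = ->(item) { %w[d a].include?(item) }   # pretend only "d" and "a" respond
working = list.get!(&alive)                   # "b" and "c" are dropped

puts "first=#{first} working=#{working} remaining=#{list.items.inspect}"
```

With the real gem, the predicate would presumably be `Proxy#connectable?`, which performs a network request for each candidate.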
data/bin/proxy_fetcher
ADDED
@@ -0,0 +1,57 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
|
3
|
+
require 'optparse'
|
4
|
+
require 'proxy_fetcher'
|
5
|
+
|
6
|
+
options = {
|
7
|
+
validate: true,
|
8
|
+
json: false
|
9
|
+
}
|
10
|
+
|
11
|
+
OptionParser.new do |opts|
|
12
|
+
opts.banner = 'Usage: proxy_fetcher [OPTIONS]'
|
13
|
+
|
14
|
+
opts.on('-h', '--help', '# Show this help message and quit') do
|
15
|
+
puts opts
|
16
|
+
exit(0)
|
17
|
+
end
|
18
|
+
|
19
|
+
opts.on('-p', '--provider=NAME', '# Use specific proxy provider') do |value|
|
20
|
+
provider_name = value.downcase
|
21
|
+
|
22
|
+
unless ProxyFetcher::Configuration.providers.include?(provider_name.to_sym)
|
23
|
+
possible_providers = ProxyFetcher::Configuration.providers.keys
|
24
|
+
|
25
|
+
puts "Unknown provider - '#{value}'.\nUse one of the following: #{possible_providers.join(', ')}."
|
26
|
+
exit(0)
|
27
|
+
end
|
28
|
+
|
29
|
+
options[:provider] = provider_name
|
30
|
+
end
|
31
|
+
|
32
|
+
opts.on('-n', '--no-validate', '# Dump all the proxies without validation') do
|
33
|
+
options[:validate] = false
|
34
|
+
end
|
35
|
+
|
36
|
+
opts.on('-t', '--timeout=SECONDS', Integer, '# Connection timeout in seconds') do |value|
|
37
|
+
options[:timeout] = value
|
38
|
+
end
|
39
|
+
|
40
|
+
opts.on('-j', '--json', '# Dump proxies to the JSON format') do
|
41
|
+
options[:json] = true
|
42
|
+
end
|
43
|
+
end.parse!
|
44
|
+
|
45
|
+
ProxyFetcher.config.provider = options[:provider] if options[:provider]
|
46
|
+
ProxyFetcher.config.connection_timeout = options[:timeout] if options[:timeout]
|
47
|
+
|
48
|
+
manager = ProxyFetcher::Manager.new
|
49
|
+
manager.validate! if options[:validate]
|
50
|
+
|
51
|
+
if options[:json]
|
52
|
+
require 'json'
|
53
|
+
|
54
|
+
puts JSON.generate(proxies: manager.raw_proxies)
|
55
|
+
else
|
56
|
+
puts manager.raw_proxies
|
57
|
+
end
|
data/lib/proxy_fetcher.rb
CHANGED
@@ -1,16 +1,22 @@
|
|
1
1
|
require 'uri'
|
2
2
|
require 'net/http'
|
3
|
+
require 'openssl'
|
3
4
|
require 'nokogiri'
|
5
|
+
require 'ostruct'
|
4
6
|
|
5
7
|
require 'proxy_fetcher/configuration'
|
6
8
|
require 'proxy_fetcher/proxy'
|
7
9
|
require 'proxy_fetcher/manager'
|
8
|
-
|
10
|
+
|
11
|
+
require 'proxy_fetcher/utils/http_client'
|
12
|
+
require 'proxy_fetcher/utils/html'
|
13
|
+
|
9
14
|
require 'proxy_fetcher/providers/base'
|
10
15
|
require 'proxy_fetcher/providers/free_proxy_list'
|
11
16
|
require 'proxy_fetcher/providers/free_proxy_list_ssl'
|
12
17
|
require 'proxy_fetcher/providers/hide_my_name'
|
13
18
|
require 'proxy_fetcher/providers/proxy_docker'
|
19
|
+
require 'proxy_fetcher/providers/proxy_list'
|
14
20
|
require 'proxy_fetcher/providers/xroxy'
|
15
21
|
|
16
22
|
module ProxyFetcher
|
data/lib/proxy_fetcher/configuration.rb
CHANGED
@@ -2,9 +2,10 @@ module ProxyFetcher
|
|
2
2
|
class Configuration
|
3
3
|
UnknownProvider = Class.new(StandardError)
|
4
4
|
RegisteredProvider = Class.new(StandardError)
|
5
|
+
WrongHttpClient = Class.new(StandardError)
|
5
6
|
|
6
|
-
attr_accessor :
|
7
|
-
attr_accessor :
|
7
|
+
attr_accessor :http_client, :connection_timeout
|
8
|
+
attr_accessor :provider
|
8
9
|
|
9
10
|
class << self
|
10
11
|
def providers
|
@@ -12,15 +13,18 @@ module ProxyFetcher
|
|
12
13
|
end
|
13
14
|
|
14
15
|
def register_provider(name, klass)
|
15
|
-
raise RegisteredProvider, "
|
16
|
+
raise RegisteredProvider, "`#{name}` provider already registered!" if providers.key?(name.to_sym)
|
16
17
|
|
17
18
|
providers[name.to_sym] = klass
|
18
19
|
end
|
19
20
|
end
|
20
21
|
|
21
22
|
def initialize
|
22
|
-
|
23
|
-
|
23
|
+
reset!
|
24
|
+
end
|
25
|
+
|
26
|
+
def reset!
|
27
|
+
@connection_timeout = 3
|
24
28
|
@http_client = HTTPClient
|
25
29
|
|
26
30
|
self.provider = :hide_my_name # currently default one
|
@@ -29,7 +33,15 @@ module ProxyFetcher
|
|
29
33
|
def provider=(name)
|
30
34
|
@provider = self.class.providers[name.to_sym]
|
31
35
|
|
32
|
-
raise UnknownProvider, "unregistered proxy provider
|
36
|
+
raise UnknownProvider, "unregistered proxy provider `#{name}`!" if @provider.nil?
|
37
|
+
end
|
38
|
+
|
39
|
+
def http_client=(klass)
|
40
|
+
unless klass.respond_to?(:fetch, :connectable?)
|
41
|
+
raise WrongHttpClient, "#{klass} must respond to #fetch and #connectable? class methods!"
|
42
|
+
end
|
43
|
+
|
44
|
+
@http_client = klass
|
33
45
|
end
|
34
46
|
end
|
35
47
|
end
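One detail of the `http_client=` check in this hunk is worth verifying: in Ruby, the second positional argument of `respond_to?` is the `include_all` flag, not a second method name, so `klass.respond_to?(:fetch, :connectable?)` appears to test only for `fetch`. The sketch below shows this with a hypothetical `FetchOnlyClient` class and a stricter helper, `valid_http_client?`, which is an assumption of this note and not part of the gem.

```ruby
# Hypothetical client that implements only one of the two required methods.
class FetchOnlyClient
  def self.fetch(url)
    url
  end
end

# The second argument of respond_to? is the include_all flag (a truthy
# symbol here), so this call only checks for :fetch.
loose_check = FetchOnlyClient.respond_to?(:fetch, :connectable?)

# Hypothetical stricter helper: every required method is checked on its own.
def valid_http_client?(klass)
  %i[fetch connectable?].all? { |name| klass.respond_to?(name) }
end

strict_check = valid_http_client?(FetchOnlyClient)

puts "loose: #{loose_check}, strict: #{strict_check}"
```

If that reading is right, a client class that lacks `connectable?` would pass the setter and fail only later, when a proxy is validated.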
|
data/lib/proxy_fetcher/manager.rb
CHANGED
@@ -1,10 +1,12 @@
|
|
1
1
|
module ProxyFetcher
|
2
2
|
class Manager
|
3
|
-
attr_reader :proxies
|
3
|
+
attr_reader :proxies, :filters
|
4
4
|
|
5
5
|
# refresh: true - load proxy list from the remote server on initialization
|
6
6
|
# refresh: false - just initialize the class, proxy list will be empty ([])
|
7
|
-
def initialize(refresh: true)
|
7
|
+
def initialize(refresh: true, filters: {})
|
8
|
+
@filters = filters
|
9
|
+
|
8
10
|
if refresh
|
9
11
|
refresh_list!
|
10
12
|
else
|
@@ -14,8 +16,7 @@ module ProxyFetcher
|
|
14
16
|
|
15
17
|
# Update current proxy list from the provider
|
16
18
|
def refresh_list!
|
17
|
-
|
18
|
-
@proxies = rows.map { |row| Proxy.new(row) }
|
19
|
+
@proxies = ProxyFetcher.config.provider.fetch_proxies!(filters)
|
19
20
|
end
|
20
21
|
|
21
22
|
alias fetch! refresh_list!
|
@@ -56,10 +57,12 @@ module ProxyFetcher
|
|
56
57
|
alias validate! cleanup!
|
57
58
|
|
58
59
|
# Return random proxy
|
59
|
-
def
|
60
|
+
def random_proxy
|
60
61
|
proxies.sample
|
61
62
|
end
|
62
63
|
|
64
|
+
alias random random_proxy
|
65
|
+
|
63
66
|
# Returns array of proxy URLs (just schema + host + port)
|
64
67
|
def raw_proxies
|
65
68
|
proxies.map(&:url)
|
data/lib/proxy_fetcher/providers/base.rb
CHANGED
@@ -1,25 +1,52 @@
|
|
1
|
+
require 'forwardable'
|
2
|
+
|
1
3
|
module ProxyFetcher
|
2
4
|
module Providers
|
3
5
|
class Base
|
4
|
-
|
6
|
+
extend Forwardable
|
5
7
|
|
6
|
-
|
7
|
-
|
8
|
-
|
8
|
+
def_delegators ProxyFetcher::HTML, :clear, :convert_to_int
|
9
|
+
|
10
|
+
PROXY_TYPES = [
|
11
|
+
HTTP = 'HTTP'.freeze,
|
12
|
+
HTTPS = 'HTTPS'.freeze
|
13
|
+
].freeze
|
9
14
|
|
10
|
-
|
11
|
-
|
15
|
+
attr_reader :proxy
|
16
|
+
|
17
|
+
def fetch_proxies!(filters = {})
|
18
|
+
load_proxy_list(filters).map { |html| to_proxy(html) }
|
12
19
|
end
|
13
20
|
|
14
21
|
class << self
|
15
|
-
def
|
16
|
-
new
|
22
|
+
def fetch_proxies!(filters = {})
|
23
|
+
new.fetch_proxies!(filters)
|
17
24
|
end
|
25
|
+
end
|
18
26
|
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
27
|
+
protected
|
28
|
+
|
29
|
+
# Loads HTML document with Nokogiri by the URL combined with custom filters
|
30
|
+
def load_document(url, filters = {})
|
31
|
+
uri = URI.parse(url)
|
32
|
+
uri.query = URI.encode_www_form(filters) if filters.any?
|
33
|
+
|
34
|
+
Nokogiri::HTML(ProxyFetcher.config.http_client.fetch(uri.to_s))
|
35
|
+
end
|
36
|
+
|
37
|
+
# Get HTML elements with proxy info
|
38
|
+
def load_proxy_list(*)
|
39
|
+
raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
|
40
|
+
end
|
41
|
+
|
42
|
+
# Convert HTML element with proxy info to ProxyFetcher::Proxy instance
|
43
|
+
def to_proxy(*)
|
44
|
+
raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
|
45
|
+
end
|
46
|
+
|
47
|
+
# Return normalized HTML element content by selector
|
48
|
+
def parse_element(element, selector, method = :at_xpath)
|
49
|
+
clear(element.public_send(method, selector).content)
|
23
50
|
end
|
24
51
|
end
|
25
52
|
end
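The query-string step of `load_document` can be tried in isolation. The sketch below copies only the `URI.parse` and `URI.encode_www_form` lines from the hunk; `build_provider_url` is a hypothetical helper name, and no HTTP request or Nokogiri parsing takes place.

```ruby
require 'uri'

# Hypothetical helper that mirrors the query-string step of load_document
# without performing any HTTP request.
def build_provider_url(url, filters = {})
  uri = URI.parse(url)
  uri.query = URI.encode_www_form(filters) if filters.any?
  uri.to_s
end

plain    = build_provider_url('https://hidemy.name/en/proxy-list/')
filtered = build_provider_url('https://hidemy.name/en/proxy-list/',
                              { country: 'AO', maxtime: '500' })

puts plain
puts filtered
```

The filter keys are passed through unchanged, so they have to match whatever GET parameters the selected provider understands.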
|
data/lib/proxy_fetcher/providers/free_proxy_list.rb
CHANGED
@@ -3,42 +3,27 @@ module ProxyFetcher
|
|
3
3
|
class FreeProxyList < Base
|
4
4
|
PROVIDER_URL = 'https://free-proxy-list.net/'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
end
|
6
|
+
# [NOTE] Doesn't support filtering
|
7
|
+
def load_proxy_list(*)
|
8
|
+
doc = load_document(PROVIDER_URL, {})
|
9
|
+
doc.xpath('//table[@id="proxylisttable"]/tbody/tr')
|
11
10
|
end
|
12
11
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
when 3 then
|
21
|
-
set!(:country, td.content.strip)
|
22
|
-
when 4
|
23
|
-
set!(:anonymity, td.content.strip)
|
24
|
-
when 6
|
25
|
-
set!(:type, parse_type(td))
|
26
|
-
else
|
27
|
-
# nothing
|
28
|
-
end
|
12
|
+
def to_proxy(html_element)
|
13
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
14
|
+
proxy.addr = parse_element(html_element, 'td[1]')
|
15
|
+
proxy.port = convert_to_int(parse_element(html_element, 'td[2]'))
|
16
|
+
proxy.country = parse_element(html_element, 'td[4]')
|
17
|
+
proxy.anonymity = parse_element(html_element, 'td[5]')
|
18
|
+
proxy.type = parse_type(html_element)
|
29
19
|
end
|
30
20
|
end
|
31
21
|
|
32
22
|
private
|
33
23
|
|
34
|
-
def parse_type(
|
35
|
-
type = td
|
36
|
-
|
37
|
-
if type && type.downcase.include?('yes')
|
38
|
-
'HTTPS'
|
39
|
-
else
|
40
|
-
'HTTP'
|
41
|
-
end
|
24
|
+
def parse_type(element)
|
25
|
+
type = parse_element(element, 'td[6]')
|
26
|
+
type && type.casecmp('yes').zero? ? HTTPS : HTTP
|
42
27
|
end
|
43
28
|
end
|
44
29
|
|
data/lib/proxy_fetcher/providers/free_proxy_list_ssl.rb
CHANGED
@@ -3,29 +3,19 @@ module ProxyFetcher
|
|
3
3
|
class FreeProxyListSSL < Base
|
4
4
|
PROVIDER_URL = 'https://www.sslproxies.org/'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
end
|
6
|
+
# [NOTE] Doesn't support filtering
|
7
|
+
def load_proxy_list(*)
|
8
|
+
doc = load_document(PROVIDER_URL, {})
|
9
|
+
doc.xpath('//table[@id="proxylisttable"]/tbody/tr')
|
11
10
|
end
|
12
11
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
when 3 then
|
21
|
-
set!(:country, td.content.strip)
|
22
|
-
when 4
|
23
|
-
set!(:anonymity, td.content.strip)
|
24
|
-
when 6
|
25
|
-
set!(:type, 'HTTPS')
|
26
|
-
else
|
27
|
-
# nothing
|
28
|
-
end
|
12
|
+
def to_proxy(html_element)
|
13
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
14
|
+
proxy.addr = parse_element(html_element, 'td[1]')
|
15
|
+
proxy.port = convert_to_int(parse_element(html_element, 'td[2]'))
|
16
|
+
proxy.country = parse_element(html_element, 'td[4]')
|
17
|
+
proxy.anonymity = parse_element(html_element, 'td[5]')
|
18
|
+
proxy.type = HTTPS
|
29
19
|
end
|
30
20
|
end
|
31
21
|
end
|
data/lib/proxy_fetcher/providers/hide_my_name.rb
CHANGED
@@ -1,51 +1,49 @@
|
|
1
1
|
module ProxyFetcher
|
2
2
|
module Providers
|
3
3
|
class HideMyName < Base
|
4
|
-
PROVIDER_URL = 'https://hidemy.name/en/proxy-list
|
4
|
+
PROVIDER_URL = 'https://hidemy.name/en/proxy-list/'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
doc.xpath('//table[@class="proxy__t"]/tbody/tr')
|
10
|
-
end
|
6
|
+
def load_proxy_list(filters = { type: 'hs' })
|
7
|
+
doc = load_document(PROVIDER_URL, filters)
|
8
|
+
doc.xpath('//table[@class="proxy__t"]/tbody/tr')
|
11
9
|
end
|
12
10
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
set!(:speed, speed_from_response_time(response_time))
|
27
|
-
when 4
|
28
|
-
set!(:type, parse_type(td))
|
29
|
-
when 5
|
30
|
-
set!(:anonymity, td.content.strip)
|
31
|
-
else
|
32
|
-
# nothing
|
33
|
-
end
|
11
|
+
def to_proxy(html_element)
|
12
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
13
|
+
proxy.addr = parse_element(html_element, 'td[1]')
|
14
|
+
proxy.port = convert_to_int(parse_element(html_element, 'td[2]'))
|
15
|
+
proxy.anonymity = parse_element(html_element, 'td[6]')
|
16
|
+
|
17
|
+
proxy.country = parse_country(html_element)
|
18
|
+
proxy.type = parse_type(html_element)
|
19
|
+
|
20
|
+
response_time = parse_response_time(html_element)
|
21
|
+
|
22
|
+
proxy.response_time = response_time
|
23
|
+
proxy.speed = speed_from_response_time(response_time)
|
34
24
|
end
|
35
25
|
end
|
36
26
|
|
37
27
|
private
|
38
28
|
|
39
|
-
def
|
40
|
-
|
29
|
+
def parse_country(element)
|
30
|
+
clear(element.at_xpath('*//span[1]/following-sibling::text()[1]').content)
|
31
|
+
end
|
32
|
+
|
33
|
+
def parse_type(element)
|
34
|
+
schemas = parse_element(element, 'td[5]')
|
41
35
|
|
42
36
|
if schemas && schemas.downcase.include?('https')
|
43
|
-
|
37
|
+
HTTPS
|
44
38
|
else
|
45
|
-
|
39
|
+
HTTP
|
46
40
|
end
|
47
41
|
end
|
48
42
|
|
43
|
+
def parse_response_time(element)
|
44
|
+
convert_to_int(element.at_xpath('td[4]').content.strip[/\d+/])
|
45
|
+
end
|
46
|
+
|
49
47
|
def speed_from_response_time(response_time)
|
50
48
|
if response_time < 1500
|
51
49
|
:fast
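Reading this hunk together with the `Manager` and `Base` changes, the `{ type: 'hs' }` default of `load_proxy_list` looks like it is used only when the method is called with no argument at all. `Manager#initialize` defaults `filters` to `{}` and `fetch_proxies!` forwards that value explicitly, so the default would be bypassed. The sketch below reproduces just that call chain; `SketchProvider` is a hypothetical class, its methods are public for demonstration only, and nothing is fetched.

```ruby
# Hypothetical class that copies only the method signatures from the diff.
class SketchProvider
  def fetch_proxies!(filters = {})
    # The (possibly empty) hash is always passed on explicitly.
    load_proxy_list(filters)
  end

  def load_proxy_list(filters = { type: 'hs' })
    filters
  end
end

provider  = SketchProvider.new
via_fetch = provider.fetch_proxies!    # the explicit {} reaches load_proxy_list
direct    = provider.load_proxy_list   # the default value is used

puts "via fetch_proxies!: #{via_fetch.inspect}"
puts "direct call:        #{direct.inspect}"
```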
|
data/lib/proxy_fetcher/providers/proxy_docker.rb
CHANGED
@@ -3,30 +3,21 @@ module ProxyFetcher
|
|
3
3
|
class ProxyDocker < Base
|
4
4
|
PROVIDER_URL = 'https://www.proxydocker.com/en'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
end
|
6
|
+
# [NOTE] Doesn't support direct filters
|
7
|
+
def load_proxy_list(*)
|
8
|
+
doc = load_document(PROVIDER_URL, {})
|
9
|
+
doc.xpath('//table[contains(@class, "table")]/tr[(not(@id="proxy-table-header")) and (count(td)>2)]')
|
11
10
|
end
|
12
11
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
12
|
+
def to_proxy(html_element)
|
13
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
14
|
+
uri = URI("//#{parse_element(html_element, 'td[1]')}")
|
15
|
+
proxy.addr = uri.host
|
16
|
+
proxy.port = uri.port
|
18
17
|
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
set!(:type, td.content.strip)
|
23
|
-
when 2
|
24
|
-
set!(:anonymity, td.content.strip)
|
25
|
-
when 4 then
|
26
|
-
set!(:country, td.content.strip)
|
27
|
-
else
|
28
|
-
# nothing
|
29
|
-
end
|
18
|
+
proxy.type = parse_element(html_element, 'td[2]')
|
19
|
+
proxy.anonymity = parse_element(html_element, 'td[3]')
|
20
|
+
proxy.country = parse_element(html_element, 'td[5]')
|
30
21
|
end
|
31
22
|
end
|
32
23
|
end
|
data/lib/proxy_fetcher/providers/proxy_list.rb
ADDED
@@ -0,0 +1,35 @@
|
|
1
|
+
require 'base64'
|
2
|
+
|
3
|
+
module ProxyFetcher
|
4
|
+
module Providers
|
5
|
+
class ProxyList < Base
|
6
|
+
PROVIDER_URL = 'https://proxy-list.org/english/index.php'.freeze
|
7
|
+
|
8
|
+
def load_proxy_list(filters = {})
|
9
|
+
doc = load_document(PROVIDER_URL, filters)
|
10
|
+
doc.css('.table-wrap .table ul')
|
11
|
+
end
|
12
|
+
|
13
|
+
def to_proxy(html_element)
|
14
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
15
|
+
uri = parse_proxy_uri(html_element)
|
16
|
+
proxy.addr = uri.host
|
17
|
+
proxy.port = uri.port
|
18
|
+
|
19
|
+
proxy.type = parse_element(html_element, 'li[2]')
|
20
|
+
proxy.anonymity = parse_element(html_element, 'li[4]')
|
21
|
+
proxy.country = clear(html_element.at_xpath("li[5]//span[@class='country']").attr('title'))
|
22
|
+
end
|
23
|
+
end
|
24
|
+
|
25
|
+
private
|
26
|
+
|
27
|
+
def parse_proxy_uri(element)
|
28
|
+
full_addr = ::Base64.decode64(element.at('li script').inner_html.match(/'(.+)'/)[1])
|
29
|
+
URI.parse("http://#{full_addr}")
|
30
|
+
end
|
31
|
+
end
|
32
|
+
|
33
|
+
ProxyFetcher::Configuration.register_provider(:proxy_list, ProxyList)
|
34
|
+
end
|
35
|
+
end
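`parse_proxy_uri` extracts a quoted Base64 payload from a `script` element and decodes it into `host:port`. The sketch below walks through the same steps on a made-up script body; the `Proxy('...')` wrapper and the address are assumptions, and `unpack1('m')` is used in place of `Base64.decode64` (which performs the same decoding) so that the example needs no extra require.

```ruby
require 'uri'

# Hypothetical script body; the real markup on the provider page may differ.
encoded = ['127.0.0.1:8080'].pack('m0')   # strict Base64, no trailing newline
script  = "Proxy('#{encoded}')"

# Same steps as parse_proxy_uri: grab the quoted payload, decode it, and
# prefix a scheme so that URI can split host and port.
payload   = script.match(/'(.+)'/)[1]
full_addr = payload.unpack1('m')          # same decoding as Base64.decode64
uri       = URI.parse("http://#{full_addr}")

puts "host=#{uri.host} port=#{uri.port}"
```

Note that `match` returns `nil` when no quoted payload is present, so a layout change on the provider page would surface as a `NoMethodError` in code written this way.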
|
data/lib/proxy_fetcher/providers/xroxy.rb
CHANGED
@@ -1,34 +1,21 @@
|
|
1
1
|
module ProxyFetcher
|
2
2
|
module Providers
|
3
3
|
class XRoxy < Base
|
4
|
-
PROVIDER_URL = 'http://www.xroxy.com/proxylist.php
|
4
|
+
PROVIDER_URL = 'http://www.xroxy.com/proxylist.php'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
doc.xpath('//div[@id="content"]/table[1]/tr[contains(@class, "row")]')
|
10
|
-
end
|
6
|
+
def load_proxy_list(filters = { type: 'All_http' })
|
7
|
+
doc = load_document(PROVIDER_URL, filters)
|
8
|
+
doc.xpath('//div[@id="content"]/table[1]/tr[contains(@class, "row")]')
|
11
9
|
end
|
12
10
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
set!(:anonymity, td.content.strip)
|
22
|
-
when 4
|
23
|
-
ssl = td.content.strip.downcase
|
24
|
-
set!(:type, ssl.include?('true') ? 'HTTPS' : 'HTTP')
|
25
|
-
when 5 then
|
26
|
-
set!(:country, td.content.strip)
|
27
|
-
when 6
|
28
|
-
set!(:response_time, Integer(td.content.strip))
|
29
|
-
else
|
30
|
-
# nothing
|
31
|
-
end
|
11
|
+
def to_proxy(html_element)
|
12
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
13
|
+
proxy.addr = parse_element(html_element, 'td[2]')
|
14
|
+
proxy.port = convert_to_int(parse_element(html_element, 'td[3]'))
|
15
|
+
proxy.anonymity = parse_element(html_element, 'td[4]')
|
16
|
+
proxy.type = parse_element(html_element, 'td[5]').casecmp('true').zero? ? HTTPS : HTTP
|
17
|
+
proxy.country = parse_element(html_element, 'td[6]')
|
18
|
+
proxy.response_time = convert_to_int(parse_element(html_element, 'td[7]'))
|
32
19
|
end
|
33
20
|
end
|
34
21
|
end
|
data/lib/proxy_fetcher/proxy.rb
CHANGED
@@ -1,24 +1,7 @@
|
|
1
1
|
module ProxyFetcher
|
2
|
-
class Proxy
|
3
|
-
attr_reader :addr, :port, :country, :response_time, :speed, :type, :anonymity
|
4
|
-
|
5
|
-
def initialize(html_row)
|
6
|
-
ProxyFetcher.config.provider.parse_entry(html_row, self)
|
7
|
-
|
8
|
-
self
|
9
|
-
end
|
10
|
-
|
2
|
+
class Proxy < OpenStruct
|
11
3
|
def connectable?
|
12
|
-
|
13
|
-
connection.use_ssl = true if https?
|
14
|
-
connection.open_timeout = ProxyFetcher.config.open_timeout
|
15
|
-
connection.read_timeout = ProxyFetcher.config.read_timeout
|
16
|
-
|
17
|
-
connection.start { |http| return true if http.request_head('/') }
|
18
|
-
|
19
|
-
false
|
20
|
-
rescue Timeout::Error, Errno::ECONNREFUSED, Errno::ECONNRESET, Errno::ECONNABORTED
|
21
|
-
false
|
4
|
+
ProxyFetcher.config.http_client.connectable?(url)
|
22
5
|
end
|
23
6
|
|
24
7
|
alias valid? connectable?
|
data/lib/proxy_fetcher/utils/http_client.rb
ADDED
@@ -0,0 +1,46 @@
|
|
1
|
+
module ProxyFetcher
|
2
|
+
class HTTPClient
|
3
|
+
attr_reader :uri, :http
|
4
|
+
|
5
|
+
def initialize(url)
|
6
|
+
@uri = URI.parse(url)
|
7
|
+
@http = Net::HTTP.new(@uri.host, @uri.port)
|
8
|
+
return unless https?
|
9
|
+
|
10
|
+
@http.use_ssl = true
|
11
|
+
@http.verify_mode = OpenSSL::SSL::VERIFY_NONE
|
12
|
+
end
|
13
|
+
|
14
|
+
def fetch
|
15
|
+
request = Net::HTTP::Get.new(@uri.to_s)
|
16
|
+
request['Connection'] = 'keep-alive'
|
17
|
+
response = @http.request(request)
|
18
|
+
response.body
|
19
|
+
end
|
20
|
+
|
21
|
+
def connectable?
|
22
|
+
@http.open_timeout = ProxyFetcher.config.connection_timeout
|
23
|
+
@http.read_timeout = ProxyFetcher.config.connection_timeout
|
24
|
+
|
25
|
+
@http.start { |connection| return true if connection.request_head('/') }
|
26
|
+
|
27
|
+
false
|
28
|
+
rescue StandardError
|
29
|
+
false
|
30
|
+
end
|
31
|
+
|
32
|
+
def https?
|
33
|
+
@uri.scheme.casecmp('https').zero?
|
34
|
+
end
|
35
|
+
|
36
|
+
class << self
|
37
|
+
def fetch(url)
|
38
|
+
new(url).fetch
|
39
|
+
end
|
40
|
+
|
41
|
+
def connectable?(url)
|
42
|
+
new(url).connectable?
|
43
|
+
end
|
44
|
+
end
|
45
|
+
end
|
46
|
+
end
|
data/proxy_fetcher.gemspec
CHANGED
@@ -5,14 +5,16 @@ require 'proxy_fetcher/version'
|
|
5
5
|
Gem::Specification.new do |gem|
|
6
6
|
gem.name = 'proxy_fetcher'
|
7
7
|
gem.version = ProxyFetcher.gem_version
|
8
|
-
gem.date = '2017-08-
|
9
|
-
gem.summary = 'Ruby gem for dealing with proxy lists '
|
8
|
+
gem.date = '2017-08-21'
|
9
|
+
gem.summary = 'Ruby gem for dealing with proxy lists from different providers'
|
10
10
|
gem.description = 'This gem can help your Ruby application to make HTTP(S) requests ' \
|
11
11
|
'from proxy server by fetching and validating proxy lists from the different providers.'
|
12
12
|
gem.authors = ['Nikita Bulai']
|
13
13
|
gem.email = 'bulajnikita@gmail.com'
|
14
14
|
gem.require_paths = ['lib']
|
15
|
+
gem.bindir = 'bin'
|
15
16
|
gem.files = `git ls-files`.split($RS)
|
17
|
+
gem.executables = `git ls-files -- bin/*`.split("\n").map { |f| File.basename(f) }
|
16
18
|
gem.homepage = 'http://github.com/nbulaj/proxy_fetcher'
|
17
19
|
gem.license = 'MIT'
|
18
20
|
gem.required_ruby_version = '>= 2.2.2'
|
data/spec/proxy_fetcher/configuration_spec.rb
ADDED
@@ -0,0 +1,48 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
|
3
|
+
describe ProxyFetcher::Configuration do
|
4
|
+
before { ProxyFetcher.config.reset! }
|
5
|
+
after { ProxyFetcher.config.reset! }
|
6
|
+
|
7
|
+
context 'custom HTTP client' do
|
8
|
+
it 'successfully setups if class has all the required methods' do
|
9
|
+
class MyHTTPClient
|
10
|
+
def self.fetch(url)
|
11
|
+
url
|
12
|
+
end
|
13
|
+
|
14
|
+
def self.connectable?(*)
|
15
|
+
true
|
16
|
+
end
|
17
|
+
end
|
18
|
+
|
19
|
+
expect { ProxyFetcher.config.http_client = MyHTTPClient }.not_to raise_error
|
20
|
+
end
|
21
|
+
|
22
|
+
it 'failed on setup if required methods are missing' do
|
23
|
+
MyWrongHTTPClient = Class.new
|
24
|
+
|
25
|
+
expect { ProxyFetcher.config.http_client = MyWrongHTTPClient }
|
26
|
+
.to raise_error(ProxyFetcher::Configuration::WrongHttpClient)
|
27
|
+
end
|
28
|
+
end
|
29
|
+
|
30
|
+
context 'custom provider' do
|
31
|
+
it 'successfully setups if provider class registered' do
|
32
|
+
CustomProvider = Class.new(ProxyFetcher::Providers::Base)
|
33
|
+
ProxyFetcher::Configuration.register_provider(:custom_provider, CustomProvider)
|
34
|
+
|
35
|
+
expect { ProxyFetcher.config.provider = :custom_provider }.not_to raise_error
|
36
|
+
end
|
37
|
+
|
38
|
+
it 'failed on setup if provider class is not registered' do
|
39
|
+
expect { ProxyFetcher.config.provider = :unexisting_provider }
|
40
|
+
.to raise_error(ProxyFetcher::Configuration::UnknownProvider)
|
41
|
+
end
|
42
|
+
|
43
|
+
it 'failed on setup if provider class already registered' do
|
44
|
+
expect { ProxyFetcher::Configuration.register_provider(:xroxy, Class.new)}
|
45
|
+
.to raise_error(ProxyFetcher::Configuration::RegisteredProvider)
|
46
|
+
end
|
47
|
+
end
|
48
|
+
end
|
data/spec/proxy_fetcher/providers/base_spec.rb
ADDED
@@ -0,0 +1,28 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
|
3
|
+
describe ProxyFetcher::Providers::Base do
|
4
|
+
before { ProxyFetcher.config.reset! }
|
5
|
+
after { ProxyFetcher.config.reset! }
|
6
|
+
|
7
|
+
it 'does not allows to use not implemented methods' do
|
8
|
+
NotImplementedCustomProvider = Class.new(ProxyFetcher::Providers::Base)
|
9
|
+
|
10
|
+
ProxyFetcher::Configuration.register_provider(:provider_without_methods, NotImplementedCustomProvider)
|
11
|
+
ProxyFetcher.config.provider = :provider_without_methods
|
12
|
+
|
13
|
+
expect { ProxyFetcher::Manager.new }.to raise_error(NotImplementedError) do |error|
|
14
|
+
expect(error.message).to include('load_proxy_list')
|
15
|
+
end
|
16
|
+
|
17
|
+
# implement one of the methods
|
18
|
+
NotImplementedCustomProvider.class_eval do
|
19
|
+
def load_proxy_list(*)
|
20
|
+
[1, 2, 3]
|
21
|
+
end
|
22
|
+
end
|
23
|
+
|
24
|
+
expect { ProxyFetcher::Manager.new }.to raise_error(NotImplementedError) do |error|
|
25
|
+
expect(error.message).to include('to_proxy')
|
26
|
+
end
|
27
|
+
end
|
28
|
+
end
|
data/spec/proxy_fetcher/proxy_spec.rb
CHANGED
@@ -9,24 +9,24 @@ describe ProxyFetcher::Proxy do
|
|
9
9
|
@manager = ProxyFetcher::Manager.new
|
10
10
|
end
|
11
11
|
|
12
|
-
let(:proxy) { @manager.proxies.first }
|
12
|
+
let(:proxy) { @manager.proxies.first.dup }
|
13
13
|
|
14
14
|
it 'checks schema' do
|
15
|
-
proxy.
|
15
|
+
proxy.type = ProxyFetcher::Providers::Base::HTTP
|
16
16
|
expect(proxy.http?).to be_truthy
|
17
17
|
expect(proxy.https?).to be_falsey
|
18
18
|
|
19
|
-
proxy.
|
19
|
+
proxy.type = ProxyFetcher::Providers::Base::HTTPS
|
20
20
|
expect(proxy.https?).to be_truthy
|
21
21
|
expect(proxy.http?).to be_falsey
|
22
22
|
end
|
23
23
|
|
24
24
|
it 'not connectable if IP addr is wrong' do
|
25
|
-
|
25
|
+
proxy.addr = '192.168.1.0'
|
26
26
|
expect(proxy.connectable?).to be_falsey
|
27
27
|
end
|
28
28
|
|
29
|
-
it 'not connectable if
|
29
|
+
it 'not connectable if there are some error during connection request' do
|
30
30
|
allow_any_instance_of(Net::HTTP).to receive(:start).and_raise(Errno::ECONNABORTED)
|
31
31
|
expect(proxy.connectable?).to be_falsey
|
32
32
|
end
|
@@ -46,13 +46,13 @@ describe ProxyFetcher::Proxy do
|
|
46
46
|
end
|
47
47
|
|
48
48
|
it 'checks speed' do
|
49
|
-
proxy.
|
49
|
+
proxy.speed = :fast
|
50
50
|
expect(proxy.fast?).to be_truthy
|
51
51
|
|
52
|
-
proxy.
|
52
|
+
proxy.speed = :slow
|
53
53
|
expect(proxy.slow?).to be_truthy
|
54
54
|
|
55
|
-
proxy.
|
55
|
+
proxy.speed = :medium
|
56
56
|
expect(proxy.medium?).to be_truthy
|
57
57
|
end
|
58
58
|
end
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: proxy_fetcher
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.3.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Nikita Bulai
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2017-08-
|
11
|
+
date: 2017-08-21 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: nokogiri
|
@@ -47,7 +47,8 @@ dependencies:
|
|
47
47
|
description: This gem can help your Ruby application to make HTTP(S) requests from
|
48
48
|
proxy server by fetching and validating proxy lists from the different providers.
|
49
49
|
email: bulajnikita@gmail.com
|
50
|
-
executables:
|
50
|
+
executables:
|
51
|
+
- proxy_fetcher
|
51
52
|
extensions: []
|
52
53
|
extra_rdoc_files: []
|
53
54
|
files:
|
@@ -58,6 +59,7 @@ files:
|
|
58
59
|
- LICENSE
|
59
60
|
- README.md
|
60
61
|
- Rakefile
|
62
|
+
- bin/proxy_fetcher
|
61
63
|
- lib/proxy_fetcher.rb
|
62
64
|
- lib/proxy_fetcher/configuration.rb
|
63
65
|
- lib/proxy_fetcher/manager.rb
|
@@ -66,15 +68,20 @@ files:
|
|
66
68
|
- lib/proxy_fetcher/providers/free_proxy_list_ssl.rb
|
67
69
|
- lib/proxy_fetcher/providers/hide_my_name.rb
|
68
70
|
- lib/proxy_fetcher/providers/proxy_docker.rb
|
71
|
+
- lib/proxy_fetcher/providers/proxy_list.rb
|
69
72
|
- lib/proxy_fetcher/providers/xroxy.rb
|
70
73
|
- lib/proxy_fetcher/proxy.rb
|
71
|
-
- lib/proxy_fetcher/utils/
|
74
|
+
- lib/proxy_fetcher/utils/html.rb
|
75
|
+
- lib/proxy_fetcher/utils/http_client.rb
|
72
76
|
- lib/proxy_fetcher/version.rb
|
73
77
|
- proxy_fetcher.gemspec
|
78
|
+
- spec/proxy_fetcher/configuration_spec.rb
|
79
|
+
- spec/proxy_fetcher/providers/base_spec.rb
|
74
80
|
- spec/proxy_fetcher/providers/free_proxy_list_spec.rb
|
75
81
|
- spec/proxy_fetcher/providers/free_proxy_list_ssl_spec.rb
|
76
82
|
- spec/proxy_fetcher/providers/hide_my_name_spec.rb
|
77
83
|
- spec/proxy_fetcher/providers/proxy_docker_spec.rb
|
84
|
+
- spec/proxy_fetcher/providers/proxy_list_spec.rb
|
78
85
|
- spec/proxy_fetcher/providers/xroxy_spec.rb
|
79
86
|
- spec/proxy_fetcher/proxy_spec.rb
|
80
87
|
- spec/spec_helper.rb
|
@@ -102,5 +109,5 @@ rubyforge_project:
|
|
102
109
|
rubygems_version: 2.6.11
|
103
110
|
signing_key:
|
104
111
|
specification_version: 4
|
105
|
-
summary: Ruby gem for dealing with proxy lists
|
112
|
+
summary: Ruby gem for dealing with proxy lists from different providers
|
106
113
|
test_files: []
|
data/lib/proxy_fetcher/utils/http_fetcher.rb
DELETED
@@ -1,24 +0,0 @@
|
|
1
|
-
module ProxyFetcher
|
2
|
-
class HTTPClient
|
3
|
-
attr_reader :http
|
4
|
-
|
5
|
-
def initialize(url)
|
6
|
-
@uri = URI.parse(url)
|
7
|
-
@http = Net::HTTP.new(@uri.host, @uri.port)
|
8
|
-
@http.use_ssl = true if @uri.scheme.downcase == 'https'
|
9
|
-
end
|
10
|
-
|
11
|
-
def fetch
|
12
|
-
request = Net::HTTP::Get.new(@uri.to_s)
|
13
|
-
request['Connection'] = 'keep-alive'
|
14
|
-
response = @http.request(request)
|
15
|
-
response.body
|
16
|
-
end
|
17
|
-
|
18
|
-
class << self
|
19
|
-
def fetch(url)
|
20
|
-
new(url).fetch
|
21
|
-
end
|
22
|
-
end
|
23
|
-
end
|
24
|
-
end
|