proxy_fetcher 0.2.5 → 0.3.0
- checksums.yaml +4 -4
- data/README.md +88 -39
- data/bin/proxy_fetcher +57 -0
- data/lib/proxy_fetcher.rb +7 -1
- data/lib/proxy_fetcher/configuration.rb +18 -6
- data/lib/proxy_fetcher/manager.rb +8 -5
- data/lib/proxy_fetcher/providers/base.rb +39 -12
- data/lib/proxy_fetcher/providers/free_proxy_list.rb +14 -29
- data/lib/proxy_fetcher/providers/free_proxy_list_ssl.rb +11 -21
- data/lib/proxy_fetcher/providers/hide_my_name.rb +29 -31
- data/lib/proxy_fetcher/providers/proxy_docker.rb +12 -21
- data/lib/proxy_fetcher/providers/proxy_list.rb +35 -0
- data/lib/proxy_fetcher/providers/xroxy.rb +12 -25
- data/lib/proxy_fetcher/proxy.rb +2 -19
- data/lib/proxy_fetcher/utils/html.rb +15 -0
- data/lib/proxy_fetcher/utils/http_client.rb +46 -0
- data/lib/proxy_fetcher/version.rb +2 -2
- data/proxy_fetcher.gemspec +4 -2
- data/spec/proxy_fetcher/configuration_spec.rb +48 -0
- data/spec/proxy_fetcher/providers/base_spec.rb +28 -0
- data/spec/proxy_fetcher/providers/proxy_list_spec.rb +9 -0
- data/spec/proxy_fetcher/proxy_spec.rb +8 -8
- metadata +12 -5
- data/lib/proxy_fetcher/utils/http_fetcher.rb +0 -24
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: 2082cc216a388f9014cbdec5daa8a54ac1d93016
+ data.tar.gz: 6d0d817f9b1fecdbc3512440803a4444bcaf3c67
  SHA512:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: f3a33d56dd95b4c3a7755a8f76f780b702af03de5a546b78cf3bfa56fad96b877169d02ca4c4f83dcfdcee8ee7a1efd67983ead21580b0ea4bb94c78db92da3a
+ data.tar.gz: b0019c53bed440256de36853a9b9973f58bd2fd019e90ca1b8178c727ad64572a3a0b3fc4ded49c64d6f4134935b72f124ba4c8780f9ed01b3412bb91c953499
data/README.md
CHANGED
@@ -2,6 +2,7 @@
|
|
2
2
|
[![Gem Version](https://badge.fury.io/rb/proxy_fetcher.svg)](http://badge.fury.io/rb/proxy_fetcher)
|
3
3
|
[![Build Status](https://travis-ci.org/nbulaj/proxy_fetcher.svg?branch=master)](https://travis-ci.org/nbulaj/proxy_fetcher)
|
4
4
|
[![Coverage Status](https://coveralls.io/repos/github/nbulaj/proxy_fetcher/badge.svg)](https://coveralls.io/github/nbulaj/proxy_fetcher)
|
5
|
+
[![Code Climate](https://codeclimate.com/github/nbulaj/proxy_fetcher/badges/gpa.svg)](https://codeclimate.com/github/nbulaj/proxy_fetcher)
|
5
6
|
[![License](http://img.shields.io/badge/license-MIT-brightgreen.svg)](#license)
|
6
7
|
|
7
8
|
This gem can help your Ruby application to make HTTP(S) requests from proxy by fetching and validating actual
|
@@ -10,12 +11,15 @@ proxy lists from the different providers like [HideMyName](https://hidemy.name/e
|
|
10
11
|
It gives you a `Manager` class that can load proxy list, validate it and return random or specific proxy entry. Take a look
|
11
12
|
at the documentation below to find all the gem features.
|
12
13
|
|
14
|
+
Also this gem can be used as standalone solution for downloading and validating proxy lists from the different providers.
|
15
|
+
Check out the usage examples below.
|
16
|
+
|
13
17
|
## Installation
|
14
18
|
|
15
19
|
If using bundler, first add 'proxy_fetcher' to your Gemfile:
|
16
20
|
|
17
21
|
```ruby
|
18
|
-
gem 'proxy_fetcher', '~> 0.
|
22
|
+
gem 'proxy_fetcher', '~> 0.3'
|
19
23
|
```
|
20
24
|
|
21
25
|
or if you want to use the latest version (from `master` branch), then:
|
@@ -33,12 +37,14 @@ bundle install
|
|
33
37
|
Otherwise simply install the gem:
|
34
38
|
|
35
39
|
```sh
|
36
|
-
gem install proxy_fetcher -v '0.
|
40
|
+
gem install proxy_fetcher -v '0.3'
|
37
41
|
```
|
38
42
|
|
39
43
|
## Example of usage
|
40
44
|
|
41
|
-
|
45
|
+
### In Ruby application
|
46
|
+
|
47
|
+
Get current proxy list without validation:
|
42
48
|
|
43
49
|
```ruby
|
44
50
|
manager = ProxyFetcher::Manager.new # will immediately load proxy list from the server
|
@@ -48,7 +54,7 @@ manager.proxies
|
|
48
54
|
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
|
49
55
|
```
|
50
56
|
|
51
|
-
You can initialize proxy manager without
|
57
|
+
You can initialize proxy manager without immediate load of proxy list from the remote server by passing `refresh: false` on initialization:
|
52
58
|
|
53
59
|
```ruby
|
54
60
|
manager = ProxyFetcher::Manager.new(refresh: false) # just initialize class instance
|
@@ -57,7 +63,13 @@ manager.proxies
|
|
57
63
|
#=> []
|
58
64
|
```
|
59
65
|
|
60
|
-
|
66
|
+
If you want to clean the current proxy list of dead servers that do not respond to requests, then you can just call the `cleanup!` method:
|
67
|
+
|
68
|
+
```ruby
|
69
|
+
manager.cleanup! # or manager.validate!
|
70
|
+
```
|
71
|
+
|
72
|
+
Get raw proxy URLs as Strings:
|
61
73
|
|
62
74
|
```ruby
|
63
75
|
manager = ProxyFetcher::Manager.new
|
@@ -76,6 +88,58 @@ manager.refresh_list! # or manager.fetch!
|
|
76
88
|
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
|
77
89
|
```
|
78
90
|
|
91
|
+
If you need to filter proxy list, for example, by country or response time and selected provider supports filtering by GET params, then you
|
92
|
+
can pass your filters to the Manager instance like that:
|
93
|
+
|
94
|
+
```ruby
|
95
|
+
ProxyFetcher.config.provider = :hide_my_name
|
96
|
+
|
97
|
+
manager = ProxyFetcher::Manager.new(filters: { country: 'AO', maxtime: '500' })
|
98
|
+
manager.proxies
|
99
|
+
|
100
|
+
# => [...]
|
101
|
+
```
|
102
|
+
|
103
|
+
*NOTE*: not all the providers support filtering. Take a look at the provider class to see if it supports custom filters.
|
104
|
+
|
105
|
+
You can use two methods to get the first proxy from the list:
|
106
|
+
|
107
|
+
* `get` or aliased `pop` (will return first proxy and move it to the end of the list)
|
108
|
+
* `get!` or aliased `pop!` (will return first **connectable** proxy and move it to the end of the list; all the proxies till the working one will be removed)
|
109
|
+
|
110
|
+
Or you can get a random proxy by calling `manager.random_proxy` or its alias `manager.random`.
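
The rotation behaviour described for `get` and `get!` can be sketched with a plain array. This is only an illustration under assumptions: `RotatingList` and its block-based liveness check are hypothetical names, not the gem's `Manager` implementation.

```ruby
# Hypothetical stand-in for the manager's proxy list; not the gem's code.
class RotatingList
  def initialize(items)
    @items = items.dup
  end

  def to_a
    @items.dup
  end

  # Return the first item and move it to the end of the list.
  def get
    return if @items.empty?

    @items.rotate!
    @items.last
  end

  # Return the first item for which the block is truthy, dropping every
  # item in front of it, then move that item to the end of the list.
  def get!
    index = @items.index { |item| yield(item) }
    return if index.nil?

    @items.shift(index)
    get
  end
end

list = RotatingList.new(%w[a b c d])
list.get                         # first item, now rotated to the end
list.get! { |item| item == 'd' } # drops the items in front of 'd'
```

In the gem the liveness check would presumably be `connectable?` on each proxy; the block here only stands in for it.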
|
111
|
+
|
112
|
+
### Standalone
|
113
|
+
|
114
|
+
All you need to use this gem is Ruby >= 2.0 (2.3 is recommended). You can install it in different ways. If you are using Ubuntu Xenial (16.04LTS)
|
115
|
+
then you already have Ruby 2.3 installed. In other cases you can install it with [RVM](https://rvm.io/) or [rbenv](https://github.com/rbenv/rbenv).
|
116
|
+
|
117
|
+
Just install the gem by running `gem install proxy_fetcher` in your terminal and run it:
|
118
|
+
|
119
|
+
```bash
|
120
|
+
proxy_fetcher >> proxies.txt # Will download proxies, validate them and write to file
|
121
|
+
```
|
122
|
+
|
123
|
+
If you need a list of proxies in JSON then pass `--json` argument to the command:
|
124
|
+
|
125
|
+
```bash
|
126
|
+
proxy_fetcher --json
|
127
|
+
|
128
|
+
# Will print:
|
129
|
+
# {"proxies":["https://120.26.206.178:8888","https://119.61.13.242:1080","https://117.40.213.26:1080","https://92.62.72.242:1080",
|
130
|
+
# "https://58.20.41.172:1080","https://204.116.192.151:35923","https://190.5.96.58:1080","https://170.250.109.97:35923",
|
131
|
+
# "https://121.41.82.99:1080","https://77.53.105.155:35923"]}
|
132
|
+
|
133
|
+
```
|
134
|
+
|
135
|
+
To get all the possible options run:
|
136
|
+
|
137
|
+
```bash
|
138
|
+
proxy_fetcher --help
|
139
|
+
```
|
140
|
+
|
141
|
+
## Proxy object
|
142
|
+
|
79
143
|
Every proxy is a `ProxyFetcher::Proxy` object that has next readers (instance variables):
|
80
144
|
|
81
145
|
* `addr` (IP address)
|
@@ -84,7 +148,7 @@ Every proxy is a `ProxyFetcher::Proxy` object that has next readers (instance va
|
|
84
148
|
* `response_time` (5217 for example)
|
85
149
|
* `speed` (`:slow`, `:medium` or `:fast`. **Note:** depends on the proxy provider and can be `nil`)
|
86
150
|
* `type` (URI schema, HTTP or HTTPS)
|
87
|
-
* `
|
151
|
+
* `anonymity` (`Low`, `Elite proxy` or `High +KA` for example)
|
88
152
|
|
89
153
|
Also you can call next instance methods for every Proxy object:
|
90
154
|
|
@@ -94,18 +158,7 @@ Also you can call next instance methods for every Proxy object:
|
|
94
158
|
* `uri` (returns `URI::Generic` object)
|
95
159
|
* `url` (returns a formatted URL like "_http://IP:PORT_" )
|
96
160
|
|
97
|
-
You can
|
98
|
-
|
99
|
-
* `get` or aliased `pop` (will return first proxy and move it to the end of the list)
|
100
|
-
* `get!` or aliased `pop!` (will return first **connectable** proxy and move it to the end of the list; all the proxies till the working one will be removed)
|
101
|
-
|
102
|
-
If you wanna clear current proxy manager list from dead servers, you can just call `cleanup!` method:
|
103
|
-
|
104
|
-
```ruby
|
105
|
-
manager.cleanup! # or manager.validate!
|
106
|
-
```
|
107
|
-
|
108
|
-
You can sort or find any proxy by speed using next 3 instance methods:
|
161
|
+
You can sort or find any proxy by speed using next 3 instance methods (if it is available for the specific provider):
|
109
162
|
|
110
163
|
* `fast?`
|
111
164
|
* `medium?`
|
@@ -117,26 +170,27 @@ To change open/read timeout for `cleanup!` and `connectable?` methods you need t
|
|
117
170
|
|
118
171
|
```ruby
|
119
172
|
ProxyFetcher.configure do |config|
|
120
|
-
config.
|
121
|
-
config.open_timeout = 1 # default is 3
|
173
|
+
config.connection_timeout = 1 # default is 3
|
122
174
|
end
|
123
175
|
|
124
176
|
manager = ProxyFetcher::Manager.new
|
125
177
|
manager.cleanup!
|
126
178
|
```
|
127
179
|
|
128
|
-
ProxyFetcher uses simple Ruby solution for dealing with HTTP requests - `net/http` library. If you wanna add, for example, your custom provider that
|
129
|
-
was developed as a Single Page Application (SPA) with some JavaScript, then you will need something like [
|
180
|
+
ProxyFetcher uses simple Ruby solution for dealing with HTTP(S) requests - `net/http` library from the stdlib. If you wanna add, for example, your custom provider that
|
181
|
+
was developed as a Single Page Application (SPA) with some JavaScript, then you will need something like [selenium-webdriver](https://github.com/SeleniumHQ/selenium/tree/master/rb)
|
130
182
|
to properly load the content of the website. For those and other cases you can write your own class for fetching HTML content by the URL and setup it
|
131
183
|
in the ProxyFetcher config:
|
132
184
|
|
133
185
|
```ruby
|
134
186
|
class MyHTTPClient
|
135
|
-
|
136
|
-
|
137
|
-
|
138
|
-
|
139
|
-
|
187
|
+
# [IMPORTANT]: below methods are required!
|
188
|
+
def self.fetch(url)
|
189
|
+
# ... some magic to return proper HTML ...
|
190
|
+
end
|
191
|
+
|
192
|
+
def self.connectable?(url)
|
193
|
+
# ... some magic to check if url is connectable ...
|
140
194
|
end
|
141
195
|
end
|
142
196
|
|
@@ -149,14 +203,17 @@ manager.proxies
|
|
149
203
|
# @response_time=5217, @speed=48, @type="HTTP", @anonymity="High">, ... ]
|
150
204
|
```
|
151
205
|
|
206
|
+
You can take a look at the [lib/proxy_fetcher/utils/http_client.rb](lib/proxy_fetcher/utils/http_client.rb) for an example.
|
207
|
+
|
152
208
|
## Providers
|
153
209
|
|
154
210
|
Currently ProxyFetcher can deal with next proxy providers (services):
|
155
211
|
|
156
212
|
* Hide My Name (default one)
|
157
213
|
* Free Proxy List
|
158
|
-
* SSL Proxies
|
214
|
+
* Free SSL Proxies
|
159
215
|
* Proxy Docker
|
216
|
+
* Proxy List
|
160
217
|
* XRoxy
|
161
218
|
|
162
219
|
If you wanna use one of them just setup required in the config:
|
@@ -176,14 +233,8 @@ Also you can write your own provider. All you need is to create a class, that wo
|
|
176
233
|
ProxyFetcher::Configuration.register_provider(:your_provider, YourProviderClass)
|
177
234
|
```
|
178
235
|
|
179
|
-
Provider class must implement `self.load_proxy_list` and `#
|
180
|
-
provider HTML page with proxy list. Take a look at the
|
181
|
-
|
182
|
-
## TODO
|
183
|
-
|
184
|
-
* Add proxy filters
|
185
|
-
* Code refactoring
|
186
|
-
* Rewrite specs
|
236
|
+
Provider class must implement `self.load_proxy_list` and `#to_proxy(html_element)` methods that will load and parse
|
237
|
+
provider HTML page with proxy list. Take a look at the existing providers in the [lib/proxy_fetcher/providers](lib/proxy_fetcher/providers) directory.
|
187
238
|
|
188
239
|
## Contributing
|
189
240
|
|
@@ -206,8 +257,6 @@ Thanks.
|
|
206
257
|
|
207
258
|
## License
|
208
259
|
|
209
|
-
proxy_fetcher gem is released under the [MIT License](http://www.opensource.org/licenses/MIT).
|
260
|
+
`proxy_fetcher` gem is released under the [MIT License](http://www.opensource.org/licenses/MIT).
|
210
261
|
|
211
262
|
Copyright (c) 2017 Nikita Bulai (bulajnikita@gmail.com).
|
212
|
-
|
213
|
-
Some parser code (c) [pifleo](https://gist.github.com/pifleo/3889803)
|
data/bin/proxy_fetcher
ADDED
@@ -0,0 +1,57 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
|
3
|
+
require 'optparse'
|
4
|
+
require 'proxy_fetcher'
|
5
|
+
|
6
|
+
options = {
|
7
|
+
validate: true,
|
8
|
+
json: false
|
9
|
+
}
|
10
|
+
|
11
|
+
OptionParser.new do |opts|
|
12
|
+
opts.banner = 'Usage: proxy_fetcher [OPTIONS]'
|
13
|
+
|
14
|
+
opts.on('-h', '--help', '# Show this help message and quit') do
|
15
|
+
puts opts
|
16
|
+
exit(0)
|
17
|
+
end
|
18
|
+
|
19
|
+
opts.on('-p', '--provider=NAME', '# Use specific proxy provider') do |value|
|
20
|
+
provider_name = value.downcase
|
21
|
+
|
22
|
+
unless ProxyFetcher::Configuration.providers.include?(provider_name.to_sym)
|
23
|
+
possible_providers = ProxyFetcher::Configuration.providers.keys
|
24
|
+
|
25
|
+
puts "Unknown provider - '#{value}'.\nUse one of the following: #{possible_providers.join(', ')}."
|
26
|
+
exit(0)
|
27
|
+
end
|
28
|
+
|
29
|
+
options[:provider] = provider_name
|
30
|
+
end
|
31
|
+
|
32
|
+
opts.on('-n', '--no-validate', '# Dump all the proxies without validation') do
|
33
|
+
options[:validate] = false
|
34
|
+
end
|
35
|
+
|
36
|
+
opts.on('-t', '--timeout=SECONDS', Integer, '# Connection timeout in seconds') do |value|
|
37
|
+
options[:timeout] = value
|
38
|
+
end
|
39
|
+
|
40
|
+
opts.on('-j', '--json', '# Dump proxies to the JSON format') do
|
41
|
+
options[:json] = true
|
42
|
+
end
|
43
|
+
end.parse!
|
44
|
+
|
45
|
+
ProxyFetcher.config.provider = options[:provider] if options[:provider]
|
46
|
+
ProxyFetcher.config.connection_timeout = options[:timeout] if options[:timeout]
|
47
|
+
|
48
|
+
manager = ProxyFetcher::Manager.new
|
49
|
+
manager.validate! if options[:validate]
|
50
|
+
|
51
|
+
if options[:json]
|
52
|
+
require 'json'
|
53
|
+
|
54
|
+
puts JSON.generate(proxies: manager.raw_proxies)
|
55
|
+
else
|
56
|
+
puts manager.raw_proxies
|
57
|
+
end
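
For readers unfamiliar with `OptionParser`, the option handling in this script can be tried in isolation. The sketch below mirrors three of the flags from the diff but parses an explicit array instead of `ARGV`; the `parse_cli` helper name is an assumption introduced for illustration and does not exist in the gem.

```ruby
require 'optparse'

# Hypothetical helper: parses a list of CLI arguments into an options hash.
def parse_cli(argv)
  options = { validate: true, json: false }

  OptionParser.new do |opts|
    opts.on('-n', '--no-validate', 'Dump proxies without validation') do
      options[:validate] = false
    end

    # The Integer argument makes OptionParser coerce the value and
    # raise OptionParser::InvalidArgument for non-numeric input.
    opts.on('-t', '--timeout=SECONDS', Integer, 'Connection timeout in seconds') do |value|
      options[:timeout] = value
    end

    opts.on('-j', '--json', 'Dump proxies as JSON') do
      options[:json] = true
    end
  end.parse!(argv.dup)

  options
end

parse_cli(%w[--no-validate --timeout=5 --json])
```

Parsing a copy of the argument list keeps the caller's array untouched, which makes the helper easy to exercise from a spec.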
|
data/lib/proxy_fetcher.rb
CHANGED
@@ -1,16 +1,22 @@
|
|
1
1
|
require 'uri'
|
2
2
|
require 'net/http'
|
3
|
+
require 'openssl'
|
3
4
|
require 'nokogiri'
|
5
|
+
require 'ostruct'
|
4
6
|
|
5
7
|
require 'proxy_fetcher/configuration'
|
6
8
|
require 'proxy_fetcher/proxy'
|
7
9
|
require 'proxy_fetcher/manager'
|
8
|
-
|
10
|
+
|
11
|
+
require 'proxy_fetcher/utils/http_client'
|
12
|
+
require 'proxy_fetcher/utils/html'
|
13
|
+
|
9
14
|
require 'proxy_fetcher/providers/base'
|
10
15
|
require 'proxy_fetcher/providers/free_proxy_list'
|
11
16
|
require 'proxy_fetcher/providers/free_proxy_list_ssl'
|
12
17
|
require 'proxy_fetcher/providers/hide_my_name'
|
13
18
|
require 'proxy_fetcher/providers/proxy_docker'
|
19
|
+
require 'proxy_fetcher/providers/proxy_list'
|
14
20
|
require 'proxy_fetcher/providers/xroxy'
|
15
21
|
|
16
22
|
module ProxyFetcher
|
@@ -2,9 +2,10 @@ module ProxyFetcher
|
|
2
2
|
class Configuration
|
3
3
|
UnknownProvider = Class.new(StandardError)
|
4
4
|
RegisteredProvider = Class.new(StandardError)
|
5
|
+
WrongHttpClient = Class.new(StandardError)
|
5
6
|
|
6
|
-
attr_accessor :
|
7
|
-
attr_accessor :
|
7
|
+
attr_accessor :http_client, :connection_timeout
|
8
|
+
attr_accessor :provider
|
8
9
|
|
9
10
|
class << self
|
10
11
|
def providers
|
@@ -12,15 +13,18 @@ module ProxyFetcher
|
|
12
13
|
end
|
13
14
|
|
14
15
|
def register_provider(name, klass)
|
15
|
-
raise RegisteredProvider, "
|
16
|
+
raise RegisteredProvider, "`#{name}` provider already registered!" if providers.key?(name.to_sym)
|
16
17
|
|
17
18
|
providers[name.to_sym] = klass
|
18
19
|
end
|
19
20
|
end
|
20
21
|
|
21
22
|
def initialize
|
22
|
-
|
23
|
-
|
23
|
+
reset!
|
24
|
+
end
|
25
|
+
|
26
|
+
def reset!
|
27
|
+
@connection_timeout = 3
|
24
28
|
@http_client = HTTPClient
|
25
29
|
|
26
30
|
self.provider = :hide_my_name # currently default one
|
@@ -29,7 +33,15 @@ module ProxyFetcher
|
|
29
33
|
def provider=(name)
|
30
34
|
@provider = self.class.providers[name.to_sym]
|
31
35
|
|
32
|
-
raise UnknownProvider, "unregistered proxy provider
|
36
|
+
raise UnknownProvider, "unregistered proxy provider `#{name}`!" if @provider.nil?
|
37
|
+
end
|
38
|
+
|
39
|
+
def http_client=(klass)
|
40
|
+
unless klass.respond_to?(:fetch, :connectable?)
|
41
|
+
raise WrongHttpClient, "#{klass} must respond to #fetch and #connectable? class methods!"
|
42
|
+
end
|
43
|
+
|
44
|
+
@http_client = klass
|
33
45
|
end
|
34
46
|
end
|
35
47
|
end
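
One detail in the `http_client=` guard above may be worth double-checking: `Object#respond_to?` takes a single method name plus an optional `include_all` flag, so `respond_to?(:fetch, :connectable?)` tests only `:fetch` (with private methods included). The sketch below demonstrates this with two hypothetical classes and shows one way to check both methods; `FetchOnlyClient`, `FullClient`, `REQUIRED_METHODS` and `valid_http_client?` are illustrative names, not part of the gem.

```ruby
# Hypothetical client that implements only one of the two required methods.
class FetchOnlyClient
  def self.fetch(url)
    url
  end
end

# Hypothetical client that implements both.
class FullClient
  def self.fetch(url)
    url
  end

  def self.connectable?(_url)
    true
  end
end

REQUIRED_METHODS = %i[fetch connectable?].freeze

# Check every required method name separately.
def valid_http_client?(klass)
  REQUIRED_METHODS.all? { |name| klass.respond_to?(name) }
end

# The second argument is treated as the include_all flag, not a method name,
# so this returns true even though connectable? is missing.
FetchOnlyClient.respond_to?(:fetch, :connectable?)

valid_http_client?(FetchOnlyClient) # false: connectable? is not defined
valid_http_client?(FullClient)      # true
```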
|
@@ -1,10 +1,12 @@
|
|
1
1
|
module ProxyFetcher
|
2
2
|
class Manager
|
3
|
-
attr_reader :proxies
|
3
|
+
attr_reader :proxies, :filters
|
4
4
|
|
5
5
|
# refresh: true - load proxy list from the remote server on initialization
|
6
6
|
# refresh: false - just initialize the class, proxy list will be empty ([])
|
7
|
-
def initialize(refresh: true)
|
7
|
+
def initialize(refresh: true, filters: {})
|
8
|
+
@filters = filters
|
9
|
+
|
8
10
|
if refresh
|
9
11
|
refresh_list!
|
10
12
|
else
|
@@ -14,8 +16,7 @@ module ProxyFetcher
|
|
14
16
|
|
15
17
|
# Update current proxy list from the provider
|
16
18
|
def refresh_list!
|
17
|
-
|
18
|
-
@proxies = rows.map { |row| Proxy.new(row) }
|
19
|
+
@proxies = ProxyFetcher.config.provider.fetch_proxies!(filters)
|
19
20
|
end
|
20
21
|
|
21
22
|
alias fetch! refresh_list!
|
@@ -56,10 +57,12 @@ module ProxyFetcher
|
|
56
57
|
alias validate! cleanup!
|
57
58
|
|
58
59
|
# Return random proxy
|
59
|
-
def
|
60
|
+
def random_proxy
|
60
61
|
proxies.sample
|
61
62
|
end
|
62
63
|
|
64
|
+
alias random random_proxy
|
65
|
+
|
63
66
|
# Returns array of proxy URLs (just schema + host + port)
|
64
67
|
def raw_proxies
|
65
68
|
proxies.map(&:url)
|
@@ -1,25 +1,52 @@
|
|
1
|
+
require 'forwardable'
|
2
|
+
|
1
3
|
module ProxyFetcher
|
2
4
|
module Providers
|
3
5
|
class Base
|
4
|
-
|
6
|
+
extend Forwardable
|
5
7
|
|
6
|
-
|
7
|
-
|
8
|
-
|
8
|
+
def_delegators ProxyFetcher::HTML, :clear, :convert_to_int
|
9
|
+
|
10
|
+
PROXY_TYPES = [
|
11
|
+
HTTP = 'HTTP'.freeze,
|
12
|
+
HTTPS = 'HTTPS'.freeze
|
13
|
+
].freeze
|
9
14
|
|
10
|
-
|
11
|
-
|
15
|
+
attr_reader :proxy
|
16
|
+
|
17
|
+
def fetch_proxies!(filters = {})
|
18
|
+
load_proxy_list(filters).map { |html| to_proxy(html) }
|
12
19
|
end
|
13
20
|
|
14
21
|
class << self
|
15
|
-
def
|
16
|
-
new
|
22
|
+
def fetch_proxies!(filters = {})
|
23
|
+
new.fetch_proxies!(filters)
|
17
24
|
end
|
25
|
+
end
|
18
26
|
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
27
|
+
protected
|
28
|
+
|
29
|
+
# Loads HTML document with Nokogiri by the URL combined with custom filters
|
30
|
+
def load_document(url, filters = {})
|
31
|
+
uri = URI.parse(url)
|
32
|
+
uri.query = URI.encode_www_form(filters) if filters.any?
|
33
|
+
|
34
|
+
Nokogiri::HTML(ProxyFetcher.config.http_client.fetch(uri.to_s))
|
35
|
+
end
|
36
|
+
|
37
|
+
# Get HTML elements with proxy info
|
38
|
+
def load_proxy_list(*)
|
39
|
+
raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
|
40
|
+
end
|
41
|
+
|
42
|
+
# Convert HTML element with proxy info to ProxyFetcher::Proxy instance
|
43
|
+
def to_proxy(*)
|
44
|
+
raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
|
45
|
+
end
|
46
|
+
|
47
|
+
# Return normalized HTML element content by selector
|
48
|
+
def parse_element(element, selector, method = :at_xpath)
|
49
|
+
clear(element.public_send(method, selector).content)
|
23
50
|
end
|
24
51
|
end
|
25
52
|
end
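
The URL-building step inside `load_document` can be tried without any network access. The sketch below repeats the same two `URI` calls in a standalone helper; `build_provider_url` is a hypothetical name used only for this illustration.

```ruby
require 'uri'

# Hypothetical helper: append filters to a provider URL as a query string.
def build_provider_url(url, filters = {})
  uri = URI.parse(url)
  # Assigning uri.query replaces any query string already present in the URL.
  uri.query = URI.encode_www_form(filters) if filters.any?
  uri.to_s
end

build_provider_url('https://hidemy.name/en/proxy-list/', { country: 'AO', maxtime: '500' })
build_provider_url('https://free-proxy-list.net/') # no filters: URL is unchanged
```

Filter values are form-encoded, so a space in a value is emitted as `+` rather than passed through verbatim.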
|
@@ -3,42 +3,27 @@ module ProxyFetcher
|
|
3
3
|
class FreeProxyList < Base
|
4
4
|
PROVIDER_URL = 'https://free-proxy-list.net/'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
end
|
6
|
+
# [NOTE] Doesn't support filtering
|
7
|
+
def load_proxy_list(*)
|
8
|
+
doc = load_document(PROVIDER_URL, {})
|
9
|
+
doc.xpath('//table[@id="proxylisttable"]/tbody/tr')
|
11
10
|
end
|
12
11
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
when 3 then
|
21
|
-
set!(:country, td.content.strip)
|
22
|
-
when 4
|
23
|
-
set!(:anonymity, td.content.strip)
|
24
|
-
when 6
|
25
|
-
set!(:type, parse_type(td))
|
26
|
-
else
|
27
|
-
# nothing
|
28
|
-
end
|
12
|
+
def to_proxy(html_element)
|
13
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
14
|
+
proxy.addr = parse_element(html_element, 'td[1]')
|
15
|
+
proxy.port = convert_to_int(parse_element(html_element, 'td[2]'))
|
16
|
+
proxy.country = parse_element(html_element, 'td[4]')
|
17
|
+
proxy.anonymity = parse_element(html_element, 'td[5]')
|
18
|
+
proxy.type = parse_type(html_element)
|
29
19
|
end
|
30
20
|
end
|
31
21
|
|
32
22
|
private
|
33
23
|
|
34
|
-
def parse_type(
|
35
|
-
type = td
|
36
|
-
|
37
|
-
if type && type.downcase.include?('yes')
|
38
|
-
'HTTPS'
|
39
|
-
else
|
40
|
-
'HTTP'
|
41
|
-
end
|
24
|
+
def parse_type(element)
|
25
|
+
type = parse_element(element, 'td[6]')
|
26
|
+
type && type.casecmp('yes').zero? ? HTTPS : HTTP
|
42
27
|
end
|
43
28
|
end
|
44
29
|
|
@@ -3,29 +3,19 @@ module ProxyFetcher
|
|
3
3
|
class FreeProxyListSSL < Base
|
4
4
|
PROVIDER_URL = 'https://www.sslproxies.org/'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
end
|
6
|
+
# [NOTE] Doesn't support filtering
|
7
|
+
def load_proxy_list(*)
|
8
|
+
doc = load_document(PROVIDER_URL, {})
|
9
|
+
doc.xpath('//table[@id="proxylisttable"]/tbody/tr')
|
11
10
|
end
|
12
11
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
when 3 then
|
21
|
-
set!(:country, td.content.strip)
|
22
|
-
when 4
|
23
|
-
set!(:anonymity, td.content.strip)
|
24
|
-
when 6
|
25
|
-
set!(:type, 'HTTPS')
|
26
|
-
else
|
27
|
-
# nothing
|
28
|
-
end
|
12
|
+
def to_proxy(html_element)
|
13
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
14
|
+
proxy.addr = parse_element(html_element, 'td[1]')
|
15
|
+
proxy.port = convert_to_int(parse_element(html_element, 'td[2]'))
|
16
|
+
proxy.country = parse_element(html_element, 'td[4]')
|
17
|
+
proxy.anonymity = parse_element(html_element, 'td[5]')
|
18
|
+
proxy.type = HTTPS
|
29
19
|
end
|
30
20
|
end
|
31
21
|
end
|
@@ -1,51 +1,49 @@
|
|
1
1
|
module ProxyFetcher
|
2
2
|
module Providers
|
3
3
|
class HideMyName < Base
|
4
|
-
PROVIDER_URL = 'https://hidemy.name/en/proxy-list
|
4
|
+
PROVIDER_URL = 'https://hidemy.name/en/proxy-list/'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
doc.xpath('//table[@class="proxy__t"]/tbody/tr')
|
10
|
-
end
|
6
|
+
def load_proxy_list(filters = { type: 'hs' })
|
7
|
+
doc = load_document(PROVIDER_URL, filters)
|
8
|
+
doc.xpath('//table[@class="proxy__t"]/tbody/tr')
|
11
9
|
end
|
12
10
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
|
23
|
-
|
24
|
-
|
25
|
-
|
26
|
-
set!(:speed, speed_from_response_time(response_time))
|
27
|
-
when 4
|
28
|
-
set!(:type, parse_type(td))
|
29
|
-
when 5
|
30
|
-
set!(:anonymity, td.content.strip)
|
31
|
-
else
|
32
|
-
# nothing
|
33
|
-
end
|
11
|
+
def to_proxy(html_element)
|
12
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
13
|
+
proxy.addr = parse_element(html_element, 'td[1]')
|
14
|
+
proxy.port = convert_to_int(parse_element(html_element, 'td[2]'))
|
15
|
+
proxy.anonymity = parse_element(html_element, 'td[6]')
|
16
|
+
|
17
|
+
proxy.country = parse_country(html_element)
|
18
|
+
proxy.type = parse_type(html_element)
|
19
|
+
|
20
|
+
response_time = parse_response_time(html_element)
|
21
|
+
|
22
|
+
proxy.response_time = response_time
|
23
|
+
proxy.speed = speed_from_response_time(response_time)
|
34
24
|
end
|
35
25
|
end
|
36
26
|
|
37
27
|
private
|
38
28
|
|
39
|
-
def
|
40
|
-
|
29
|
+
def parse_country(element)
|
30
|
+
clear(element.at_xpath('*//span[1]/following-sibling::text()[1]').content)
|
31
|
+
end
|
32
|
+
|
33
|
+
def parse_type(element)
|
34
|
+
schemas = parse_element(element, 'td[5]')
|
41
35
|
|
42
36
|
if schemas && schemas.downcase.include?('https')
|
43
|
-
|
37
|
+
HTTPS
|
44
38
|
else
|
45
|
-
|
39
|
+
HTTP
|
46
40
|
end
|
47
41
|
end
|
48
42
|
|
43
|
+
def parse_response_time(element)
|
44
|
+
convert_to_int(element.at_xpath('td[4]').content.strip[/\d+/])
|
45
|
+
end
|
46
|
+
|
49
47
|
def speed_from_response_time(response_time)
|
50
48
|
if response_time < 1500
|
51
49
|
:fast
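
Reading the diff as shown, `Manager` always passes its `filters` hash (which defaults to `{}`) down to `load_proxy_list`, and in Ruby a default parameter value is used only when the argument is omitted. If that reading is right, the `{ type: 'hs' }` default above would not apply to calls made through the manager. The sketch below illustrates the language behaviour and one possible alternative; `DEFAULT_FILTERS`, `with_default_param` and `with_merged_defaults` are hypothetical names, not the gem's API.

```ruby
DEFAULT_FILTERS = { type: 'hs' }.freeze

# The default is used only when the caller passes no argument at all.
def with_default_param(filters = DEFAULT_FILTERS)
  filters
end

# Defaults are merged in, so caller-supplied filters extend or override them.
def with_merged_defaults(filters = {})
  DEFAULT_FILTERS.merge(filters)
end

with_default_param                      # the default hash
with_default_param({})                  # an empty hash: the default is skipped
with_merged_defaults({})                # still contains type: 'hs'
with_merged_defaults({ country: 'AO' }) # type: 'hs' plus country: 'AO'
```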
|
@@ -3,30 +3,21 @@ module ProxyFetcher
|
|
3
3
|
class ProxyDocker < Base
|
4
4
|
PROVIDER_URL = 'https://www.proxydocker.com/en'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
|
10
|
-
end
|
6
|
+
# [NOTE] Doesn't support direct filters
|
7
|
+
def load_proxy_list(*)
|
8
|
+
doc = load_document(PROVIDER_URL, {})
|
9
|
+
doc.xpath('//table[contains(@class, "table")]/tr[(not(@id="proxy-table-header")) and (count(td)>2)]')
|
11
10
|
end
|
12
11
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
12
|
+
def to_proxy(html_element)
|
13
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
14
|
+
uri = URI("//#{parse_element(html_element, 'td[1]')}")
|
15
|
+
proxy.addr = uri.host
|
16
|
+
proxy.port = uri.port
|
18
17
|
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
set!(:type, td.content.strip)
|
23
|
-
when 2
|
24
|
-
set!(:anonymity, td.content.strip)
|
25
|
-
when 4 then
|
26
|
-
set!(:country, td.content.strip)
|
27
|
-
else
|
28
|
-
# nothing
|
29
|
-
end
|
18
|
+
proxy.type = parse_element(html_element, 'td[2]')
|
19
|
+
proxy.anonymity = parse_element(html_element, 'td[3]')
|
20
|
+
proxy.country = parse_element(html_element, 'td[5]')
|
30
21
|
end
|
31
22
|
end
|
32
23
|
end
|
@@ -0,0 +1,35 @@
|
|
1
|
+
require 'base64'
|
2
|
+
|
3
|
+
module ProxyFetcher
|
4
|
+
module Providers
|
5
|
+
class ProxyList < Base
|
6
|
+
PROVIDER_URL = 'https://proxy-list.org/english/index.php'.freeze
|
7
|
+
|
8
|
+
def load_proxy_list(filters = {})
|
9
|
+
doc = load_document(PROVIDER_URL, filters)
|
10
|
+
doc.css('.table-wrap .table ul')
|
11
|
+
end
|
12
|
+
|
13
|
+
def to_proxy(html_element)
|
14
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
15
|
+
uri = parse_proxy_uri(html_element)
|
16
|
+
proxy.addr = uri.host
|
17
|
+
proxy.port = uri.port
|
18
|
+
|
19
|
+
proxy.type = parse_element(html_element, 'li[2]')
|
20
|
+
proxy.anonymity = parse_element(html_element, 'li[4]')
|
21
|
+
proxy.country = clear(html_element.at_xpath("li[5]//span[@class='country']").attr('title'))
|
22
|
+
end
|
23
|
+
end
|
24
|
+
|
25
|
+
private
|
26
|
+
|
27
|
+
def parse_proxy_uri(element)
|
28
|
+
full_addr = ::Base64.decode64(element.at('li script').inner_html.match(/'(.+)'/)[1])
|
29
|
+
URI.parse("http://#{full_addr}")
|
30
|
+
end
|
31
|
+
end
|
32
|
+
|
33
|
+
ProxyFetcher::Configuration.register_provider(:proxy_list, ProxyList)
|
34
|
+
end
|
35
|
+
end
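
The decoding step in `parse_proxy_uri` can be exercised on its own. The sketch below is an illustration under assumptions: `decode_proxy_uri` is a hypothetical helper, the `Proxy('...')` wrapper and the address are made up, and `String#unpack1('m')` is used for Base64 decoding so that the example depends only on core Ruby and `uri`.

```ruby
require 'uri'

# Hypothetical helper: extract the quoted Base64 payload from an inline
# script and turn the decoded "host:port" string into a URI.
def decode_proxy_uri(script_source)
  encoded = script_source.match(/'(.+)'/)[1]
  full_addr = encoded.unpack1('m') # Base64 decoding via String#unpack1
  URI.parse("http://#{full_addr}")
end

# Build a sample snippet; the address comes from a documentation-only range.
sample_script = "Proxy('#{['203.0.113.7:8080'].pack('m0')}')"

uri = decode_proxy_uri(sample_script)
uri.host # "203.0.113.7"
uri.port # 8080
```

If the script element contained no quoted payload, `match` would return `nil` and the `[1]` lookup would raise, so a production parser would probably want a guard there.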
|
@@ -1,34 +1,21 @@
|
|
1
1
|
module ProxyFetcher
|
2
2
|
module Providers
|
3
3
|
class XRoxy < Base
|
4
|
-
PROVIDER_URL = 'http://www.xroxy.com/proxylist.php
|
4
|
+
PROVIDER_URL = 'http://www.xroxy.com/proxylist.php'.freeze
|
5
5
|
|
6
|
-
|
7
|
-
|
8
|
-
|
9
|
-
doc.xpath('//div[@id="content"]/table[1]/tr[contains(@class, "row")]')
|
10
|
-
end
|
6
|
+
def load_proxy_list(filters = { type: 'All_http' })
|
7
|
+
doc = load_document(PROVIDER_URL, filters)
|
8
|
+
doc.xpath('//div[@id="content"]/table[1]/tr[contains(@class, "row")]')
|
11
9
|
end
|
12
10
|
|
13
|
-
def
|
14
|
-
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
set!(:anonymity, td.content.strip)
|
22
|
-
when 4
|
23
|
-
ssl = td.content.strip.downcase
|
24
|
-
set!(:type, ssl.include?('true') ? 'HTTPS' : 'HTTP')
|
25
|
-
when 5 then
|
26
|
-
set!(:country, td.content.strip)
|
27
|
-
when 6
|
28
|
-
set!(:response_time, Integer(td.content.strip))
|
29
|
-
else
|
30
|
-
# nothing
|
31
|
-
end
|
11
|
+
def to_proxy(html_element)
|
12
|
+
ProxyFetcher::Proxy.new.tap do |proxy|
|
13
|
+
proxy.addr = parse_element(html_element, 'td[2]')
|
14
|
+
proxy.port = convert_to_int(parse_element(html_element, 'td[3]'))
|
15
|
+
proxy.anonymity = parse_element(html_element, 'td[4]')
|
16
|
+
proxy.type = parse_element(html_element, 'td[5]').casecmp('true').zero? ? HTTPS : HTTP
|
17
|
+
proxy.country = parse_element(html_element, 'td[6]')
|
18
|
+
proxy.response_time = convert_to_int(parse_element(html_element, 'td[7]'))
|
32
19
|
end
|
33
20
|
end
|
34
21
|
end
|
data/lib/proxy_fetcher/proxy.rb
CHANGED
@@ -1,24 +1,7 @@
|
|
1
1
|
module ProxyFetcher
|
2
|
-
class Proxy
|
3
|
-
attr_reader :addr, :port, :country, :response_time, :speed, :type, :anonymity
|
4
|
-
|
5
|
-
def initialize(html_row)
|
6
|
-
ProxyFetcher.config.provider.parse_entry(html_row, self)
|
7
|
-
|
8
|
-
self
|
9
|
-
end
|
10
|
-
|
2
|
+
class Proxy < OpenStruct
|
11
3
|
def connectable?
|
12
|
-
|
13
|
-
connection.use_ssl = true if https?
|
14
|
-
connection.open_timeout = ProxyFetcher.config.open_timeout
|
15
|
-
connection.read_timeout = ProxyFetcher.config.read_timeout
|
16
|
-
|
17
|
-
connection.start { |http| return true if http.request_head('/') }
|
18
|
-
|
19
|
-
false
|
20
|
-
rescue Timeout::Error, Errno::ECONNREFUSED, Errno::ECONNRESET, Errno::ECONNABORTED
|
21
|
-
false
|
4
|
+
ProxyFetcher.config.http_client.connectable?(url)
|
22
5
|
end
|
23
6
|
|
24
7
|
alias valid? connectable?
|
@@ -0,0 +1,46 @@
|
|
1
|
+
module ProxyFetcher
|
2
|
+
class HTTPClient
|
3
|
+
attr_reader :uri, :http
|
4
|
+
|
5
|
+
def initialize(url)
|
6
|
+
@uri = URI.parse(url)
|
7
|
+
@http = Net::HTTP.new(@uri.host, @uri.port)
|
8
|
+
return unless https?
|
9
|
+
|
10
|
+
@http.use_ssl = true
|
11
|
+
@http.verify_mode = OpenSSL::SSL::VERIFY_NONE
|
12
|
+
end
|
13
|
+
|
14
|
+
def fetch
|
15
|
+
request = Net::HTTP::Get.new(@uri.to_s)
|
16
|
+
request['Connection'] = 'keep-alive'
|
17
|
+
response = @http.request(request)
|
18
|
+
response.body
|
19
|
+
end
|
20
|
+
|
21
|
+
def connectable?
|
22
|
+
@http.open_timeout = ProxyFetcher.config.connection_timeout
|
23
|
+
@http.read_timeout = ProxyFetcher.config.connection_timeout
|
24
|
+
|
25
|
+
@http.start { |connection| return true if connection.request_head('/') }
|
26
|
+
|
27
|
+
false
|
28
|
+
rescue StandardError
|
29
|
+
false
|
30
|
+
end
|
31
|
+
|
32
|
+
def https?
|
33
|
+
@uri.scheme.casecmp('https').zero?
|
34
|
+
end
|
35
|
+
|
36
|
+
class << self
|
37
|
+
def fetch(url)
|
38
|
+
new(url).fetch
|
39
|
+
end
|
40
|
+
|
41
|
+
def connectable?(url)
|
42
|
+
new(url).connectable?
|
43
|
+
end
|
44
|
+
end
|
45
|
+
end
|
46
|
+
end
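
The connection setup used by `HTTPClient` can be inspected without opening a socket, because `Net::HTTP.new` only builds the object. The sketch below is an assumption-laden illustration: `build_connection` is a hypothetical helper, and unlike the diff, which sets `OpenSSL::SSL::VERIFY_NONE` (certificate verification disabled), it keeps verification on with `VERIFY_PEER`. That difference is deliberate and is not the gem's behaviour.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: configure a Net::HTTP object without connecting.
def build_connection(url, timeout: 3)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)

  # One value for both timeouts, as connection_timeout does in the diff.
  http.open_timeout = timeout
  http.read_timeout = timeout

  if uri.scheme.casecmp('https').zero?
    http.use_ssl = true
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER # keep certificate checks on
  end

  http
end

http = build_connection('https://example.com', timeout: 1)
http.use_ssl?     # true for https URLs
http.open_timeout # 1
```

Setting the timeouts before `start` matters: they are read when the connection is opened, so assigning them afterwards would not affect an already open session.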
|
data/proxy_fetcher.gemspec
CHANGED
@@ -5,14 +5,16 @@ require 'proxy_fetcher/version'
|
|
5
5
|
Gem::Specification.new do |gem|
|
6
6
|
gem.name = 'proxy_fetcher'
|
7
7
|
gem.version = ProxyFetcher.gem_version
|
8
|
-
gem.date = '2017-08-
|
9
|
-
gem.summary = 'Ruby gem for dealing with proxy lists '
|
8
|
+
gem.date = '2017-08-21'
|
9
|
+
gem.summary = 'Ruby gem for dealing with proxy lists from different providers'
|
10
10
|
gem.description = 'This gem can help your Ruby application to make HTTP(S) requests ' \
|
11
11
|
'from proxy server by fetching and validating proxy lists from the different providers.'
|
12
12
|
gem.authors = ['Nikita Bulai']
|
13
13
|
gem.email = 'bulajnikita@gmail.com'
|
14
14
|
gem.require_paths = ['lib']
|
15
|
+
gem.bindir = 'bin'
|
15
16
|
gem.files = `git ls-files`.split($RS)
|
17
|
+
gem.executables = `git ls-files -- bin/*`.split("\n").map { |f| File.basename(f) }
|
16
18
|
gem.homepage = 'http://github.com/nbulaj/proxy_fetcher'
|
17
19
|
gem.license = 'MIT'
|
18
20
|
gem.required_ruby_version = '>= 2.2.2'
|
@@ -0,0 +1,48 @@
+require 'spec_helper'
+
+describe ProxyFetcher::Configuration do
+  before { ProxyFetcher.config.reset! }
+  after { ProxyFetcher.config.reset! }
+
+  context 'custom HTTP client' do
+    it 'successfully setups if class has all the required methods' do
+      class MyHTTPClient
+        def self.fetch(url)
+          url
+        end
+
+        def self.connectable?(*)
+          true
+        end
+      end
+
+      expect { ProxyFetcher.config.http_client = MyHTTPClient }.not_to raise_error
+    end
+
+    it 'failed on setup if required methods are missing' do
+      MyWrongHTTPClient = Class.new
+
+      expect { ProxyFetcher.config.http_client = MyWrongHTTPClient }
+        .to raise_error(ProxyFetcher::Configuration::WrongHttpClient)
+    end
+  end
+
+  context 'custom provider' do
+    it 'successfully setups if provider class registered' do
+      CustomProvider = Class.new(ProxyFetcher::Providers::Base)
+      ProxyFetcher::Configuration.register_provider(:custom_provider, CustomProvider)
+
+      expect { ProxyFetcher.config.provider = :custom_provider }.not_to raise_error
+    end
+
+    it 'failed on setup if provider class is not registered' do
+      expect { ProxyFetcher.config.provider = :unexisting_provider }
+        .to raise_error(ProxyFetcher::Configuration::UnknownProvider)
+    end
+
+    it 'failed on setup if provider class already registered' do
+      expect { ProxyFetcher::Configuration.register_provider(:xroxy, Class.new)}
+        .to raise_error(ProxyFetcher::Configuration::RegisteredProvider)
+    end
+  end
+end
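The spec above implies that assigning `http_client` checks the class for `fetch` and `connectable?` class methods. One way such a check could look is sketched below. `SketchConfig`, `REQUIRED_METHODS`, and `EchoClient` are hypothetical names, and the gem's actual `Configuration` implementation is not shown in this hunk and may differ.

```ruby
# Hypothetical stand-in for the gem's configuration object (illustrative only).
class SketchConfig
  WrongHttpClient = Class.new(StandardError)

  # Class-level methods a custom client is assumed to need, per the spec above.
  REQUIRED_METHODS = %i[fetch connectable?].freeze

  attr_reader :http_client

  def http_client=(klass)
    missing = REQUIRED_METHODS.reject { |name| klass.respond_to?(name) }
    unless missing.empty?
      raise WrongHttpClient, "#{klass} must respond to: #{missing.join(', ')}"
    end

    @http_client = klass
  end
end

# A client that satisfies the duck-typed interface without doing any I/O.
class EchoClient
  def self.fetch(url)
    url
  end

  def self.connectable?(*)
    true
  end
end

config = SketchConfig.new
config.http_client = EchoClient # accepted: both class methods are present
```

Validating at assignment time surfaces a misconfigured client immediately rather than on the first fetch.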
@@ -0,0 +1,28 @@
+require 'spec_helper'
+
+describe ProxyFetcher::Providers::Base do
+  before { ProxyFetcher.config.reset! }
+  after { ProxyFetcher.config.reset! }
+
+  it 'does not allows to use not implemented methods' do
+    NotImplementedCustomProvider = Class.new(ProxyFetcher::Providers::Base)
+
+    ProxyFetcher::Configuration.register_provider(:provider_without_methods, NotImplementedCustomProvider)
+    ProxyFetcher.config.provider = :provider_without_methods
+
+    expect { ProxyFetcher::Manager.new }.to raise_error(NotImplementedError) do |error|
+      expect(error.message).to include('load_proxy_list')
+    end
+
+    # implement one of the methods
+    NotImplementedCustomProvider.class_eval do
+      def load_proxy_list(*)
+        [1, 2, 3]
+      end
+    end
+
+    expect { ProxyFetcher::Manager.new }.to raise_error(NotImplementedError) do |error|
+      expect(error.message).to include('to_proxy')
+    end
+  end
+end
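The behaviour this spec exercises, a base class whose unimplemented hooks raise `NotImplementedError` naming the missing method, can be sketched as follows. `SketchProvider`, `HalfDone`, and `fetch_proxies` are hypothetical names chosen for illustration. Only `load_proxy_list` and `to_proxy` come from the spec, and the real `Providers::Base` may be structured differently.

```ruby
# Hypothetical abstract base (illustrative only, not the gem's Providers::Base).
class SketchProvider
  # Template method: load raw entries, then convert each one.
  def fetch_proxies
    load_proxy_list.map { |entry| to_proxy(entry) }
  end

  def load_proxy_list(*)
    # __method__ is the current method name, so the message names the missing hook.
    raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
  end

  def to_proxy(*)
    raise NotImplementedError, "#{__method__} must be implemented in a descendant class!"
  end
end

# Implements only the first hook, mirroring the second half of the spec.
class HalfDone < SketchProvider
  def load_proxy_list(*)
    [1, 2, 3]
  end
end
```

Note that `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` will not catch it; callers have to rescue it by name.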
@@ -9,24 +9,24 @@ describe ProxyFetcher::Proxy do
     @manager = ProxyFetcher::Manager.new
   end
 
-  let(:proxy) { @manager.proxies.first }
+  let(:proxy) { @manager.proxies.first.dup }
 
   it 'checks schema' do
-    proxy.
+    proxy.type = ProxyFetcher::Providers::Base::HTTP
     expect(proxy.http?).to be_truthy
     expect(proxy.https?).to be_falsey
 
-    proxy.
+    proxy.type = ProxyFetcher::Providers::Base::HTTPS
     expect(proxy.https?).to be_truthy
     expect(proxy.http?).to be_falsey
   end
 
   it 'not connectable if IP addr is wrong' do
-
+    proxy.addr = '192.168.1.0'
     expect(proxy.connectable?).to be_falsey
   end
 
-  it 'not connectable if
+  it 'not connectable if there are some error during connection request' do
     allow_any_instance_of(Net::HTTP).to receive(:start).and_raise(Errno::ECONNABORTED)
     expect(proxy.connectable?).to be_falsey
   end
@@ -46,13 +46,13 @@ describe ProxyFetcher::Proxy do
   end
 
   it 'checks speed' do
-    proxy.
+    proxy.speed = :fast
     expect(proxy.fast?).to be_truthy
 
-    proxy.
+    proxy.speed = :slow
     expect(proxy.slow?).to be_truthy
 
-    proxy.
+    proxy.speed = :medium
     expect(proxy.medium?).to be_truthy
   end
 end
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: proxy_fetcher
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.3.0
 platform: ruby
 authors:
 - Nikita Bulai
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-08-
+date: 2017-08-21 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: nokogiri
@@ -47,7 +47,8 @@ dependencies:
 description: This gem can help your Ruby application to make HTTP(S) requests from
   proxy server by fetching and validating proxy lists from the different providers.
 email: bulajnikita@gmail.com
-executables:
+executables:
+- proxy_fetcher
 extensions: []
 extra_rdoc_files: []
 files:
@@ -58,6 +59,7 @@ files:
 - LICENSE
 - README.md
 - Rakefile
+- bin/proxy_fetcher
 - lib/proxy_fetcher.rb
 - lib/proxy_fetcher/configuration.rb
 - lib/proxy_fetcher/manager.rb
@@ -66,15 +68,20 @@ files:
 - lib/proxy_fetcher/providers/free_proxy_list_ssl.rb
 - lib/proxy_fetcher/providers/hide_my_name.rb
 - lib/proxy_fetcher/providers/proxy_docker.rb
+- lib/proxy_fetcher/providers/proxy_list.rb
 - lib/proxy_fetcher/providers/xroxy.rb
 - lib/proxy_fetcher/proxy.rb
-- lib/proxy_fetcher/utils/
+- lib/proxy_fetcher/utils/html.rb
+- lib/proxy_fetcher/utils/http_client.rb
 - lib/proxy_fetcher/version.rb
 - proxy_fetcher.gemspec
+- spec/proxy_fetcher/configuration_spec.rb
+- spec/proxy_fetcher/providers/base_spec.rb
 - spec/proxy_fetcher/providers/free_proxy_list_spec.rb
 - spec/proxy_fetcher/providers/free_proxy_list_ssl_spec.rb
 - spec/proxy_fetcher/providers/hide_my_name_spec.rb
 - spec/proxy_fetcher/providers/proxy_docker_spec.rb
+- spec/proxy_fetcher/providers/proxy_list_spec.rb
 - spec/proxy_fetcher/providers/xroxy_spec.rb
 - spec/proxy_fetcher/proxy_spec.rb
 - spec/spec_helper.rb
@@ -102,5 +109,5 @@ rubyforge_project:
 rubygems_version: 2.6.11
 signing_key:
 specification_version: 4
-summary: Ruby gem for dealing with proxy lists
+summary: Ruby gem for dealing with proxy lists from different providers
 test_files: []
@@ -1,24 +0,0 @@
-module ProxyFetcher
-  class HTTPClient
-    attr_reader :http
-
-    def initialize(url)
-      @uri = URI.parse(url)
-      @http = Net::HTTP.new(@uri.host, @uri.port)
-      @http.use_ssl = true if @uri.scheme.downcase == 'https'
-    end
-
-    def fetch
-      request = Net::HTTP::Get.new(@uri.to_s)
-      request['Connection'] = 'keep-alive'
-      response = @http.request(request)
-      response.body
-    end
-
-    class << self
-      def fetch(url)
-        new(url).fetch
-      end
-    end
-  end
-end