proxy_fetcher 0.1.0

checksums.yaml.gz ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 405be695c325a43b7669a1c8511b9003e3640e18
+   data.tar.gz: e5b4caa06fb9e5b2fa5f04f91c1ed7c9f11fdfa6
+ SHA512:
+   metadata.gz: 6516ad0696069ce2ec2e9bf52ed83f47dadb4fd8dc202277105f61bc57ee3a366d99f8023fb3e5fd3c0de8242a586919e9d1172b86e62f21f11216fbc0b4ba70
+   data.tar.gz: a331a1db2a45667c1ef132e7ec4f9fa1c789435e1ab3ff7bcad9a31da5340416b221260923a40b0009909a448c08157da7d32b5834acdade3e2124aa73dff6f7
data/.gitignore ADDED
@@ -0,0 +1,46 @@
+ *.rbc
+ capybara-*.html
+ .rspec
+ /log
+ /tmp
+ /db/*.sqlite3
+ /db/*.sqlite3-journal
+ /public/system
+ /coverage/
+ /spec/tmp
+ *.orig
+ rerun.txt
+ pickle-email-*.html
+ .idea
+ Gemfile.lock
+
+ # TODO Comment out this rule if you are OK with secrets being uploaded to the repo
+ config/initializers/secret_token.rb
+
+ # Only include if you have production secrets in this file, which is no longer a Rails default
+ # config/secrets.yml
+
+ # dotenv
+ # TODO Comment out this rule if environment variables can be committed
+ .env
+
+ ## Environment normalization:
+ /.bundle
+ /vendor/bundle
+
+ # these should all be checked in to normalize the environment:
+ # Gemfile.lock, .ruby-version, .ruby-gemset
+
+ # unless supporting rvm < 1.11.0 or doing something fancy, ignore this:
+ .rvmrc
+
+ # if using bower-rails ignore default bower_components path bower.json files
+ /vendor/assets/bower_components
+ *.bowerrc
+ bower.json
+
+ # Ignore pow environment settings
+ .powenv
+
+ # Ignore Byebug command history file.
+ .byebug_history
data/.travis.yml ADDED
@@ -0,0 +1,12 @@
+ language: ruby
+ before_install: gem install bundler
+ bundler_args: --without yard guard benchmarks
+ script: "rake spec"
+ rvm:
+   - 2.2.4
+   - 2.3.3
+   - 2.4.0
+   - ruby-head
+ matrix:
+   allow_failures:
+     - rvm: ruby-head
data/Gemfile ADDED
@@ -0,0 +1,7 @@
+ source 'https://rubygems.org'
+
+ gemspec
+
+ group :test do
+   gem 'coveralls', require: false
+ end
data/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2017 Nikita Bulai
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,118 @@
+ # Ruby lib for managing proxies
+ [![Build Status](https://travis-ci.org/nbulaj/proxy_fetcher.svg?branch=master)](https://travis-ci.org/nbulaj/proxy_fetcher)
+ [![Coverage Status](https://coveralls.io/repos/github/nbulaj/proxy_fetcher/badge.svg)](https://coveralls.io/github/nbulaj/proxy_fetcher)
+ [![License](http://img.shields.io/badge/license-MIT-brightgreen.svg)](#license)
+
+ This gem helps your Ruby application make HTTP(S) requests through a proxy server by fetching and validating
+ current proxy lists from the [HideMyAss](http://hidemyass.com/) service.
+
+ ## Installation
+
+ If you use Bundler, first add `proxy_fetcher` to your Gemfile:
+
+ ```ruby
+ gem 'proxy_fetcher', '~> 0.1'
+ ```
+
+ And run:
+
+ ```sh
+ bundle install
+ ```
+
+ Otherwise simply install the gem:
+
+ ```sh
+ gem install proxy_fetcher -v '0.1'
+ ```
+
+ ## Usage example
+
+ Get the current proxy list:
+
+ ```ruby
+ manager = ProxyFetcher::Manager.new # will immediately load proxy list from the server
+ manager.proxies
+
+ #=> [#<ProxyFetcher::Proxy:0x00000002879680 @addr="97.77.104.22", @port=3128, @country="USA",
+ #     @response_time=5217, @speed=48, @connection_time=100, @type="HTTP", @anonymity="High">, ... ]
+ ```
+
+ Get raw proxy URLs:
+
+ ```ruby
+ manager = ProxyFetcher::Manager.new
+ manager.raw_proxies
+
+ # => ["http://97.77.104.22:3128", "http://94.23.205.32:3128", "http://209.79.65.140:8080",
+ #     "http://91.217.42.2:8080", "http://97.77.104.22:80", "http://165.234.102.177:8080", ...]
+ ```
+
+ If `ProxyFetcher::Manager` was already initialized somewhere, you can refresh the proxy list by calling the `#refresh_list!` method (also available as `#fetch!`):
+
+ ```ruby
+ manager.refresh_list!
+
+ #=> [#<ProxyFetcher::Proxy:0x00000002879680 @addr="97.77.104.22", @port=3128, @country="USA",
+ #     @response_time=5217, @speed=48, @connection_time=100, @type="HTTP", @anonymity="High">, ... ]
+ ```
+
+ Every proxy is a `ProxyFetcher::Proxy` object with the following readers:
+
+ * `addr` (IP address)
+ * `port`
+ * `country` (USA or Brazil for example)
+ * `response_time` (5217 for example)
+ * `connection_time` (rank from 0 to 100, where 0 is slow and 100 is fast)
+ * `speed` (rank from 0 to 100, where 0 is slow and 100 is fast)
+ * `type` (URI scheme, HTTP for example)
+ * `anonymity` (Low or High +KA for example)
+
+ You can also call the following instance methods on every Proxy object:
+
+ * `connectable?` (whether the proxy server is available)
+ * `http?` (whether the proxy server uses the HTTP protocol)
+ * `https?` (whether the proxy server uses the HTTPS protocol)
+ * `uri` (returns a `URI::Generic` object)
+ * `url` (returns a formatted URL like "_http://IP:PORT_")
+
+ To remove dead servers from the manager's current proxy list, call the `cleanup!` method (also available as `validate!`):
+
+ ```ruby
+ manager.cleanup!
+ ```
+
+ To change the open/read timeouts used by the `cleanup!` and `connectable?` methods, change the `ProxyFetcher::Manager` config:
+
+ ```ruby
+ ProxyFetcher::Manager.config.read_timeout = 1 # default is 3
+ ProxyFetcher::Manager.config.open_timeout = 1 # default is 3
+
+ manager = ProxyFetcher::Manager.new
+ manager.cleanup!
+ ```
+
+ ## Contributing
+
+ You are very welcome to help improve ProxyFetcher if you have suggestions for features that other people can use.
+
+ To contribute:
+
+ 1. Fork the project.
+ 2. Create your feature branch (`git checkout -b my-new-feature`).
+ 3. Implement your feature or bug fix.
+ 4. Add documentation for your feature or bug fix.
+ 5. Run `rake doc:yard`. If your changes are not 100% documented, go back to step 4.
+ 6. Add tests for your feature or bug fix.
+ 7. Run `rake` to make sure all tests pass.
+ 8. Commit your changes (`git commit -am 'Add new feature'`).
+ 9. Push to the branch (`git push origin my-new-feature`).
+ 10. Create a new pull request.
+
+ Thanks.
+
+ ## License
+
+ proxy_fetcher gem is released under the [MIT License](http://www.opensource.org/licenses/MIT).
+
+ Copyright (c) 2017 Nikita Bulai (bulajnikita@gmail.com).
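The `http?`, `uri` and `url` methods listed in the README can be tried without any network access. The following is a minimal sketch, not the gem's own class: `SketchProxy` is a hypothetical struct introduced only for illustration, and it assumes the scheme is lower-cased before the URI is built. Only `URI::Generic.build` comes from Ruby's standard library.

```ruby
require 'uri'

# Hypothetical stand-in for ProxyFetcher::Proxy with only the fields needed here.
SketchProxy = Struct.new(:addr, :port, :type) do
  # True when the advertised type is HTTP (case-insensitive comparison).
  def http?
    type.casecmp('http').zero?
  end

  # Builds a generic URI from scheme, host and port.
  def uri
    URI::Generic.build(scheme: type.downcase, host: addr, port: port)
  end

  # String form of the URI, the same shape the README calls a "raw proxy".
  def url
    uri.to_s
  end
end

proxy = SketchProxy.new('97.77.104.22', 3128, 'HTTP')
puts proxy.url
```

Because `URI::Generic` has no default port, the port is always part of the resulting string.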
data/Rakefile ADDED
@@ -0,0 +1,6 @@
+ require 'bundler/gem_tasks'
+
+ require 'rspec/core/rake_task'
+ RSpec::Core::RakeTask.new(:spec)
+
+ task default: :spec
data/lib/proxy_fetcher.rb ADDED
@@ -0,0 +1,66 @@
+ require 'uri'
+ require 'net/http'
+ require 'nokogiri'
+
+ require 'proxy_fetcher/configuration'
+ require 'proxy_fetcher/proxy'
+
+ module ProxyFetcher
+   class Manager
+     PROXY_PROVIDER_URL = 'http://proxylist.hidemyass.com/'.freeze
+
+     class << self
+       def config
+         @config ||= ProxyFetcher::Configuration.new
+       end
+     end
+
+     attr_reader :proxies
+
+     # refresh: true - load proxy list from the remote server on initialization
+     # refresh: false - just initialize the class, proxy list will be empty ([])
+     def initialize(refresh: true)
+       if refresh
+         refresh_list!
+       else
+         @proxies = []
+       end
+     end
+
+     # Update current proxy list from the provider
+     def refresh_list!
+       doc = Nokogiri::HTML(load_html(PROXY_PROVIDER_URL))
+       rows = doc.xpath('//table[@id="listable"]/tbody/tr')
+
+       @proxies = rows.map { |row| Proxy.new(row) }
+     end
+
+     alias_method :fetch!, :refresh_list!
+
+     # Clean current proxies list from dead proxies
+     def cleanup!
+       proxies.keep_if(&:connectable?)
+     end
+
+     alias_method :validate!, :cleanup!
+
+     # Just scheme + host + port
+     def raw_proxies
+       proxies.map(&:url)
+     end
+
+     # No need to put all the attr_readers
+     def inspect
+       to_s
+     end
+
+     private
+
+     def load_html(url)
+       uri = URI.parse(url)
+       http = Net::HTTP.new(uri.host, uri.port)
+       response = http.get(uri.request_uri)
+       response.body
+     end
+   end
+ end
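`cleanup!` relies on `Array#keep_if`, which filters the list in place. A small sketch of that behaviour follows, under the assumption that connectivity can be faked: `FakeProxy` and its `alive` flag are hypothetical and stand in for real `connectable?` checks.

```ruby
# Hypothetical stand-in for a proxy entry; `alive` fakes the connectivity check.
FakeProxy = Struct.new(:url, :alive) do
  def connectable?
    alive
  end
end

proxies = [
  FakeProxy.new('http://10.0.0.1:8080', true),
  FakeProxy.new('http://10.0.0.2:8080', false),
  FakeProxy.new('http://10.0.0.3:3128', true)
]
same_list = proxies # second reference to the very same Array

# keep_if mutates the receiver, so every reference to the list sees the result.
proxies.keep_if(&:connectable?)

raw = proxies.map(&:url)
puts raw.inspect
```

In the real class each check opens a connection, and the checks run one after another, so the worst-case duration grows with the number of proxies multiplied by the configured timeouts.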
data/lib/proxy_fetcher/configuration.rb ADDED
@@ -0,0 +1,10 @@
+ module ProxyFetcher
+   class Configuration
+     attr_accessor :open_timeout, :read_timeout
+
+     def initialize
+       @open_timeout = 3
+       @read_timeout = 3
+     end
+   end
+ end
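The README assigns timeouts with `ProxyFetcher::Manager.config.read_timeout = 1`, which only works when the configuration object exposes writer methods. A self-contained sketch of that pattern is shown here; `SketchConfiguration` and `SketchManager` are hypothetical names, not the gem's classes.

```ruby
# Hypothetical configuration object with readable and writable timeouts.
class SketchConfiguration
  attr_accessor :open_timeout, :read_timeout

  def initialize
    @open_timeout = 3 # seconds
    @read_timeout = 3 # seconds
  end
end

# Hypothetical manager that memoizes one shared configuration instance.
class SketchManager
  def self.config
    @config ||= SketchConfiguration.new
  end
end

SketchManager.config.read_timeout = 1
puts SketchManager.config.read_timeout
puts SketchManager.config.open_timeout
```

With `attr_reader` alone, the assignment above would raise `NoMethodError`, because no `read_timeout=` method would be defined.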
data/lib/proxy_fetcher/proxy.rb ADDED
@@ -0,0 +1,94 @@
+ module ProxyFetcher
+   class Proxy
+     attr_reader :addr, :port, :country, :response_time,
+                 :connection_time, :speed, :type, :anonymity
+
+     def initialize(html_row)
+       parse_row!(html_row)
+
+       self
+     end
+
+     def connectable?
+       connection = Net::HTTP.new(addr, port)
+       connection.open_timeout = ProxyFetcher::Manager.config.open_timeout
+       connection.read_timeout = ProxyFetcher::Manager.config.read_timeout
+
+       connection.start { |http| return true if http.request_head('/') }
+
+       false
+     rescue Timeout::Error, Errno::ECONNREFUSED, Errno::ECONNRESET, Errno::ECONNABORTED
+       false
+     end
+
+     def http?
+       type.casecmp('http').zero?
+     end
+
+     def https?
+       !http?
+     end
+
+     def uri
+       URI::Generic.build(host: addr, port: port, scheme: type)
+     end
+
+     def url
+       uri.to_s
+     end
+
+     private
+
+     def parse_row!(html)
+       html.xpath('td').each_with_index do |td, index|
+         case index
+         when 1
+           @addr = parse_addr(td)
+         when 2
+           @port = Integer(td.content.strip)
+         when 3
+           @country = td.content.strip
+         when 4
+           @response_time = parse_response_time(td)
+           @speed = parse_indicator_value(td)
+         when 5
+           @connection_time = parse_indicator_value(td)
+         when 6
+           @type = td.content.strip
+         when 7
+           @anonymity = td.content.strip
+         else
+           # nothing
+         end
+       end
+     end
+
+     def parse_addr(html)
+       good = []
+       bytes = []
+       css = html.at_xpath('span/style/text()').to_s
+       css.split.each { |l| good << $1 if l.match(/\.(.+?)\{.*inline/) }
+
+       html.xpath('span/span | span | span/text()').each do |span|
+         if span.is_a?(Nokogiri::XML::Text)
+           bytes << $1 if span.content.strip.match(/\.{0,1}(.+)\.{0,1}/)
+         elsif (span['style'] && span['style'] =~ /inline/) ||
+               (span['class'] && good.include?(span['class'])) ||
+               (span['class'] =~ /^[0-9]/)
+
+           bytes << span.content
+         end
+       end
+
+       bytes.join('.').gsub(/\.+/, '.')
+     end
+
+     def parse_response_time(html)
+       Integer(html.at_xpath('div')['rel'])
+     end
+
+     def parse_indicator_value(html)
+       Integer(html.at('.indicator').attr('style').match(/width: (\d+)%/i)[1])
+     end
+   end
+ end
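`parse_indicator_value` reads a percentage out of an inline `width` style. The sketch below isolates that step on plain strings so it can be run without Nokogiri. `indicator_percent` is a hypothetical helper, and returning `nil` for a missing width is an assumption of this sketch rather than the gem's behaviour.

```ruby
# Hypothetical helper: extracts the percentage from an inline style string.
# Returns nil when no "width: N%" declaration is present.
def indicator_percent(style)
  match = style.to_s.match(/width:\s*(\d+)%/i)
  # Base 10 is passed explicitly so a leading zero is not read as octal.
  match && Integer(match[1], 10)
end

puts indicator_percent('width: 48%; background: red').inspect
puts indicator_percent('height: 5px').inspect
```

The nil-safe form avoids calling `[1]` on a failed match, which would raise `NoMethodError` if the provider's markup changed.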
data/lib/proxy_fetcher/version.rb ADDED
@@ -0,0 +1,17 @@
+ module ProxyFetcher
+   def self.gem_version
+     Gem::Version.new VERSION::STRING
+   end
+
+   module VERSION
+     # Major version number
+     MAJOR = 0
+     # Minor version number
+     MINOR = 1
+     # Smallest version number
+     TINY = 0
+
+     # Full version number
+     STRING = [MAJOR, MINOR, TINY].compact.join('.')
+   end
+ end
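`gem_version` wraps the joined version string in a `Gem::Version`. One reason this is useful, sketched here with made-up version numbers, is that `Gem::Version` compares segments numerically while plain strings compare character by character.

```ruby
require 'rubygems'

# Same construction as VERSION::STRING; the segment values are illustrative.
segments = [0, 1, 0]
version_string = segments.compact.join('.')
version = Gem::Version.new(version_string)

# Numeric comparison of segments versus lexical comparison of strings.
numeric_order = Gem::Version.new('0.9.0') < Gem::Version.new('0.10.0')
lexical_order = '0.9.0' < '0.10.0'

puts version
puts numeric_order
puts lexical_order
```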
data/proxy_fetcher.gemspec ADDED
@@ -0,0 +1,23 @@
+ $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), 'lib'))
+
+ require 'proxy_fetcher/version'
+
+ Gem::Specification.new do |gem|
+   gem.name = 'proxy_fetcher'
+   gem.version = ProxyFetcher.gem_version
+   gem.date = '2017-05-19'
+   gem.summary = 'Ruby gem for dealing with proxy lists'
+   gem.description = 'This gem can help your Ruby application to make HTTP(S) requests ' \
+     'from proxy server, fetching and validating current proxy lists from the HideMyAss service.'
+   gem.authors = ['Nikita Bulai']
+   gem.email = 'bulajnikita@gmail.com'
+   gem.require_paths = ['lib']
+   gem.files = `git ls-files`.split("\n")
+   gem.homepage = 'http://github.com/nbulaj/proxy_fetcher'
+   gem.license = 'MIT'
+   gem.required_ruby_version = '>= 2.2.2'
+
+   gem.add_runtime_dependency 'nokogiri', '~> 1.6', '>= 1.6'
+
+   gem.add_development_dependency 'rspec', '~> 3.5'
+ end
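The runtime dependency combines `'~> 1.6'` with `'>= 1.6'`. A quick check with RubyGems' own classes suggests the second constraint adds nothing, since `~> 1.6` already means "at least 1.6 and below 2.0". The sample versions are arbitrary and chosen only for illustration.

```ruby
require 'rubygems'

combined    = Gem::Requirement.new('~> 1.6', '>= 1.6')
pessimistic = Gem::Requirement.new('~> 1.6')

# Arbitrary sample versions on both sides of the allowed range.
samples = %w[1.5.9 1.6 1.6.8 1.10.0 2.0.0].map { |v| Gem::Version.new(v) }

# Both requirements should accept and reject exactly the same samples.
agree    = samples.all? { |v| combined.satisfied_by?(v) == pessimistic.satisfied_by?(v) }
accepted = samples.select { |v| combined.satisfied_by?(v) }.map(&:to_s)

puts agree
puts accepted.inspect
```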
data/spec/proxy_fetcher/manager_spec.rb ADDED
@@ -0,0 +1,31 @@
+ require 'spec_helper'
+
+ describe ProxyFetcher::Manager do
+   it 'loads proxy list on initialization by default' do
+     manager = described_class.new
+     expect(manager.proxies).not_to be_empty
+   end
+
+   it "doesn't load proxy list on initialization if `refresh` argument was set to false" do
+     manager = described_class.new(refresh: false)
+     expect(manager.proxies).to be_empty
+   end
+
+   it 'returns Proxy objects' do
+     manager = described_class.new
+     expect(manager.proxies).to all(be_a(ProxyFetcher::Proxy))
+   end
+
+   it 'returns raw proxies' do
+     manager = described_class.new
+     expect(manager.raw_proxies).to all(be_a(String))
+   end
+
+   it 'cleans up proxy list from dead servers' do
+     allow_any_instance_of(ProxyFetcher::Proxy).to receive(:connectable?).and_return(false)
+
+     manager = described_class.new
+
+     expect { manager.cleanup! }.to change { manager.proxies }.to([])
+   end
+ end
data/spec/proxy_fetcher/proxy_spec.rb ADDED
@@ -0,0 +1,27 @@
+ require 'spec_helper'
+
+ describe ProxyFetcher::Proxy do
+   before :all do
+     @manager = ProxyFetcher::Manager.new
+   end
+
+   let(:proxy) { @manager.proxies.first }
+
+   it 'checks scheme' do
+     expect(proxy.http?).to be_falsey.or(be_truthy)
+     expect(proxy.https?).to be_falsey.or(be_truthy)
+   end
+
+   it 'checks connection status' do
+     allow_any_instance_of(ProxyFetcher::Proxy).to receive(:addr).and_return('192.168.1.1')
+     expect(proxy.connectable?).to be_falsey
+   end
+
+   it 'returns URI::Generic' do
+     expect(proxy.uri).to be_a(URI::Generic)
+   end
+
+   it 'returns URL' do
+     expect(proxy.url).to be_a(String)
+   end
+ end
data/spec/spec_helper.rb ADDED
@@ -0,0 +1,16 @@
+ if ENV['CI'] || ENV['TRAVIS'] || ENV['COVERALLS'] || ENV['JENKINS_URL']
+   require 'coveralls'
+   Coveralls.wear!
+ else
+   require 'simplecov'
+   SimpleCov.start
+ end
+
+ require 'bundler/setup'
+ Bundler.setup
+
+ require 'proxy_fetcher'
+
+ RSpec.configure do |config|
+   config.order = 'random'
+ end
metadata ADDED
@@ -0,0 +1,92 @@
+ --- !ruby/object:Gem::Specification
+ name: proxy_fetcher
+ version: !ruby/object:Gem::Version
+   version: 0.1.0
+ platform: ruby
+ authors:
+ - Nikita Bulai
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2017-05-19 00:00:00.000000000 Z
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: nokogiri
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.6'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '1.6'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '1.6'
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '1.6'
+ - !ruby/object:Gem::Dependency
+   name: rspec
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3.5'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3.5'
+ description: This gem can help your Ruby application to make HTTP(S) requests from
+   proxy server, fetching and validating current proxy lists from the HideMyAss service.
+ email: bulajnikita@gmail.com
+ executables: []
+ extensions: []
+ extra_rdoc_files: []
+ files:
+ - ".gitignore"
+ - ".travis.yml"
+ - Gemfile
+ - LICENSE
+ - README.md
+ - Rakefile
+ - lib/proxy_fetcher.rb
+ - lib/proxy_fetcher/configuration.rb
+ - lib/proxy_fetcher/proxy.rb
+ - lib/proxy_fetcher/version.rb
+ - proxy_fetcher.gemspec
+ - spec/proxy_fetcher/manager_spec.rb
+ - spec/proxy_fetcher/proxy_spec.rb
+ - spec/spec_helper.rb
+ homepage: http://github.com/nbulaj/proxy_fetcher
+ licenses:
+ - MIT
+ metadata: {}
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: 2.2.2
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   requirements:
+   - - ">="
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 2.6.3
+ signing_key:
+ specification_version: 4
+ summary: Ruby gem for dealing with proxy lists
+ test_files: []