janis 0.1.2 → 0.1.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: dd867caecf4745d2c6ebaacf88f2d08d230a5cae
4
- data.tar.gz: 1cf3fbca12716b54a3eb2221046ec5deac906bdb
3
+ metadata.gz: d5f0f9b8e3e10fbb572bb3d39194c58edbeaea5b
4
+ data.tar.gz: eed32f6e8a3d1c1542286a024c26f9bd17a2e886
5
5
  SHA512:
6
- metadata.gz: bef546eb4304e212253765902496d132bcea5baaa2777a737fe6fc75716de3a5af016877faac9c7b0801b9a263a04a0c2ce5eba73d3ef9a0dc33ef39346b1140
7
- data.tar.gz: 5054cd249222acdbe2322cc8feeec1d1a6bcb90e1fe98abf83aa11c4a1949d27764cdcb7bdf7c5ae1d7e9d574b30e99c8c69ae9b07013c8a55aa5d7d2efcc124
6
+ metadata.gz: c32861860e3bccbaf2027dc633eb56fc5ea3b6923a56f87792ab6df020e3a8b685ddc2d755e13aebfc6a8bf911130ccf48303cedadb3ec89657833e6aa00c167
7
+ data.tar.gz: 084faba3c410b9793f0665b8ac1b780d704c599326fec582d4a42f9122e3745ee3f34dd60056dc94293acadaffda2d1a6826fa1cd21eb85ba9da58968f54c6e8
data/README.md CHANGED
@@ -1,3 +1,8 @@
1
+ ##### Dependency issues can be reported in the Issues section of this repo. Please include:
2
+ 1. Your Operating System + architecture (Example: "Ubuntu 32 bits").
3
+ 2. Full error backtrace.
4
+ 3. Your Ruby version (you can see it by typing "ruby -v" in your command prompt).
5
+
1
6
  # Janis
2
7
 
3
8
  Janis will help you find proxy servers quickly, by grabbing them from a list of many (hopefully available and up-to-date) proxy listing websites. You can also tell Janis to parse from a specific website and it will do it if it knows how to. If it doesn't you can improve it by adding new Parsers (more on this on Usage section).
@@ -17,6 +22,17 @@ And then execute:
17
22
  Or install it yourself as:
18
23
 
19
24
  $ gem install janis
25
+
26
+ Then download the latest version of PhantomJS from http://phantomjs.org/download.html, according
27
+ to your platform.
28
+
29
+ Place the PhantomJS executable somewhere in your PATH.
30
+
31
+ On Unix, you can see your PATH from your shell by typing 'echo $PATH'.
32
+ Common folders to place the phantomjs binary in are /usr/bin and /usr/local/bin.
33
+
34
+ On Windows, you can consult your PATH from your system settings in "Environment Variables" section.
35
+ C:\windows\system32\ is a common location you can place phantomjs.exe in.
20
36
 
21
37
  ## Usage
22
38
  From your own script/app or from irb, require the gem with:
@@ -85,8 +101,6 @@ If there's a proxy listing website you consider reliable and up-to-date which yo
85
101
 
86
102
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
87
103
 
88
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
89
-
90
104
  ## Contributing
91
105
 
92
106
  Bug reports and pull requests are welcome on GitHub at https://github.com/mgiagante/janis.
data/Rakefile CHANGED
@@ -2,9 +2,9 @@ require "bundler/gem_tasks"
2
2
  require "rake/testtask"
3
3
 
4
4
  Rake::TestTask.new(:test) do |t|
5
- t.libs << "test"
5
+ t.libs << "spec"
6
6
  t.libs << "lib"
7
- t.test_files = FileList['test/**/*_test.rb']
7
+ t.test_files = FileList['spec/**/*_spec.rb']
8
8
  end
9
9
 
10
10
  task :default => :test
data/janis.gemspec CHANGED
@@ -24,4 +24,5 @@ Gem::Specification.new do |spec|
24
24
  spec.add_development_dependency "pry"
25
25
  spec.add_runtime_dependency "nokogiri", ">= 1.6"
26
26
  spec.add_runtime_dependency "poltergeist"
27
+ spec.add_runtime_dependency "net-ping"
27
28
  end
data/lib/janis.rb CHANGED
@@ -1,37 +1,67 @@
1
1
  require 'janis/version'
2
2
  require 'janis/parsing'
3
- require 'yaml'
3
+ require 'janis/testing'
4
+
5
+
6
+
7
+ # TODO: Sites to be supported for scraping
8
+ # http://incloak.es/proxy-list/
9
+ # http://spys.ru/free-proxy-list/
10
+ # http://www.samair.ru/proxy/
11
+ # http://www.proxys.com.ar/
4
12
 
5
13
  module Janis
6
14
 
7
15
  IP_PORT_SEPARATOR = ':'
8
- PROXY_LIST_PATH = File.dirname(__FILE__) + '/../proxy_server_list.yml'
9
16
 
10
- def self.find(amount, path_to_list = PROXY_LIST_PATH)
17
+ def self.find(amount, opts = {})
11
18
 
12
- proxy_list = YAML.load_file("#{path_to_list}").split(' ')
13
- results = []
14
-
15
- proxy_list.each do |url|
16
- if results.size < amount
17
- parsed_from_url = Parsing.parse(url) unless url.include?('#') # Elements should look like ["1.1.1.1:8080", "2.2.2.2:9090"]
18
- results_from_this_url = parsed_from_url.map { |entry| convert_to_hash(entry) }
19
- # Result should look like [ { ip: "1.1.1.1", port: "8080" }, { ip: "2.2.2.2", port: "9090" } ]
20
- results += results_from_this_url
19
+ # Makes sure opts[:websites] is a subset of the supported websites and raises otherwise. If opts[:websites] is not given, it takes the whole list.
20
+ if opts[:websites]
21
+ opts[:websites].each do |website|
22
+ raise "#{website} is not supported!" unless Janis.supported_websites.include?(website)
21
23
  end
22
- end
24
+ websites = opts[:websites]
25
+ else
26
+ websites = Janis.supported_websites
27
+ end
23
28
 
24
- results[0..amount - 1]
25
-
29
+ total_results = []
30
+
31
+ websites.each do |website|
32
+ if total_results.size < amount
33
+ new_results = Parsing.parse_from(website).map { |entry| build_proxy_hash(entry, website) }
34
+ total_results += new_results
35
+ end
36
+ end
37
+ opts[:criteria] ? Janis::Testing.filter_results(opts[:criteria], total_results[0..amount - 1]) : total_results[0..amount - 1]
26
38
  end
27
-
39
+
40
+ def self.supported_websites
41
+ Janis::Parsing::SpecificParsers::ProxyWebsiteParser.subclasses.map { |klass| self.website_name_for(klass.to_s)}
42
+ end
43
+
28
44
  private
29
45
 
30
- def self.convert_to_hash(proxy_string)
46
+ def self.build_proxy_hash(proxy_string, website)
31
47
  {
32
48
  ip: proxy_string.split(IP_PORT_SEPARATOR).first,
33
- port: proxy_string.split(IP_PORT_SEPARATOR).last
49
+ port: proxy_string.split(IP_PORT_SEPARATOR).last,
50
+ source: website
34
51
  }
35
52
  end
36
53
 
54
+ #TODO: This should probably be moved to a name helper module
55
+ def self.website_name_for(parser_klass_name)
56
+ parser_klass_name.gsub(/::/, '/').
57
+ gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
58
+ gsub(/([a-z\d])([A-Z])/,'\1_\2').
59
+ tr("-", "_").
60
+ gsub("_Parser","").
61
+ split('/').
62
+ last.
63
+ downcase.to_sym
64
+ #TODO: converts a parser class name to a :symbol_in_snake_case website name
65
+ end
66
+
37
67
  end
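The new `Janis.find` flow above (validate the requested websites, collect parsed entries until `amount` is reached, tag each entry with its source) can be sketched in isolation. This is only a rough illustration, not the gem's code: `FindSketch`, `PARSERS`, `:site_a` and `:site_b` are hypothetical names, and the parsers are stubbed as lambdas so the sketch runs without network access.

```ruby
# Hypothetical stand-in for Janis.find. Parsers are injected as lambdas
# (an assumption for illustration) instead of being looked up by the factory.
module FindSketch
  IP_PORT_SEPARATOR = ':'

  def self.find(amount, parsers, opts = {})
    supported = parsers.keys
    websites = opts.fetch(:websites, supported)

    # Reject any requested website that has no parser.
    unknown = websites - supported
    raise ArgumentError, "#{unknown.first} is not supported!" unless unknown.empty?

    results = []
    websites.each do |website|
      break if results.size >= amount

      entries = parsers.fetch(website).call # e.g. ["1.1.1.1:8080"]
      results.concat(entries.map { |entry| build_proxy_hash(entry, website) })
    end
    results.first(amount)
  end

  def self.build_proxy_hash(proxy_string, website)
    ip, port = proxy_string.split(IP_PORT_SEPARATOR)
    { ip: ip, port: port, source: website }
  end
end

# Stubbed parsers standing in for real website parsers.
PARSERS = {
  site_a: -> { ['1.1.1.1:8080', '2.2.2.2:9090'] },
  site_b: -> { ['3.3.3.3:3128'] }
}.freeze

proxies = FindSketch.find(3, PARSERS)
```

Passing `websites: [:site_b]` as the third argument would restrict the search to that single source, mirroring `opts[:websites]` in the diff.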
@@ -7,14 +7,23 @@ module Janis
7
7
 
8
8
  class ParserFactory
9
9
 
10
- def self.create_parser_for(url)
11
- parsers = Janis::Parsing::SpecificParsers::ProxyWebsiteParser.subclasses
12
- raise "No parsers available!" if parsers.empty?
13
- parser_class = parsers.find do |klass|
14
- klass.url == url
15
- end
16
- raise "No parser available for #{url}" unless parser_class
17
- parser_class.new
10
+ attr_reader :parser_klasses
11
+
12
+ def initialize
13
+ @parser_klasses = Janis::Parsing::SpecificParsers::ProxyWebsiteParser.subclasses
14
+ end
15
+
16
+ def create_parser(website_name)
17
+ namespacing_prefix = "Janis::Parsing::SpecificParsers::"
18
+ @parser_klasses.find { |parser_klass| parser_klass.to_s == namespacing_prefix + parser_klass_name_for(website_name) }.new
19
+ end
20
+
21
+ private
22
+
23
+ #TODO: This should probably be moved to a name helper module
24
+ # website_name should be a :symbol_in_snake_lower_case. eg: :hide_my_ass will mean HideMyAssParser
25
+ def parser_klass_name_for(website_name)
26
+ website_name.to_s.split('_').map { |word| word.capitalize}.join + "Parser"
18
27
  end
19
28
 
20
29
  end
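Both TODO comments in this release suggest moving the name conversions (`website_name_for` in `lib/janis.rb` and `parser_klass_name_for` in the factory) into a shared helper. A minimal sketch of what that might look like follows; the module name `NameHelper` is an assumption, and the conversion is simplified relative to the chained `gsub` calls in the diff.

```ruby
# Hypothetical helper module; the name NameHelper is not part of the gem.
module NameHelper
  PARSER_SUFFIX = 'Parser'

  # "Janis::Parsing::SpecificParsers::HideMyAssParser" -> :hide_my_ass
  def self.website_name_for(parser_klass_name)
    parser_klass_name.split('::').last
                     .sub(/#{PARSER_SUFFIX}\z/, '')
                     .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
                     .gsub(/([a-z\d])([A-Z])/, '\1_\2')
                     .downcase
                     .to_sym
  end

  # :hide_my_ass -> "HideMyAssParser"
  def self.parser_klass_name_for(website_name)
    website_name.to_s.split('_').map(&:capitalize).join + PARSER_SUFFIX
  end
end

name = NameHelper.website_name_for('Janis::Parsing::SpecificParsers::HideMyAssParser')
klass_name = NameHelper.parser_klass_name_for(name)
```

One caveat of this scheme: class names containing acronyms do not round-trip. A class called `HTTPProxyParser` would map to `:http_proxy`, which maps back to `HttpProxyParser`, so the factory lookup by class name would miss it.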
data/lib/janis/parsing.rb CHANGED
@@ -5,8 +5,8 @@ module Janis
5
5
 
6
6
  module Parsing
7
7
 
8
- def self.parse(url)
9
- ParserFactory.create_parser_for(url).parse
8
+ def self.parse_from(website)
9
+ ParserFactory.new.create_parser(website).parse
10
10
  end
11
11
 
12
12
  end
@@ -4,6 +4,18 @@ module Janis
4
4
 
5
5
  module SpecificParsers
6
6
 
7
+ class Proxy
8
+
9
+ def initialize(attribs = {})
10
+ @attribs = attribs
11
+ end
12
+
13
+ def method_missing(message, *args)
14
+ @attribs[message] || super
15
+ end
16
+
17
+ end
18
+
7
19
  class ProxyWebsiteParser
8
20
 
9
21
  attr_reader :url
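The `Proxy` class added in this hunk appears intended to expose hash attributes as methods. Ruby only calls the hook when it is named `method_missing` (no question mark), and it is usually paired with `respond_to_missing?`. A small sketch under those assumptions, using the hypothetical class name `ProxySketch`:

```ruby
# Hypothetical variant of the Proxy class from the diff.
class ProxySketch
  def initialize(attribs = {})
    @attribs = attribs
  end

  # Ruby invokes this hook for any undefined method.
  def method_missing(name, *args, &block)
    return @attribs[name] if @attribs.key?(name) && args.empty?

    super # unknown names still raise NoMethodError
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @attribs.key?(name) || super
  end
end

proxy = ProxySketch.new(ip: '1.1.1.1', port: '8080')
proxy.ip
```

Checking `@attribs.key?(name)` rather than `@attribs[name] || super` lets attributes whose value is `nil` or `false` be read without falling through to `NoMethodError`.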
@@ -1,5 +1,5 @@
1
1
  require 'nokogiri';
2
- require './lib/janis/specific_parsers/parsing_tools/capybara_with_phantom_js'
2
+ require_relative 'parsing_tools/capybara_with_phantom_js'
3
3
 
4
4
  module Janis
5
5
 
@@ -0,0 +1,21 @@
1
+ require 'net/ping'
2
+
3
+ module Janis
4
+
5
+ module Testing
6
+
7
+ def self.connectable?(proxy)
8
+ host, port = proxy.split(':')
9
+ return Net::Ping::TCP.new(host, port).ping
10
+ end
11
+
12
+ def self.filter_results(criteria = [], results)
13
+ criteria.each do |criterion| # A criterion is a method that returns true or false about a proxy, like #connectable?
14
+ results.select! { |proxy| Janis::Testing.send(criterion, "#{proxy[:ip]}:#{proxy[:port]}") }
15
+ end
16
+ results
17
+ end
18
+
19
+ end
20
+
21
+ end
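The new `Janis::Testing` module treats each criterion as the name of a predicate that is applied to every `"ip:port"` string. Below is a rough sketch of the same idea that avoids the `net-ping` dependency by using `Socket.tcp` from the standard library. `TestingSketch`, the `timeout:` keyword and the `checker:` keyword are assumptions made for illustration, and unlike `select!` in the diff this variant does not mutate the input array.

```ruby
require 'socket'

# Hypothetical stand-in for Janis::Testing, using only the standard library.
module TestingSketch
  # True when a TCP connection to "host:port" succeeds within the timeout.
  def self.connectable?(proxy, timeout: 2)
    host, port = proxy.split(':')
    Socket.tcp(host, Integer(port), connect_timeout: timeout) { true }
  rescue SystemCallError, SocketError, IOError, ArgumentError, TypeError
    false
  end

  # Keeps only the proxies that satisfy every criterion. Each criterion is
  # the name of a predicate on `checker` that receives an "ip:port" string.
  def self.filter_results(results, criteria = [], checker: self)
    criteria.reduce(results) do |remaining, criterion|
      remaining.select do |proxy|
        checker.public_send(criterion, "#{proxy[:ip]}:#{proxy[:port]}")
      end
    end
  end
end

candidates = [{ ip: '1.1.1.1', port: '8080' }, { ip: '2.2.2.2', port: '9091' }]
unfiltered = TestingSketch.filter_results(candidates)
```

Calling `TestingSketch.filter_results(candidates, [:connectable?])` would attempt a real connection to each proxy, so it needs network access and can take up to `timeout` seconds per entry.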
data/lib/janis/version.rb CHANGED
@@ -1,3 +1,3 @@
1
1
  module Janis
2
- VERSION = "0.1.2"
2
+ VERSION = "0.1.3"
3
3
  end
metadata CHANGED
@@ -1,97 +1,111 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: janis
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.2
4
+ version: 0.1.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - Mariano Giagante
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2016-05-15 00:00:00.000000000 Z
11
+ date: 2016-08-08 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
15
15
  requirement: !ruby/object:Gem::Requirement
16
16
  requirements:
17
- - - "~>"
17
+ - - ~>
18
18
  - !ruby/object:Gem::Version
19
19
  version: '1.10'
20
20
  type: :development
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
- - - "~>"
24
+ - - ~>
25
25
  - !ruby/object:Gem::Version
26
26
  version: '1.10'
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: rake
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
- - - "~>"
31
+ - - ~>
32
32
  - !ruby/object:Gem::Version
33
33
  version: '10.0'
34
34
  type: :development
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
- - - "~>"
38
+ - - ~>
39
39
  - !ruby/object:Gem::Version
40
40
  version: '10.0'
41
41
  - !ruby/object:Gem::Dependency
42
42
  name: minitest
43
43
  requirement: !ruby/object:Gem::Requirement
44
44
  requirements:
45
- - - ">="
45
+ - - '>='
46
46
  - !ruby/object:Gem::Version
47
47
  version: '5.4'
48
48
  type: :development
49
49
  prerelease: false
50
50
  version_requirements: !ruby/object:Gem::Requirement
51
51
  requirements:
52
- - - ">="
52
+ - - '>='
53
53
  - !ruby/object:Gem::Version
54
54
  version: '5.4'
55
55
  - !ruby/object:Gem::Dependency
56
56
  name: pry
57
57
  requirement: !ruby/object:Gem::Requirement
58
58
  requirements:
59
- - - ">="
59
+ - - '>='
60
60
  - !ruby/object:Gem::Version
61
61
  version: '0'
62
62
  type: :development
63
63
  prerelease: false
64
64
  version_requirements: !ruby/object:Gem::Requirement
65
65
  requirements:
66
- - - ">="
66
+ - - '>='
67
67
  - !ruby/object:Gem::Version
68
68
  version: '0'
69
69
  - !ruby/object:Gem::Dependency
70
70
  name: nokogiri
71
71
  requirement: !ruby/object:Gem::Requirement
72
72
  requirements:
73
- - - ">="
73
+ - - '>='
74
74
  - !ruby/object:Gem::Version
75
75
  version: '1.6'
76
76
  type: :runtime
77
77
  prerelease: false
78
78
  version_requirements: !ruby/object:Gem::Requirement
79
79
  requirements:
80
- - - ">="
80
+ - - '>='
81
81
  - !ruby/object:Gem::Version
82
82
  version: '1.6'
83
83
  - !ruby/object:Gem::Dependency
84
84
  name: poltergeist
85
85
  requirement: !ruby/object:Gem::Requirement
86
86
  requirements:
87
- - - ">="
87
+ - - '>='
88
88
  - !ruby/object:Gem::Version
89
89
  version: '0'
90
90
  type: :runtime
91
91
  prerelease: false
92
92
  version_requirements: !ruby/object:Gem::Requirement
93
93
  requirements:
94
- - - ">="
94
+ - - '>='
95
+ - !ruby/object:Gem::Version
96
+ version: '0'
97
+ - !ruby/object:Gem::Dependency
98
+ name: net-ping
99
+ requirement: !ruby/object:Gem::Requirement
100
+ requirements:
101
+ - - '>='
102
+ - !ruby/object:Gem::Version
103
+ version: '0'
104
+ type: :runtime
105
+ prerelease: false
106
+ version_requirements: !ruby/object:Gem::Requirement
107
+ requirements:
108
+ - - '>='
95
109
  - !ruby/object:Gem::Version
96
110
  version: '0'
97
111
  description: It uses a source list with several tested websites to provide proxy servers
@@ -102,8 +116,8 @@ executables: []
102
116
  extensions: []
103
117
  extra_rdoc_files: []
104
118
  files:
105
- - ".gitignore"
106
- - ".travis.yml"
119
+ - .gitignore
120
+ - .travis.yml
107
121
  - Gemfile
108
122
  - LICENSE
109
123
  - README.md
@@ -118,11 +132,10 @@ files:
118
132
  - lib/janis/specific_parsers/hide_my_ass.rb
119
133
  - lib/janis/specific_parsers/parsing_tools/capybara_with_phantom_js.rb
120
134
  - lib/janis/specific_parsers/proxy-list_org.rb
121
- - lib/janis/specific_parsers/simple.rb
122
135
  - lib/janis/specific_parsers/template.rb
136
+ - lib/janis/testing.rb
123
137
  - lib/janis/validations.rb
124
138
  - lib/janis/version.rb
125
- - proxy_server_list.yml
126
139
  homepage: http://www.github.com/mgiagante/janis
127
140
  licenses:
128
141
  - MIT
@@ -133,12 +146,12 @@ require_paths:
133
146
  - lib
134
147
  required_ruby_version: !ruby/object:Gem::Requirement
135
148
  requirements:
136
- - - ">="
149
+ - - '>='
137
150
  - !ruby/object:Gem::Version
138
151
  version: '0'
139
152
  required_rubygems_version: !ruby/object:Gem::Requirement
140
153
  requirements:
141
- - - ">="
154
+ - - '>='
142
155
  - !ruby/object:Gem::Version
143
156
  version: '0'
144
157
  requirements: []
@@ -148,4 +161,3 @@ signing_key:
148
161
  specification_version: 4
149
162
  summary: Janis grabs proxy servers from many websites for you!
150
163
  test_files: []
151
- has_rdoc:
@@ -1,23 +0,0 @@
1
- module Janis
2
-
3
- module Parsing
4
-
5
- module SpecificParsers
6
- class SimpleParser < ProxyWebsiteParser
7
- PROXY_REGEX = /\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}/
8
- def initialize
9
- super
10
- @html_doc = obtain_html_doc
11
- end
12
- def parse
13
- @result ||= @html_doc.to_s.scan(PROXY_REGEX)
14
- end
15
- def self.url
16
- 'file://./test/html/simple.html'
17
- end
18
- end
19
- end
20
-
21
- end
22
-
23
- end
@@ -1,5 +0,0 @@
1
- http://proxy-list.org
2
- #http://incloak.es/proxy-list/
3
- #http://spys.ru/free-proxy-list/
4
- #http://www.samair.ru/proxy/
5
- #http://www.proxys.com.ar/