ripli 0.3.0
- checksums.yaml +7 -0
- data/CODE_OF_CONDUCT.md +74 -0
- data/LICENSE +21 -0
- data/README.md +117 -0
- data/bin/ripli.rb +21 -0
- data/lib/ripli.rb +12 -0
- data/lib/ripli/customparser.rb +47 -0
- data/lib/ripli/customparser_template.rb +55 -0
- data/lib/ripli/hidemyname.rb +50 -0
- data/lib/ripli/proxyscan.rb +25 -0
- data/lib/ripli/proxyscrape.rb +25 -0
- data/ripli.gemspec +30 -0
- metadata +128 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: f9f70f95bd2e6d69f3bedd949f896525fe1d103081cb4bb062190f27c212bfc5
+  data.tar.gz: 958852957eb8d2fa589a309b9f57a482898a7b25fa85e59cfc10fa5715374dc8
+SHA512:
+  metadata.gz: e68153398914f40cb13c9a1f2dbbe8acddeb0255da488b1eed9ca87d3315488283d8f48da9f0f635c9915410cf7f7c027d0ce6a0146e501825683df2b154fbd3
+  data.tar.gz: caee5cd925d4a4a3560938710a3f2341079bc7f9f09311268281ebb03c8d12d793b6f276a2873543f44f97ab2825e9acf17c92179a9461795eaa49efafbdd9d0
data/CODE_OF_CONDUCT.md
ADDED
@@ -0,0 +1,74 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+In the interest of fostering an open and welcoming environment, we as
+contributors and maintainers pledge to making participation in our project and
+our community a harassment-free experience for everyone, regardless of age, body
+size, disability, ethnicity, gender identity and expression, level of experience,
+nationality, personal appearance, race, religion, or sexual identity and
+orientation.
+
+## Our Standards
+
+Examples of behavior that contributes to creating a positive environment
+include:
+
+* Using welcoming and inclusive language
+* Being respectful of differing viewpoints and experiences
+* Gracefully accepting constructive criticism
+* Focusing on what is best for the community
+* Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+* The use of sexualized language or imagery and unwelcome sexual attention or
+  advances
+* Trolling, insulting/derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic
+  address, without explicit permission
+* Other conduct which could reasonably be considered inappropriate in a
+  professional setting
+
+## Our Responsibilities
+
+Project maintainers are responsible for clarifying the standards of acceptable
+behavior and are expected to take appropriate and fair corrective action in
+response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit, or
+reject comments, commits, code, wiki edits, issues, and other contributions
+that are not aligned to this Code of Conduct, or to ban temporarily or
+permanently any contributor for other behaviors that they deem inappropriate,
+threatening, offensive, or harmful.
+
+## Scope
+
+This Code of Conduct applies both within project spaces and in public spaces
+when an individual is representing the project or its community. Examples of
+representing a project or community include using an official project e-mail
+address, posting via an official social media account, or acting as an appointed
+representative at an online or offline event. Representation of a project may be
+further defined and clarified by project maintainers.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be
+reported by contacting the project team at royalunited@protonmail.ch. All
+complaints will be reviewed and investigated and will result in a response that
+is deemed necessary and appropriate to the circumstances. The project team is
+obligated to maintain confidentiality with regard to the reporter of an incident.
+Further details of specific enforcement policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in good
+faith may face temporary or permanent repercussions as determined by other
+members of the project's leadership.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
+available at [https://contributor-covenant.org/version/1/4][version]
+
+[homepage]: https://contributor-covenant.org
+[version]: https://contributor-covenant.org/version/1/4/
data/LICENSE
ADDED
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright (c) 2020 linuxander
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,117 @@
+# Ripli
+
+---
+## Ruby Proxychains List Downloader
+
+`Ripli` scrapes proxy servers from the internet and saves them in **proxychains format**.
+There are many free proxies around the globe, but it's damn hard (read: impossible) to find
+a quality source with a list in the format `type address port` instead of `address:port`.
+It's built as a Ruby gem, but its main purpose is simple terminal scripting.
+This gem was started for Hacktoberfest 2020, to create something useful during the month.
+Read below for what Ripli does and how to help and contribute.
+
+---
+## Installation
+
+```bash
+git clone https://www.github.com/cybersecrs/ripli && cd ripli && bin/setup
+```
+
+This will download the gem from GitHub and install its dependencies. What it actually installs is
+`gem 'mechanize'`, `gem 'optimist'` and `proxychains`, if not already installed.
+
+---
+## Usage
+
+Executing `bin/ripli.rb` starts scraping and downloads *proxychains lists.* A log folder is created
+inside the root directory, containing folders with proxy lists for each scraped site. At the end,
+proxy lists from all sources are combined, again separated by type, inside the log directory.
+
+---
+## Development
+
+Ripli is designed as a module with a separate class for each proxy site. To keep it simple and extendable, each class
+must run and save separate lists based on proxy type.
+
+Maximum proxy timeout is 1000 ms, elite proxies only.
+
+---
+## TO-DO
+
+* Check each proxy for country, anonymity, DNS leaks and *bad-proxy*.
+* Write tutorials in Serbian about web scraping with Ruby
+
+---
+## Contributing
+
+This project is designed for [HACKTOBERFEST-2020](https://hacktoberfest.digitalocean.com/), starting from *OCTOBER 3RD*.
+Pull requests are accepted on GitHub at https://github.com/cybersecrs/ripli/, if you follow the rules.
+This project is intended to be a fun and simple project that teaches beginners the `Power of Ruby`,
+`internet-security basics` and `web-scraping`, while creating something really useful for all web users.
+Contributors are expected to adhere to the [code of conduct](https://github.com/cybersecrs/ripli/blob/master/CODE_OF_CONDUCT.md).
+
+---
+## Contact
+
+- E-mail address: linuxander88@gmail.com
+- GitHub Page: [Ripli web page](https://www.cybersecrs.github.io/ripli)
+- PGP key:
+```
+-----BEGIN PGP PUBLIC KEY BLOCK-----
+
+mQINBF9zgAUBEADIfQF44fJ7CV5JMbb5PsV+vMPXU1rxpb3IL4ttOrroS0O8cS4s
+Vwu/3jtMcyXE4fVr7pg/v6EqgeONlUUMu1tC1pil2j2Tg02zZxPcsqsLU1KyEymv
+aCSYDb1Z9ocmso6idfdHEsDYymrJOTi7knWOBrxtZMFa0QzhKQR7kTYCssX9s9w8
+gi7EZzj+UdMXTJQM7zOsvlomLtN/+64M0RTyGvdkKnpsSXJ9vwnhx04PdZt5GVlx
+HcaaRs6FKq6DcH0uuFBYbJArpiS/VQ642pURx8HwE+xJ7MbjjcgMUu9ufs9KdhWP
+E27LLoVQNEIAYEbnYLOGCDWLtOlNbfbOK4wJRS8zotRLcaJE3lSlmY/nGm71TS3T
+1NilopayxPHg0WYT3OX1r8j4DuC9Sk83pWw15NLM/1kr2Yx6j3yF+jydBBaDnTJE
+0EQ5I62V+ZsXznwjy1I8hCEst0lUOE4KuR9P3ejYAWQow6rQbolCjuLQ8Kg/O0RG
+eme77nKTkzqWi6TyBUKT7w4x+G4GZ3ibfmHNGSPgH89lyHARPjy/S1ytXL856ZJN
+a7T862r4YThA3tRz5bBOLM7Zl1NqNxYNK3eTndmThHNp8TfX0SODYvMsiKu0frEm
+OQjRl6J3aWmpET0fpoblOmn7Z0cS58Vq4sgLN8EGzcMjn+p+Oys4DjGopwARAQAB
+tDpDeWJlcnNlY1JTIChDeWJlcnNlY3VyaXR5IFNlcmJpYSkgPGxpbnV4YW5kZXJA
+c2VjbWFpbC5wcm8+iQJTBBMBCgA9FiEE+CxAzYEgOj/ZNNcBI55/Xhe4yzkFAl9z
+gAUCGwMFCQHguRsFCwkIBwMFFQoJCAsEFgIBAAIeAQIXgAAKCRAjnn9eF7jLOZCL
+D/9HkoCCAOra+6rOl5sN0f55I3x9byrHvHaGlUadeh0koltit1CL+AtSeDkQeP/6
+pVsasUtwvVaD0zcm+QAEe4bZDvqhUjs7zCH4hOGWW2ZASpYVDjTBF1/GvDkQPfKG
+lybM9scYKy17/f6/NOZLQAEio0h2Ib89Xj13EJ1ayXHhiebl8quGuoXdbWwNvu92
+LDMXNj6fArcNyvt/ghm2NrQnntlKr0WOlFnmvDxOhC41X4NT5IpOo2G805BRAT+M
+nqdF4O54orcCtadkHRzl3JuYpwyqR7aM1b5OTDvFOC/mYsB2AfxuhQNDHui8bKp4
+qdAOfNR6t5MR8aI7wQIU685LXqMtGa97/zXM+RmD0qNw/AlzK5TRzQHvlrm6w61p
+g4GSH4dhVZZyUXT2WEF+mZQ0UEv6oegXUuql4eHcSXo0dlQS0nN/7HslbAVA8hle
+/OyZvlP6+Mn4vligo1qG8FqZSDzisI0OO45L10CN9VCoeDG75EbWzl+r+mZC9OZN
+C9eVVfa7iZlSOQkj9A58ygHecgq5iSJniHyJhUOXjvl2zs4v/uekAyviL9kFlsLm
+lnuWQobmXmYZ/65Bz6omkDuu6rLpERJvkSECak1yeApFiqc+ph/2i9wJhKH8yf6V
+ekT5ThdPV8YYMaseH0u6zpTVOH0nvfv7srKCLqDKkCnuvLkCDQRfc4AFARAAxN0P
+HpJ/8In2ohGDnXXLPBrH4X8/L3ZXkO+rmpAwtO7jPxBNtm5dK7iec7WThtpJeDtC
+zCaa4PdPWmf/UUPTqn5+mpaoDatC4zZLVGdt/S4qwzCv6akk4KoCQqVfxI+XaTqJ
+xjTotHgxrDsv4IvqlKDgII8xQNmJI29VX+Hy4j7j/BvoYixSH9/e5t4SidVQoBx7
+/aD46Ho6UNRXjsOIGFJScVOzuTY1JwO+Fjz7U0hM4xyeP62YgYuL80DpZ1maKqwe
+RP8bnjxC+Js9uZ8DYdgp7GsJgWVF8UdfVI4/uxrL0A+hFTvxgrJvBv2mN2bAHIBb
+ysFcj38/GsbKbPfNyNyD/TALH/aW/Vk5tVwrfQd3JJ6BB49oSaiIwk1jlSTfkBSn
+yN6C6kMO45P85oLc9FyQVDvFHkYhb5vOkfsIF5eP4TNSnfHES/5TM5bk0v/Ygsp+
+IvMDOinWI79fKsCcmMNCe3QccYQLjommS24ebeNUoqKjOq0A+cOYsS8KTLIshwIr
+Nt5zIfOqHHTk4dL+XjkhO58FLYReRNp1hprnUQ0P2u7FEnvdluchKLfd0Oqnjrgf
+qGHIRcnp+/+81E48ZkIzveSnRbUieQriuDtiE15SQ3erPjjkPd6gqP263nuybckS
++KP/jkePH06iprWZhVvfzt/6zaiHJ22xBDWOfssAEQEAAYkCPAQYAQoAJhYhBPgs
+QM2BIDo/2TTXASOef14XuMs5BQJfc4AFAhsMBQkB4LkbAAoJECOef14XuMs5Z7IQ
+AIKTeuZZUaNOTfse5GYrYfDQtZhuS8w0zOEpZRluKphCTgZZhpbCAwDeLe6YQom0
+Wu7YZEBK6MGNGXyZHsjiOoSaeYQqF0quJFwlSSr5vwBZ4xUWj4bl01fHxNX032Qt
+cqldSpp1sAu9BebJVloM7tUvBC41WyvWsZh6JqRnM1x6QH7rwcg8aWDj9O4k8YA3
+weNkQ0cJXD2J6LE8bRyN/MF4OYLACbZyXzhRd1Zt/NADZo5nzGkpNCentsUODI9H
+RAzEa1fkRvqsuSoMfJmw8aXPmpsX6eieIbToS/SJ2eSwV3TjE+V3jM+dfWv0HkNK
+v7kK393wuCraRhX2IMhA8G/bfro34fNgtmjU7JbIatYgsYP+8YUdhzgeOqgVdc56
+vFbRQkXbGgPMLDVE1kdQDz8PDK8bbctOGmrV1V2y3RrTifzIHXetSKgtxUYXbXLA
+PObFwpp+RrLyYwQFWl8tPR7bOjJxkmPTEDCEwhlBc+xBNbgXv4I+i70NSpWv6O+e
+LQRompzqqEmD+qVHnw8U+1AaCLcbcRjjutELnxdT0oHT9vGD18clB/QS7A/pzJvg
+EFUFaXNmmIp94TbrBguvD4/bTywHaRDsrqwK80utBK8bBSNN+GePZxHTu4+nbaaO
+CMkDfMiEODLGbsWXBmZkcWXEnuovIoCUbJE+8K6EVGTO
+=FfF9
+-----END PGP PUBLIC KEY BLOCK-----
+```
+
+---
+## License
+
+The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
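The README describes the target format as `type address port` rather than `address:port`. A minimal sketch of that conversion follows, assuming tab separators like the ones the parser classes in this gem emit. `to_proxychains` is a hypothetical helper written for illustration and is not part of Ripli; the addresses are documentation-range placeholders.

```ruby
# Hypothetical helper (not part of Ripli): turn whitespace-separated
# "address:port" entries into "type<TAB>address<TAB><TAB>port" lines.
def to_proxychains(type, raw_list)
  raw_list.split.map do |entry|
    address, port = entry.split(':', 2)
    "#{type}\t#{address}\t\t#{port}"
  end
end

sample = "203.0.113.10:1080\n198.51.100.7:4145\n"
puts to_proxychains('socks5', sample)
```

Each returned string can be written as one line of a proxychains-style list.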
data/bin/ripli.rb
ADDED
@@ -0,0 +1,21 @@
+#!/usr/bin/env ruby
+# frozen_string_literal: true
+
+require_relative '../lib/ripli.rb'
+require 'optimist'
+require 'pry'
+
+AVAILABLE_TYPES = %w[https socks4 socks5].freeze
+
+opts = Optimist.options do
+  opt :type, 'Types of proxies to scrape', type: :strings, default: AVAILABLE_TYPES
+end
+
+if (opts.type.uniq - AVAILABLE_TYPES).any?
+  raise "Incorrect proxy type: #{opts.type.uniq - AVAILABLE_TYPES}, available types: #{AVAILABLE_TYPES}"
+end
+
+Ripli::CustomParser.descendants.each { |custom_parser| custom_parser.new.shell_exec!(opts.type) }
+
+Collect.new.list("https", "socks4", "socks5")
+
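The script above rejects unknown proxy types by subtracting the allowed list from the requested one. A small sketch of that check as a reusable method, assuming an `ArgumentError` is acceptable in place of the bare `raise`; `validate_types!` and `AVAILABLE_PROXY_TYPES` are hypothetical names, not part of the gem:

```ruby
AVAILABLE_PROXY_TYPES = %w[https socks4 socks5].freeze

# Hypothetical helper: return the de-duplicated types, or raise if any
# requested type is not in the allowed list.
def validate_types!(requested)
  unknown = requested.uniq - AVAILABLE_PROXY_TYPES
  unless unknown.empty?
    raise ArgumentError, "Incorrect proxy type: #{unknown}, available types: #{AVAILABLE_PROXY_TYPES}"
  end

  requested.uniq
end

p validate_types!(%w[socks5 https socks5])

begin
  validate_types!(%w[https http])
rescue ArgumentError => e
  puts e.message
end
```

Array difference keeps the unknown entries in the order they were given, which makes the error message easy to read.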
data/lib/ripli.rb
ADDED
@@ -0,0 +1,12 @@
+require_relative 'ripli/proxyscrape'
+require_relative 'ripli/hidemyname'
+require_relative 'ripli/proxyscan'
+
+class Collect
+  def list(*proxy)
+    proxy.each { |type| @list = ""
+      Dir.glob(File.join('**', "#{type}.txt")).each { |file|
+        File.readlines(file).each { |line| @list << line; puts line }
+        File.write("log/#{type}.list", "#{@list}") } }
+  end
+end
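`Collect#list` above gathers every `<type>.txt` file under the working directory into `log/<type>.list`, rewriting the output once per source file. A rough sketch of the same idea with `do...end` blocks and a single write per type follows. `merge_lists` and its `root` argument are hypothetical, the `uniq` call is an addition the original does not perform, and the demo writes into a temporary directory rather than `log/`.

```ruby
require 'tmpdir'

# Hypothetical stand-alone variant of Collect#list: merge every
# "<site>/<type>.txt" below root into "<root>/<type>.list".
def merge_lists(root, *types)
  types.each do |type|
    sources = Dir.glob(File.join(root, '**', "#{type}.txt")).sort
    lines = sources.flat_map { |file| File.readlines(file, chomp: true) }
    # Assumption: duplicates across sites are not wanted in the merged list.
    File.write(File.join(root, "#{type}.list"), lines.uniq.map { |l| "#{l}\n" }.join)
  end
end

Dir.mktmpdir do |root|
  %w[proxyscan proxyscrape].each do |site|
    Dir.mkdir(File.join(root, site))
    File.write(File.join(root, site, 'socks5.txt'), "socks5\t203.0.113.10\t\t1080\n")
  end
  merge_lists(root, 'socks5')
  puts File.read(File.join(root, 'socks5.list'))
end
```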
data/lib/ripli/customparser.rb
ADDED
@@ -0,0 +1,47 @@
+# frozen_string_literal: true
+
+module Ripli
+  require 'mechanize'
+  require 'logger'
+
+  class CustomParser
+    LOG_DIR = 'log'
+    DEFAULT_MAX_TIMEOUT = 1000
+    DEFAULT_MECHANIZE_TIMEOUT = 10
+
+    def initialize
+      @dir = "#{LOG_DIR}/#{self.class.name.split('::').last.downcase}"
+      @mechanize = Mechanize.new do |agent|
+        agent.open_timeout = DEFAULT_MECHANIZE_TIMEOUT
+        agent.read_timeout = DEFAULT_MECHANIZE_TIMEOUT
+      end
+      Dir.mkdir(LOG_DIR) unless Dir.exist?(LOG_DIR)
+      Dir.mkdir(@dir) unless Dir.exist?(@dir)
+      @logger = Logger.new(STDOUT)
+    end
+
+    def shell_exec!(types)
+      types.each(&method(:save_proxy_chains))
+    end
+
+    protected
+
+    def parse(_type, _opts = {})
+      raise "Class #{self.class.name} should implement method #parse(type, opts = {})"
+    end
+
+    def save_proxy_chains(type)
+      File.open("#{@dir}/#{type}.txt", 'wb') do |file|
+        proxies = parse(type.to_sym).uniq
+        @logger.info "Found #{proxies.size} proxies of type #{type} by #{self.class.name}, saved into: #{file.path}"
+        proxies.each { |proxy| file << "#{proxy}\n" }
+      end
+    end
+  end
+end
+
+class Class
+  def descendants
+    ObjectSpace.each_object(::Class).select { |klass| klass < self }
+  end
+end
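The file above finds parser classes by reopening `Class` and scanning `ObjectSpace`. One alternative that avoids patching a core class is Ruby's `inherited` hook, which is called each time a subclass is defined. The sketch below is not how Ripli does it; `BaseParser`, `SiteA`, `SiteB` and `SiteA2` are hypothetical names used only for illustration.

```ruby
# Hypothetical base class that records its subclasses as they are defined.
class BaseParser
  @registry = []

  class << self
    attr_reader :registry

    # Ruby invokes this hook whenever a class inherits from BaseParser,
    # directly or through an intermediate subclass.
    def inherited(subclass)
      super
      BaseParser.registry << subclass
    end
  end
end

class SiteA < BaseParser; end
class SiteB < BaseParser; end
class SiteA2 < SiteA; end

p BaseParser.registry
```

The registry keeps definition order, whereas the order returned by an `ObjectSpace` scan is not specified.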
data/lib/ripli/customparser_template.rb
ADDED
@@ -0,0 +1,55 @@
+# frozen_string_literal: true
+
+# Template for site scraping class
+
+require_relative 'customparser'
+
+module Ripli
+  # class should be inherited from CustomParser
+  # class name should be related to the site name
+  class CustomParserTemplate < CustomParser
+    # from superclass you inherit constants:
+    # LOG_DIR = 'log' -> directory to save files with proxies
+    # DEFAULT_MAX_TIMEOUT = 1000 -> max timeout of proxy response in ms
+    CONSTANT = 'Your constants' # declare your constants here
+
+    # define it if you need to initialize some instance variables
+    # or perform some preparations (creating directories, etc)
+    def initialize
+      super # required for creating logger and directory
+      # define @mechanize = Mechanize.new { |agent| agent.open_timeout...} if you need to add some options to the mechanize agent
+      # your code here
+    end
+
+    # required method!
+    # logic of scraping site must be here
+    # type -- proxy type: [:https, :socks4, :socks5]
+    # opts -- additional params if you need
+    # return -- array of strings in format: "<type>\t<ip>\t\t<port>"
+    def parse(type, opts = {})
+      # for downloading use @mechanize.get(url)
+      @logger.info 'Use @logger to print logs to STDOUT'
+      [] # the last expression is the return value: the array of proxies
+    rescue Net::OpenTimeout, Net::ReadTimeout
+      [] # rescue exception during downloading page, DEFAULT_MECHANIZE_TIMEOUT=10s
+    end
+
+    # If you need additional logic of creating files
+    # you can redefine method save_proxy_chains(type)
+    #
+    # def save_proxy_chains(type)
+    #   File.open("#{@dir}/#{type}.txt", 'wb') do |file|
+    #     proxies = parse(type.to_sym).uniq
+    #     @logger.info "Found #{proxies.size} proxies of type #{type} by #{self.class.name}, saved into: #{file.path}"
+    #     proxies.each { |proxy| file << "#{proxy}\n" }
+    #   end
+    # end
+
+    # any additional methods for scraping below
+    private
+
+    def do_smth
+      'Do something helpful'
+    end
+  end
+end
data/lib/ripli/hidemyname.rb
ADDED
@@ -0,0 +1,50 @@
+# frozen_string_literal: true
+
+require_relative 'customparser'
+
+module Ripli
+  class HideMyName < CustomParser
+    PROXIES_ON_PAGE = 64
+    THREADS_COUNT = 3
+    PROXY_TYPES_ON_SITE = {
+      https: 's',
+      socks4: '4',
+      socks5: '5'
+    }.freeze
+    BASE_URL = 'https://hidemy.name/ru/proxy-list/?maxtime=%<max_timeout>d&type=%<type>s&anon=34&start=%<start>d#list'
+
+    def parse(type, opts = {})
+      @type = type # remember the current type; `||=` would keep the first type for every later call
+      url = format(BASE_URL,
+                   max_timeout: opts[:max_timeout] || DEFAULT_MAX_TIMEOUT,
+                   type: PROXY_TYPES_ON_SITE[type.to_sym],
+                   start: opts[:start] || 0)
+      doc = @mechanize.get(url)
+      proxies = extract_proxies(doc)
+      proxies += paginate(doc) if opts[:start].to_i.zero?
+      proxies
+    rescue Net::OpenTimeout, Net::ReadTimeout
+      @logger.error '[HideMyName] Sorry, site is unavailable!'
+      [] # callers expect an array even when the site is unreachable
+    end
+
+    private
+
+    def paginate(doc)
+      last_page = doc.at_xpath('//div[@class="pagination"]//li[not(@class)][last()]/a')&.text.to_i
+      threads = (1...last_page).map do |page_number|
+        Thread.new { parse(@type, start: page_number * PROXIES_ON_PAGE) }
+      end
+      threads.flat_map { |t| t.join.value }
+    end
+
+    def extract_proxies(doc)
+      doc.xpath('//table/tbody/tr').map do |proxy_node|
+        ip = proxy_node.at_xpath('./td[1]')&.text
+        port = proxy_node.at_xpath('./td[2]')&.text
+        next if ip.nil? || port.nil?
+        "#{@type}\t#{ip}\t\t#{port}"
+      end.compact # drop rows skipped by `next`
+    end
+  end
+end
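`HideMyName#paginate` starts one thread per remaining page and collects each thread's return value. Note that `THREADS_COUNT` is defined but not used to cap that number. The pattern can be sketched without any network access as below; `fetch_page` and `fetch_all` are hypothetical stand-ins, and `PAGE_SIZE` only mirrors `PROXIES_ON_PAGE`.

```ruby
PAGE_SIZE = 64 # mirrors PROXIES_ON_PAGE in the class above

# Hypothetical stand-in for a network fetch: returns the entries of one page.
def fetch_page(start)
  ["page starting at #{start}"]
end

# Fetch the first page inline, then the remaining pages in parallel threads.
def fetch_all(last_page)
  first = fetch_page(0)
  threads = (1...last_page).map do |page_number|
    Thread.new { fetch_page(page_number * PAGE_SIZE) }
  end
  # Thread#value waits for the thread to finish and returns its block's result.
  first + threads.flat_map(&:value)
end

puts fetch_all(3)
```

Because the threads are collected in the order they were created, the merged result keeps page order even if the threads finish out of order.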
data/lib/ripli/proxyscan.rb
ADDED
@@ -0,0 +1,25 @@
+# frozen_string_literal: true
+
+require_relative 'customparser'
+
+module Ripli
+  class ProxyScan < CustomParser
+    BASE_URL = 'https://www.proxyscan.io/download?type='
+
+    URL_PARAMS = {
+      https: 'https&timeout=%d&country=all&ssl=all&anonymity=all',
+      socks4: 'socks4&timeout=%d&country=all',
+      socks5: 'socks5&timeout=%d&country=all'
+    }.freeze
+
+    def parse(type, opts = {})
+      max_timeout = opts[:max_timeout] || DEFAULT_MAX_TIMEOUT
+      link = [BASE_URL, URL_PARAMS[type] % max_timeout].join
+      response = @mechanize.get(link).body
+      response.split.map { |proxy| "#{type}\t#{proxy.sub(':', "\t\t")}" }
+    rescue Net::OpenTimeout, Net::ReadTimeout
+      @logger.error '[ProxyScan] Sorry, site is unavailable!'
+      [] # callers expect an array even when the site is unreachable
+    end
+  end
+end
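`ProxyScan#parse` builds its download link by filling a `%d` placeholder with the timeout and appending the result to the base URL. A minimal sketch of that step on its own, using `fetch` so that an unknown type fails with a `KeyError` instead of a `nil` error; `download_url` and the `SKETCH_`-prefixed constants are hypothetical names that copy the values shown above:

```ruby
SKETCH_BASE_URL = 'https://www.proxyscan.io/download?type='
SKETCH_URL_PARAMS = {
  https: 'https&timeout=%d&country=all&ssl=all&anonymity=all',
  socks4: 'socks4&timeout=%d&country=all',
  socks5: 'socks5&timeout=%d&country=all'
}.freeze

# Hypothetical helper: build the download URL for one proxy type.
def download_url(type, max_timeout = 1000)
  template = SKETCH_URL_PARAMS.fetch(type.to_sym) # KeyError for unknown types
  SKETCH_BASE_URL + format(template, max_timeout)
end

puts download_url(:socks4, 500)
```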
data/lib/ripli/proxyscrape.rb
ADDED
@@ -0,0 +1,25 @@
+# frozen_string_literal: true
+
+require_relative 'customparser'
+
+module Ripli
+  class ProxyScrape < CustomParser
+    BASE_URL = 'https://api.proxyscrape.com/?request=getproxies&proxytype='
+
+    URL_PARAMS = {
+      https: 'https&timeout=%d&country=all&ssl=all&anonymity=all',
+      socks4: 'socks4&timeout=%d&country=all',
+      socks5: 'socks5&timeout=%d&country=all'
+    }.freeze
+
+    def parse(type, opts = {})
+      max_timeout = opts[:max_timeout] || DEFAULT_MAX_TIMEOUT
+      link = [BASE_URL, URL_PARAMS[type] % max_timeout].join
+      response = @mechanize.get(link).body
+      response.split.map { |proxy| "#{type}\t#{proxy.sub(':', "\t\t")}" }
+    rescue Net::OpenTimeout, Net::ReadTimeout
+      @logger.error '[ProxyScrape] Sorry, site is unavailable!'
+      [] # callers expect an array even when the site is unreachable
+    end
+  end
+end
data/ripli.gemspec
ADDED
@@ -0,0 +1,30 @@
+# frozen_string_literal: true
+
+Gem::Specification.new do |s|
+  s.name = 'ripli'
+  s.version = '0.3.0'
+  s.summary = 'Ruby Proxychains Scrapper'
+  s.description = 'Scrap web for free proxies and save them separated by type'
+  s.authors = ['Linuxander']
+  s.files = ['lib/ripli.rb']
+  s.homepage = 'https://cybersecrs.github.io/projects/ripli'
+  s.license = 'GPL-3.0-only'
+
+  s.metadata['homepage_uri'] = 'https://cybersecrs.github.io/projects/ripli'
+  s.metadata['source_code_uri'] = 'https://github.com/cybersecrs/ripli'
+  s.metadata['bug_tracker_uri'] = 'https://github.com/cybersecrs/ripli/issues'
+
+  s.bindir = ['bin']
+  s.executables = ['ripli.rb']
+  s.require_paths = ['lib']
+
+  s.files = ['bin/ripli.rb', 'lib/ripli.rb', 'lib/ripli/proxyscrape.rb', 'lib/ripli/proxyscan.rb', 'lib/ripli/hidemyname.rb',
+             'lib/ripli/customparser.rb', 'lib/ripli/customparser_template.rb', 'LICENSE', 'README.md', 'ripli.gemspec', 'CODE_OF_CONDUCT.md']
+
+  s.add_runtime_dependency 'optimist'
+  s.add_runtime_dependency 'mechanize'
+  s.add_runtime_dependency 'pry'
+
+  s.add_development_dependency 'bundler'
+  s.add_development_dependency 'rake'
+end
metadata
ADDED
@@ -0,0 +1,128 @@
+--- !ruby/object:Gem::Specification
+name: ripli
+version: !ruby/object:Gem::Version
+  version: 0.3.0
+platform: ruby
+authors:
+- Linuxander
+autorequire:
+bindir:
+- bin
+cert_chain: []
+date: 2020-10-10 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: optimist
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: mechanize
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: pry
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+description: Scrap web for free proxies and save them separated by type
+email:
+executables:
+- ripli.rb
+extensions: []
+extra_rdoc_files: []
+files:
+- CODE_OF_CONDUCT.md
+- LICENSE
+- README.md
+- bin/ripli.rb
+- lib/ripli.rb
+- lib/ripli/customparser.rb
+- lib/ripli/customparser_template.rb
+- lib/ripli/hidemyname.rb
+- lib/ripli/proxyscan.rb
+- lib/ripli/proxyscrape.rb
+- ripli.gemspec
+homepage: https://cybersecrs.github.io/projects/ripli
+licenses:
+- GPL-3.0-only
+metadata:
+  homepage_uri: https://cybersecrs.github.io/projects/ripli
+  source_code_uri: https://github.com/cybersecrs/ripli
+  bug_tracker_uri: https://github.com/cybersecrs/ripli/issues
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubygems_version: 3.1.4
+signing_key:
+specification_version: 4
+summary: Ruby Proxychains Scrapper
+test_files: []