legitbot 1.7.3 → 1.9.0
- checksums.yaml +4 -4
- data/Gemfile +13 -0
- data/README.md +40 -33
- data/legitbot.gemspec +0 -9
- data/lib/legitbot/duckduckgo.rb +1 -1
- data/lib/legitbot/gptbot.rb +21 -0
- data/lib/legitbot/ias.rb +16 -0
- data/lib/legitbot/version.rb +1 -1
- data/lib/legitbot.rb +2 -0
- data/lib/rubocop/cop/custom/ip_ranges.rb +39 -7
- data/test/ias_test.rb +39 -0
- metadata +5 -182
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 76098bb34095ff5b37ed3732b6f00baa6e8491b813f61faf0d717c0c35018885
+  data.tar.gz: 5ed3f6c8d09d019685e9a5ff33844de03b3bbf31b1a5156fc4e88264dbcc5d08
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4dcd231f388e8134347db6c22286dfbf3e9155f5b79714c70f83fef8a1d2c2d3b990ea0b2d4f7beab4a59bfe0f1a326558d02b393340b628c55aebfd10cfdbb7
+  data.tar.gz: b66d66abe8eeace6c74bdfacab6cbe3dafa18c9c7c2325df313bc8a6842e81e510040dd1f7c1f0ff32c0692f75e976ff79089ac31a34125391d76d3bd2c2d8fc
data/Gemfile
CHANGED
@@ -2,3 +2,16 @@
 
 source 'https://rubygems.org'
 gemspec
+
+group :development do
+  gem 'bump'
+  gem 'dns_mock'
+  gem 'jsonpath'
+  gem 'minitest'
+  gem 'minitest-hooks'
+  gem 'nokogiri'
+  gem 'rake'
+  gem 'rubocop'
+  gem 'rubocop-minitest'
+  gem 'simplecov-cobertura'
+end
data/README.md
CHANGED
@@ -11,8 +11,8 @@ Suppose you have a Web request and you would like to check it is not diguised:
 bot = Legitbot.bot(userAgent, ip)
 ```
 
-`bot` will be `nil` if no bot signature was found in the `User-Agent`.
-it will be an object with methods
+`bot` will be `nil` if no bot signature was found in the `User-Agent`.
+Otherwise, it will be an object with methods
 
 ```ruby
 bot.detected_as # => :google
@@ -29,9 +29,9 @@ Rack::Attack.blocklist("fake Googlebot") do |req|
 end
 ```
 
-Or if you do not like all those ghoulish crawlers stealing your
-
-
+Or if you do not like all those ghoulish crawlers stealing your content,
+evaluating it and getting ready to invade your site with spammers, then block
+them all:
 
 ```ruby
 Rack::Attack.blocklist 'fake search engines' do |request|
@@ -43,26 +43,31 @@ end
 
 [Semantic versioning](https://semver.org/) with the following clarifications:
 
-
-
+- MINOR version is incremented when support for new bots is added.
+- PATCH version is incremented when validation logic for a bot changes (IP list
+updated, for example).
 
 ## Supported
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+- [Ahrefs](https://ahrefs.com/robot)
+- [Alexa](https://support.alexa.com/hc/en-us/articles/360046707834-What-are-the-IP-addresses-for-Alexa-s-Certify-and-Site-Audit-crawlers-)
+- [Amazon AdBot](https://adbot.amazon.com/index.html)
+- [Applebot](https://support.apple.com/en-us/HT204683)
+- [Baidu spider](http://help.baidu.com/question?prod_en=master&class=498&id=1000973)
+- [Bingbot](https://blogs.bing.com/webmaster/2012/08/31/how-to-verify-that-bingbot-is-bingbot/)
+- [DuckDuckGo bot](https://duckduckgo.com/duckduckbot)
+- [Facebook crawler](https://developers.facebook.com/docs/sharing/webmasters/crawler)
+- [Google crawlers](https://support.google.com/webmasters/answer/1061943)
+- [IAS](https://integralads.com/ias-privacy-data-management/policies/site-indexing-policy/)
+- [OpenAI GPTBot](https://platform.openai.com/docs/gptbot)
+- [Oracle Data Cloud Crawler](https://www.oracle.com/corporate/acquisitions/grapeshot/crawler.html)
+- [Petal search engine](http://aspiegel.com/petalbot)
+- [Pinterest](https://help.pinterest.com/en/articles/about-pinterest-crawler-0)
+- [Twitterbot](https://developer.twitter.com/en/docs/tweets/optimize-with-cards/guides/getting-started),
+the list of IPs is in the
+[Troubleshooting page](https://developer.twitter.com/en/docs/tweets/optimize-with-cards/guides/troubleshooting-cards)
+- [Yandex robots](https://yandex.com/support/webmaster/robot-workings/check-yandex-robots.xml)
+- [You.com](https://about.you.com/youbot/)
 
 ## License
 
@@ -70,16 +75,18 @@ Apache 2.0
 
 ## Other projects
 
-
-
-
+- Play Framework variant in Scala:
+[play-legitbot](https://github.com/osinka/play-legitbot)
+- Article
+[When (Fake) Googlebots Attack Your Rails App](http://jessewolgamott.com/blog/2015/11/17/when-fake-googlebots-attack-your-rails-app/)
+- [Voight-Kampff](https://github.com/biola/Voight-Kampff) is a Ruby gem that
 detects bots by `User-Agent`
-
-middleware to detect crawlers by few different request headers, including
-
-
-classify IP as a search engine, but also label them as suspicious
-reports the number of days since the last activity. My implementation of
+- [crawler_detect](https://github.com/loadkpi/crawler_detect) is a Ruby gem and
+Rack middleware to detect crawlers by few different request headers, including
+`User-Agent`
+- Project Honeypot's [http:BL](https://www.projecthoneypot.org/httpbl_api.php)
+can not only classify IP as a search engine, but also label them as suspicious
+and reports the number of days since the last activity. My implementation of
 the protocol in Scala is [here](https://github.com/osinka/httpbl).
-
-to validate bots.
+- [CIDRAM](https://github.com/CIDRAM/CIDRAM) is a PHP routing manager with
+built-in support to validate bots.
data/legitbot.gemspec
CHANGED
@@ -22,15 +22,6 @@ Gem::Specification.new do |spec|
   spec.required_ruby_version = '>= 2.6.0'
   spec.add_dependency 'fast_interval_tree', '~> 0.2', '>= 0.2.2'
   spec.add_dependency 'irrc', '~> 0.2', '>= 0.2.1'
-  spec.add_development_dependency 'bump', '~> 0.8', '>= 0.8.0'
-  spec.add_development_dependency 'dns_mock', '~> 1.5.0', '>= 1.5.0'
-  spec.add_development_dependency 'minitest', '~> 5.1', '>= 5.1.0'
-  spec.add_development_dependency 'minitest-hooks', '~> 1.5', '>= 1.5.0'
-  spec.add_development_dependency 'nokogiri', '~> 1.14', '>= 1.14.3'
-  spec.add_development_dependency 'rake', '~> 13.0', '>= 13.0.0'
-  spec.add_development_dependency 'rubocop', '~> 1.50.0', '>= 1.50.0'
-  spec.add_development_dependency 'rubocop-minitest', '~> 0.31.0', '>= 0.31.0'
-  spec.add_development_dependency 'simplecov-cobertura', '~> 2.0', '>= 2.0'
 
   spec.files = `git ls-files`.split($INPUT_RECORD_SEPARATOR)
   spec.rdoc_options = ['--charset=UTF-8']
data/lib/legitbot/duckduckgo.rb
CHANGED
@@ -3,7 +3,7 @@
 module Legitbot # :nodoc:
   # https://duckduckgo.com/duckduckbot
   class DuckDuckGo < BotMatch
-    # @fetch:url https://
+    # @fetch:url https://duckduckgo.com/duckduckgo-help-pages/results/duckduckbot/
     # @fetch:selector section.main article.content ul > li
     ip_ranges %w[
       20.185.79.15
data/lib/legitbot/gptbot.rb
ADDED
@@ -0,0 +1,21 @@
+# frozen_string_literal: true
+
+module Legitbot # :nodoc:
+  # https://platform.openai.com/docs/gptbot
+  class GPTBot < BotMatch
+    # @fetch:url https://openai.com/gptbot-ranges.txt
+    ip_ranges %w[
+      20.15.240.64/28
+      20.15.240.80/28
+      20.15.240.96/28
+      20.15.240.176/28
+      20.15.241.0/28
+      20.15.242.128/28
+      20.15.242.144/28
+      20.15.242.192/28
+      40.83.2.64/28
+    ]
+  end
+
+  rule Legitbot::GPTBot, %w[GPTBot]
+end
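To illustrate what the new `ip_ranges` list in `gptbot.rb` expresses, here is a minimal sketch that checks an address against the nine `/28` blocks using only Ruby's standard `ipaddr` library. The constant `GPTBOT_RANGES` and the helper `gptbot_ip?` are hypothetical names made up for this illustration. legitbot itself depends on `fast_interval_tree` (per its gemspec) and may implement the lookup differently.

```ruby
require 'ipaddr'

# CIDR blocks copied from the gptbot.rb hunk; parsed once into IPAddr objects.
GPTBOT_RANGES = %w[
  20.15.240.64/28
  20.15.240.80/28
  20.15.240.96/28
  20.15.240.176/28
  20.15.241.0/28
  20.15.242.128/28
  20.15.242.144/28
  20.15.242.192/28
  40.83.2.64/28
].map { |cidr| IPAddr.new(cidr) }.freeze

# Hypothetical helper (not part of legitbot): true when the address falls
# inside any of the listed blocks.
def gptbot_ip?(ip)
  addr = IPAddr.new(ip)
  GPTBOT_RANGES.any? { |range| range.include?(addr) }
end

puts gptbot_ip?('20.15.240.70')  # inside 20.15.240.64/28
puts gptbot_ip?('20.15.240.112') # between the listed blocks
```

A linear scan is fine for nine entries; an interval tree becomes worthwhile only when a bot publishes many more ranges.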
data/lib/legitbot/ias.rb
ADDED
@@ -0,0 +1,16 @@
+# frozen_string_literal: true
+
+module Legitbot # :nodoc:
+  # https://integralads.com/ias-privacy-data-management/policies/site-indexing-policy/
+  class Ias < BotMatch
+    # @fetch:url https://integralads.com/policy-docs/iasbot.json
+    # @fetch:jsonpath $.publicIPs[*].ipv4
+    ip_ranges %w[
+      3.217.168.199
+      3.226.51.67
+      18.214.43.70
+    ]
+  end
+
+  rule Legitbot::Ias, %w[ias_crawler ias_wombles]
+end
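The `@fetch:jsonpath $.publicIPs[*].ipv4` annotation in `ias.rb` selects the `ipv4` field of every element under `publicIPs`. The sketch below shows the same selection in plain Ruby with the standard `json` library. The sample document is an assumption inferred from the path expression alone (the real feed may have a different shape or extra fields), and `extract_ipv4` is a hypothetical helper, not something from the gem.

```ruby
require 'json'

# Hypothetical sample shaped after the JSONPath in ias.rb; not fetched from
# the real URL.
SAMPLE_BODY = <<~BODY
  {
    "publicIPs": [
      { "ipv4": "3.217.168.199" },
      { "ipv4": "3.226.51.67" },
      { "ipv4": "18.214.43.70" }
    ]
  }
BODY

# Plain-Ruby equivalent of selecting $.publicIPs[*].ipv4: take each element
# of the publicIPs array and keep its ipv4 value, skipping entries without one.
def extract_ipv4(body)
  JSON.parse(body).fetch('publicIPs', []).map { |entry| entry['ipv4'] }.compact
end

puts extract_ipv4(SAMPLE_BODY).inspect
```

The cop in this release uses the `jsonpath` gem instead, which keeps the selector configurable per bot through the comment annotation.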
data/lib/legitbot/version.rb
CHANGED
data/lib/legitbot.rb
CHANGED
@@ -12,6 +12,8 @@ require_relative 'legitbot/bing'
 require_relative 'legitbot/duckduckgo'
 require_relative 'legitbot/facebook'
 require_relative 'legitbot/google'
+require_relative 'legitbot/gptbot'
+require_relative 'legitbot/ias'
 require_relative 'legitbot/oracle'
 require_relative 'legitbot/petalbot'
 require_relative 'legitbot/pinterest'
data/lib/rubocop/cop/custom/ip_ranges.rb
CHANGED
@@ -3,6 +3,7 @@
 require 'ipaddr'
 require 'net/http'
 require 'nokogiri'
+require 'jsonpath'
 require 'rubocop'
 require 'uri'
 
@@ -24,8 +25,9 @@ module RuboCop
           params = fetch_params(node)
           return unless mandatory_params?(params)
 
-          existing_ips = read_node_ips
-          new_ips = fetch_ips(**params)
+          existing_ips = normalise_list(read_node_ips(value))
+          new_ips = normalise_list(fetch_ips(**params))
+          return unless new_ips
           return if existing_ips == new_ips
 
           register_offense(value, new_ips, **params)
@@ -34,16 +36,46 @@ module RuboCop
 
         private
 
-        def fetch_ips(url:, selector:)
+        def fetch_ips(url:, selector: nil, jsonpath: nil)
+          body = get_url url
+          return unless body
+          return parse_html(body, selector) if selector
+          return parse_json(body, jsonpath) if jsonpath
+
+          parse_text(body)
+        end
+
+        def get_url(url)
           response = Net::HTTP.get_response URI(url)
+          unless response.is_a?(Net::HTTPOK)
+            add_global_offense "Could not fetch IPs from #{url} , HTTP status code #{response.code}"
+            return
+          end
+
           response.value
+          response.body
+        end
 
-
-          document
+        def parse_html(body, selector)
+          document = Nokogiri::HTML body
+          document.css(selector).map(&:content)
+        end
+
+        def parse_json(body, jsonpath)
+          document = JSON.parse body
+          JsonPath.new(jsonpath).on(document)
+        end
+
+        def parse_text(body)
+          body.lines.map(&:chomp)
         end
 
         def read_node_ips(value)
-          value.child_nodes.map(&:value)
+          value.child_nodes.map(&:value)
+        end
+
+        def normalise_list(ips)
+          ips.sort_by(&IPAddr.method(:new))
         end
 
         def register_offense(node, new_ips, **params)
@@ -54,7 +86,7 @@ module RuboCop
         end
 
         def mandatory_params?(params)
-          params.include?(:url)
+          params.include?(:url)
         end
 
         def fetch_params(node)
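The cop changes above add a plain-text fetch path (`parse_text`) and sort both lists before comparing them (`normalise_list`). The sketch below reproduces those two helpers outside the cop to show why the sort key is an `IPAddr` rather than the raw string: string order and numeric address order differ. The sample input is made up, and the `&.` nil guard is an addition of this sketch. In the hunk as shown, `normalise_list` is called before `return unless new_ips`, so a failed fetch would appear to reach `sort_by` with `nil`.

```ruby
require 'ipaddr'

# Mirrors parse_text from the diff: one entry per line of a plain-text feed.
def parse_text(body)
  body.lines.map(&:chomp)
end

# Mirrors normalise_list from the diff, plus a nil guard (&.) that is an
# assumption of this sketch and not part of the released code.
def normalise_list(ips)
  ips&.sort_by(&IPAddr.method(:new))
end

# Made-up feed body and source list containing the same entries in
# different orders.
fetched  = normalise_list(parse_text("20.15.240.80/28\n3.226.51.67\n18.214.43.70\n"))
existing = normalise_list(%w[18.214.43.70 20.15.240.80/28 3.226.51.67])

puts fetched.inspect     # numeric order: 3.x, then 18.x, then 20.x
puts fetched == existing # => true, so no offense would be registered
```

With both sides normalised the same way, a feed that merely reorders its entries no longer looks like a change.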
data/test/ias_test.rb
ADDED
@@ -0,0 +1,39 @@
+# frozen_string_literal: true
+
+require_relative 'test_helper'
+
+class IasTest < Minitest::Test
+  def test_malicious_ip
+    ip = '149.210.164.47'
+    match = Legitbot::Ias.new ip
+
+    refute_predicate match, :valid?
+  end
+
+  def test_valid_ip
+    ip = '18.214.43.70'
+    match = Legitbot::Ias.new ip
+
+    assert_predicate match, :valid?
+  end
+
+  def test_malicious_ua
+    bot = Legitbot.bot(
+      'IAS Crawler (ias_crawler; http://integralads.com/site-indexing-policy/)',
+      '18.214.43.72'
+    )
+
+    assert bot
+    refute_predicate bot, :valid?
+  end
+
+  def test_valid_ua
+    bot = Legitbot.bot(
+      'IAS Crawler (ias_crawler; http://integralads.com/site-indexing-policy/)',
+      '18.214.43.70'
+    )
+
+    assert bot
+    assert_predicate bot, :valid?
+  end
+end
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: legitbot
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.9.0
 platform: ruby
 authors:
 - Alexander Azarov
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2023-
+date: 2023-08-08 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: fast_interval_tree
@@ -50,186 +50,6 @@ dependencies:
     - - ">="
       - !ruby/object:Gem::Version
         version: 0.2.1
-- !ruby/object:Gem::Dependency
-  name: bump
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '0.8'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 0.8.0
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '0.8'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 0.8.0
-- !ruby/object:Gem::Dependency
-  name: dns_mock
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: 1.5.0
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.5.0
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: 1.5.0
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.5.0
-- !ruby/object:Gem::Dependency
-  name: minitest
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '5.1'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 5.1.0
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '5.1'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 5.1.0
-- !ruby/object:Gem::Dependency
-  name: minitest-hooks
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '1.5'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.5.0
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '1.5'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.5.0
-- !ruby/object:Gem::Dependency
-  name: nokogiri
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '1.14'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.14.3
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '1.14'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.14.3
-- !ruby/object:Gem::Dependency
-  name: rake
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '13.0'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 13.0.0
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '13.0'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 13.0.0
-- !ruby/object:Gem::Dependency
-  name: rubocop
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: 1.50.0
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.50.0
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: 1.50.0
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.50.0
-- !ruby/object:Gem::Dependency
-  name: rubocop-minitest
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: 0.31.0
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 0.31.0
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: 0.31.0
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 0.31.0
-- !ruby/object:Gem::Dependency
-  name: simplecov-cobertura
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '2.0'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: '2.0'
-  type: :development
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - "~>"
-      - !ruby/object:Gem::Version
-        version: '2.0'
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: '2.0'
 description: Does Web request come from a real search engine or from an impersonating
   agent?
 email: self@alaz.me
@@ -261,6 +81,8 @@ files:
 - lib/legitbot/duckduckgo.rb
 - lib/legitbot/facebook.rb
 - lib/legitbot/google.rb
+- lib/legitbot/gptbot.rb
+- lib/legitbot/ias.rb
 - lib/legitbot/legitbot.rb
 - lib/legitbot/oracle.rb
 - lib/legitbot/petalbot.rb
@@ -284,6 +106,7 @@ files:
 - test/botmatch_test.rb
 - test/facebook_test.rb
 - test/google_test.rb
+- test/ias_test.rb
 - test/legitbot/validators/domains_test.rb
 - test/legitbot/validators/ip_ranges_test.rb
 - test/legitbot_test.rb