gman 6.0.1 → 7.0.4
- checksums.yaml +5 -5
- data/.github/CODEOWNERS +3 -0
- data/.github/ISSUE_TEMPLATE/bug_report.md +28 -0
- data/.github/ISSUE_TEMPLATE/feature_request.md +21 -0
- data/.github/config.yml +23 -0
- data/.github/funding.yml +1 -0
- data/.github/no-response.yml +15 -0
- data/.github/release-drafter.yml +4 -0
- data/.github/settings.yml +33 -0
- data/.github/stale.yml +29 -0
- data/.gitignore +1 -0
- data/.rspec +2 -0
- data/.rubocop.yml +17 -5
- data/.rubocop_todo.yml +84 -0
- data/.ruby-version +1 -1
- data/Gemfile +2 -0
- data/bin/gman +6 -4
- data/bin/gman_filter +5 -7
- data/config/domains.txt +8446 -173
- data/config/vendor/academic.txt +8038 -0
- data/config/vendor/dotgovs.csv +5786 -5560
- data/docs/CODE_OF_CONDUCT.md +46 -0
- data/docs/CONTRIBUTING.md +92 -0
- data/{README.md → docs/README.md} +3 -3
- data/docs/SECURITY.md +3 -0
- data/docs/_config.yml +2 -0
- data/gman.gemspec +18 -17
- data/lib/gman.rb +25 -21
- data/lib/gman/country_codes.rb +17 -17
- data/lib/gman/domain_list.rb +123 -41
- data/lib/gman/identifier.rb +59 -21
- data/lib/gman/importer.rb +39 -40
- data/lib/gman/locality.rb +23 -21
- data/lib/gman/version.rb +3 -1
- data/script/add +2 -0
- data/script/alphabetize +2 -0
- data/script/cibuild +1 -1
- data/script/dedupe +2 -1
- data/script/profile +2 -1
- data/script/prune +5 -3
- data/script/reconcile-us +6 -3
- data/script/vendor +1 -1
- data/script/vendor-federal-de +3 -3
- data/script/vendor-municipal-de +3 -3
- data/script/vendor-nl +4 -1
- data/script/vendor-public-suffix +7 -6
- data/script/vendor-se +3 -3
- data/script/vendor-swot +43 -0
- data/script/vendor-us +8 -5
- data/spec/fixtures/domains.txt +4 -0
- data/{test → spec}/fixtures/obama.txt +0 -0
- data/spec/gman/bin_spec.rb +101 -0
- data/spec/gman/country_code_spec.rb +39 -0
- data/spec/gman/domain_list_spec.rb +110 -0
- data/spec/gman/domains_spec.rb +25 -0
- data/spec/gman/identifier_spec.rb +218 -0
- data/spec/gman/importer_spec.rb +236 -0
- data/spec/gman/locality_spec.rb +24 -0
- data/spec/gman_spec.rb +74 -0
- data/spec/spec_helper.rb +31 -0
- metadata +86 -73
- data/CONTRIBUTING.md +0 -22
- data/Rakefile +0 -22
- data/test/fixtures/domains.txt +0 -2
- data/test/helper.rb +0 -40
- data/test/test_gman.rb +0 -62
- data/test/test_gman_bin.rb +0 -75
- data/test/test_gman_country_codes.rb +0 -18
- data/test/test_gman_domains.rb +0 -33
- data/test/test_gman_filter.rb +0 -17
- data/test/test_gman_identifier.rb +0 -106
- data/test/test_gman_importer.rb +0 -250
- data/test/test_gman_locality.rb +0 -10
@@ -0,0 +1,46 @@
+# Contributor Covenant Code of Conduct
+
+## Our Pledge
+
+In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.
+
+## Our Standards
+
+Examples of behavior that contributes to creating a positive environment include:
+
+* Using welcoming and inclusive language
+* Being respectful of differing viewpoints and experiences
+* Gracefully accepting constructive criticism
+* Focusing on what is best for the community
+* Showing empathy towards other community members
+
+Examples of unacceptable behavior by participants include:
+
+* The use of sexualized language or imagery and unwelcome sexual attention or advances
+* Trolling, insulting/derogatory comments, and personal or political attacks
+* Public or private harassment
+* Publishing others' private information, such as a physical or electronic address, without explicit permission
+* Other conduct which could reasonably be considered inappropriate in a professional setting
+
+## Our Responsibilities
+
+Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
+
+Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
+
+## Scope
+
+This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
+
+## Enforcement
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at ben@balter.com. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
+
+Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
+
+## Attribution
+
+This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]
+
+[homepage]: http://contributor-covenant.org
+[version]: http://contributor-covenant.org/version/1/4/
@@ -0,0 +1,92 @@
+# Contributing to Gman
+
+Hi there! We're thrilled that you'd like to contribute to Gman. Your help is essential for keeping it great.
+
+Gman is an open source project supported by the efforts of an entire community and built one contribution at a time by users like you. We'd love for you to get involved. Whatever your level of skill or however much time you can give, your contribution is greatly appreciated. There are many ways to contribute, from writing tutorials or blog posts, improving the documentation, submitting bug reports and feature requests, helping other users by commenting on issues, or writing code which can be incorporated into Gman itself.
+
+Following these guidelines helps to communicate that you respect the time of the developers managing and developing this open source project. In return, they should reciprocate that respect in addressing your issue, assessing changes, and helping you finalize your pull requests.
+
+
+
+## How to report a bug
+
+Think you found a bug? Please check [the list of open issues](https://github.com/benbalter/gman/issues) to see if your bug has already been reported. If it hasn't, please [submit a new issue](https://github.com/benbalter/gman/issues/new).
+
+Here are a few tips for writing *great* bug reports:
+
+* Describe the specific problem (e.g., "widget doesn't turn clockwise" versus "getting an error")
+* Include the steps to reproduce the bug, what you expected to happen, and what happened instead
+* Check that you are using the latest version of the project and its dependencies
+* Include what version of the project you're using, as well as any relevant dependencies
+* Only include one bug per issue. If you have discovered two bugs, please file two issues
+* Include screenshots or screencasts whenever possible
+* Even if you don't know how to fix the bug, including a failing test may help others track it down
+
+**If you find a security vulnerability, do not open an issue. Please email ben@balter.com instead.**
+
+## How to suggest a feature or enhancement
+
+If you find yourself wishing for a feature that doesn't exist in Gman, you are probably not alone. There are bound to be others out there with similar needs. Many of the features that Gman has today have been added because our users saw the need.
+
+Feature requests are welcome. But take a moment to find out whether your idea fits with the scope and goals of the project. It's up to you to make a strong case to convince the project's developers of the merits of this feature. Please provide as much detail and context as possible, including describing the problem you're trying to solve.
+
+[Open an issue](https://github.com/benbalter/gman/issues/new) which describes the feature you would like to see, why you want it, how it should work, etc.
+
+## Domains
+
+Domains live in [`config/domains.txt`](../config/domains.txt) as a list of TLDs and SLD+TLDs.
+
+Right now, the only valid government top-level domains (TLDs) are `.gov` and `.mil`, both of which represent the US government. Second-level domains (e.g., `gov.uk` or `mil.au`) represent non-US government entities.
+
+To add or remove a domain from the list of known government domains, simply edit the `domains.txt` file.
+
+
+## Your first contribution
+
+We'd love for you to contribute to the project. Unsure where to begin contributing to Gman? You can start by looking through these "good first issue" and "help wanted" issues:
+
+* [Good first issues](https://github.com/benbalter/gman/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) - issues which should only require a few lines of code and a test or two
+* [Help wanted issues](https://github.com/benbalter/gman/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) - issues which may be a bit more involved, but are specifically seeking community contributions
+
+*p.s. Feel free to ask for help; everyone is a beginner at first* :smiley_cat:
+
+## How to propose changes
+
+Here are a few general guidelines for proposing changes:
+
+* If you are changing any user-facing functionality, please be sure to update the documentation
+* If you are adding a new behavior or changing an existing behavior, please be sure to update the corresponding test(s)
+* Each pull request should implement **one** feature or bug fix. If you want to add or fix more than one thing, submit more than one pull request
+* Do not commit changes to files that are irrelevant to your feature or bug fix
+* Don't bump the version number in your pull request (it will be bumped prior to release)
+* Write [a good commit message](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)
+
+At a high level, [the process for proposing changes](https://guides.github.com/introduction/flow/) is:
+
+1. [Fork](https://github.com/benbalter/gman/fork) and clone the project
+2. Configure and install the dependencies: `script/bootstrap`
+3. Make sure the tests pass on your machine: `script/cibuild`
+4. Create a descriptively named branch: `git checkout -b my-branch-name`
+5. Make your change, add tests and documentation, and make sure the tests still pass
+6. Push to your fork and [submit a pull request](https://github.com/benbalter/gman/compare) describing your change
+7. Pat yourself on the back and wait for your pull request to be reviewed and merged
+
+**Interested in submitting your first Pull Request?** It's easy! You can learn how from this *free* series [How to Contribute to an Open Source Project on GitHub](https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github)
+
+## Bootstrapping your local development environment
+
+`script/bootstrap`
+
+## Running tests
+
+`script/cibuild`
+
+## Code of conduct
+
+This project is governed by [the Contributor Covenant Code of Conduct](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code.
+
+## Additional Resources
+
+* [Contributing to Open Source on GitHub](https://guides.github.com/activities/contributing-to-open-source/)
+* [Using Pull Requests](https://help.github.com/articles/using-pull-requests/)
+* [GitHub Help](https://help.github.com)
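The Domains section of the contributing guide above says the list mixes bare TLDs with SLD+TLD entries. A minimal sketch of that distinction follows, using only the four example domains the guide itself names. Treating "contains a dot" as the test is an assumption made for illustration, not how the gem classifies entries.

```ruby
# Example entries taken from the Domains section of the contributing guide.
entries = %w[gov mil gov.uk mil.au]

# Assumption for this sketch: an entry without a dot is a bare TLD,
# anything else is treated as an SLD+TLD pair.
tlds, second_level = entries.partition { |entry| !entry.include?('.') }

puts "TLDs: #{tlds.join(', ')}"
puts "SLD+TLDs: #{second_level.join(', ')}"
```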
@@ -1,6 +1,6 @@
-# Gman
+# Gman
 
-[![Build Status](https://travis-ci.org/benbalter/gman.png)](https://travis-ci.org/benbalter/gman) [![Gem Version](https://badge.fury.io/rb/gman.png)](http://badge.fury.io/rb/gman)
+[![Build Status](https://travis-ci.org/benbalter/gman.png)](https://travis-ci.org/benbalter/gman) [![Gem Version](https://badge.fury.io/rb/gman.png)](http://badge.fury.io/rb/gman) [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)
 
 A ruby gem to check if the owner of a given email address or website is working for THE MAN (a.k.a verifies government domains). It will also provide you with metadata about the domain, such as the country, state, city, or agency, where applicable. It does this by leveraging the power of [Naughty or Nice](https://github.com/benbalter/naughty_or_nice), the [Public Suffix List](http://publicsuffix.org/), and the associated [Ruby Gem](https://github.com/weppos/publicsuffix-ruby).
 
@@ -72,7 +72,7 @@ domain.country.name #=> "United States"
 domain.country.alpha2 #=> "US"
 domain.country.alpha3 #=> "USA"
 domain.country.currency #=> "USD"
-domain.
+domain.country.calling_code #=> "+1"
 ```
 
 ### Check if a country is on the US Sanctions list
data/docs/SECURITY.md
ADDED
data/docs/_config.yml
ADDED
data/gman.gemspec
CHANGED
@@ -1,14 +1,16 @@
+# frozen_string_literal: true
+
 require File.expand_path './lib/gman/version', File.dirname(__FILE__)
 
 Gem::Specification.new do |s|
   s.name = 'gman'
-  s.summary = <<-
+  s.summary = <<-SUMMARY
     Check if a given domain or email address belong to a government entity
-
-  s.description = <<-
+  SUMMARY
+  s.description = <<-DESC
     A ruby gem to check if the owner of a given email address is working for
     THE MAN.
-
+  DESC
   s.version = Gman::VERSION
   s.authors = ['Ben Balter']
   s.email = 'ben.balter@github.com'
@@ -20,24 +22,23 @@ Gem::Specification.new do |s|
   s.executables = `git ls-files -- bin/*`.split("\n").map do |f|
     File.basename(f)
   end
-  s.require_paths = ['lib']
 
   s.require_paths = ['lib']
-  s.required_ruby_version = '~> 2.
+  s.required_ruby_version = '~> 2.5'
 
-  s.add_dependency('swot', '~> 1.0')
-  s.add_dependency('iso_country_codes', '~> 0.6')
-  s.add_dependency('naughty_or_nice', '~> 2.0')
   s.add_dependency('colored', '~> 1.2')
+  s.add_dependency('iso_country_codes', '~> 0.6')
+  s.add_dependency('naughty_or_nice', '= 2.1.1')
+  s.add_dependency('public_suffix', '>= 3.0')
 
-  s.add_development_dependency('rake', '~> 10.4')
-  s.add_development_dependency('shoulda', '~> 3.5')
-  s.add_development_dependency('rdoc', '~> 4.2')
-  s.add_development_dependency('bundler', '~> 1.10')
-  s.add_development_dependency('pry', '~> 0.10')
-  s.add_development_dependency('parallel', '~> 1.6')
-  s.add_development_dependency('mechanize', '~> 2.7')
   s.add_development_dependency('addressable', '~> 2.3')
+  s.add_development_dependency('mechanize', '~> 2.7')
+  s.add_development_dependency('parallel', '~> 1.6')
+  s.add_development_dependency('pry', '~> 0.10')
+  s.add_development_dependency('rspec', '~> 3.5')
+  s.add_development_dependency('rubocop', '~> 1.0')
+  s.add_development_dependency('rubocop-performance', '~> 1.5')
+  s.add_development_dependency('rubocop-rspec', '~> 2.0')
   s.add_development_dependency('ruby-prof', '~> 0.15')
-  s.add_development_dependency('
+  s.add_development_dependency('swot', '~> 1.0')
 end
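The gemspec diff above changes the meaning of several constraints, not just their order: `naughty_or_nice` moves from a pessimistic range to an exact pin, and `public_suffix` gains an open-ended lower bound. As a rough illustration (not part of the gem), RubyGems' own `Gem::Requirement` can show which versions each constraint string accepts. The helper name `accepts?` and the version numbers being checked are arbitrary choices for this sketch.

```ruby
require 'rubygems'

# Hypothetical helper: true when `version` satisfies the constraint string.
def accepts?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

ruby_ok      = accepts?('~> 2.5', '2.7.0')   # pessimistic: >= 2.5 and < 3.0
ruby_too_new = accepts?('~> 2.5', '3.0.0')   # outside the 2.x series
pinned_ok    = accepts?('= 2.1.1', '2.1.1')  # exact pin, as on naughty_or_nice
pinned_other = accepts?('= 2.1.1', '2.1.2')  # any other version is rejected
open_ended   = accepts?('>= 3.0', '4.0.6')   # no upper bound, as on public_suffix

puts [ruby_ok, ruby_too_new, pinned_ok, pinned_other, open_ended].inspect
```

The exact pin is the strictest of the three forms: a consumer of the gem cannot pick up even a patch release of `naughty_or_nice` without a new gman release.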
data/lib/gman.rb
CHANGED
@@ -1,38 +1,41 @@
+# frozen_string_literal: true
+
+$LOAD_PATH.unshift(File.dirname(__FILE__))
+
 require 'naughty_or_nice'
-require 'swot'
 require 'iso_country_codes'
 require 'csv'
 require_relative 'gman/version'
 require_relative 'gman/country_codes'
-require_relative 'gman/locality'
 require_relative 'gman/identifier'
 
 class Gman
   include NaughtyOrNice
 
+  autoload :DomainList, 'gman/domain_list'
+  autoload :Importer, 'gman/importer'
+  autoload :Locality, 'gman/locality'
+
   class << self
-    # returns an instance of our custom public suffix list
-    # list behaves like PublicSuffix::List
-    # but is limited to our whitelisted domains
     def list
-      @list ||=
+      @list ||= DomainList.new(path: list_path)
+    end
+
+    def academic_list
+      @academic_list ||= DomainList.new(path: academic_list_path)
     end
 
     def config_path
-      File.expand_path '../config', File.dirname(__FILE__)
+      @config_path ||= File.expand_path '../config', File.dirname(__FILE__)
     end
 
     # Returns the absolute path to the domain list
     def list_path
-
-        File.expand_path '../test/fixtures/domains.txt', File.dirname(__FILE__)
-      else
-        File.expand_path 'domains.txt', config_path
-      end
+      File.expand_path 'domains.txt', config_path
     end
 
-    def
-
+    def academic_list_path
+      File.expand_path 'vendor/academic.txt', config_path
     end
   end
 
@@ -43,25 +46,26 @@ class Gman
     @valid ||= begin
       return false unless valid_domain?
       return false if academic?
+
       locality? || public_suffix_valid?
     end
   end
 
+  def locality?
+    Locality.valid?(domain)
+  end
+
   private
 
   def valid_domain?
-
+    @valid_domain ||= !domain.nil? && !academic?
   end
 
   def academic?
-    domain &&
+    @academic ||= domain && Gman.academic_list.valid?(to_s)
   end
 
-  # domain is on the domain list and
-  # domain is not explicitly blacklisted and
-  # domain matches a standard public suffix list rule
   def public_suffix_valid?
-
-    !rule.nil? && rule.type != :exception && rule.allow?(".#{domain}")
+    @public_suffix_valid ||= Gman.list.valid?(to_s)
   end
 end
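Several predicates in the `lib/gman.rb` diff above (`valid_domain?`, `academic?`, `public_suffix_valid?`) now memoize with `||=`. One general Ruby behaviour is worth keeping in mind when reading them, offered here as an observation rather than something the diff states: `||=` only caches truthy results, so a `false` or `nil` answer is recomputed on every call, whereas a `defined?` guard (the pattern `country` uses in `country_codes.rb`) caches any value. The `Memo` class and its method names below are hypothetical and exist only to show the difference.

```ruby
# Hypothetical class comparing two memoization styles for a falsy result.
class Memo
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # `||=` re-runs the right-hand side whenever the cached value is false or nil.
  def flag_with_or_equals
    @flag_a ||= compute
  end

  # A `defined?` guard returns the cached value even when it is false or nil.
  def flag_with_defined
    return @flag_b if defined? @flag_b

    @flag_b = compute
  end

  private

  # Stand-in for an expensive check that happens to return false.
  def compute
    @calls += 1
    false
  end
end

memo = Memo.new
2.times { memo.flag_with_or_equals }
calls_after_or_equals = memo.calls
2.times { memo.flag_with_defined }
calls_total = memo.calls

puts "||= computed #{calls_after_or_equals} times, defined? added #{calls_total - calls_after_or_equals}"
```

For predicates that usually return `true` the difference is small; it matters mostly for inputs that are not on a list, where each call repeats the lookup.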
data/lib/gman/country_codes.rb
CHANGED
@@ -1,19 +1,21 @@
+# frozen_string_literal: true
+
 class Gman
   # Map last part of TLD to alpha2 country code
   ALPHA2_MAP = {
-    ac:
-    uk:
-    su:
-    tp:
-    yu:
-    gov:
-    mil:
-    org:
-    com:
-    net:
-    edu:
+    ac: 'sh',
+    uk: 'gb',
+    su: 'ru',
+    tp: 'tl',
+    yu: 'rs',
+    gov: 'us',
+    mil: 'us',
+    org: 'us',
+    com: 'us',
+    net: 'us',
+    edu: 'us',
     travel: 'us',
-    info:
+    info: 'us'
   }.freeze
 
   # Returns the two character alpha country code represented by the domain
@@ -21,13 +23,10 @@ class Gman
   # e.g., United States = US, United Kingdom = GB
   def alpha2
     return unless domain
+
     @alpha2 ||= begin
       alpha2 = domain.tld.split('.').last
-
-        ALPHA2_MAP[alpha2.to_sym]
-      else
-        alpha2
-      end
+      ALPHA2_MAP[alpha2.to_sym] || alpha2
     end
   end
 
@@ -38,6 +37,7 @@ class Gman
   # Gman.new("foo.gov").country.currency => "USD"
   def country
     return @country if defined? @country
+
     @country ||= begin
       IsoCountryCodes.find(alpha2) if alpha2
     rescue IsoCountryCodes::UnknownCodeError
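The `alpha2` change above replaces an `if`/`else` with a single `ALPHA2_MAP[...] || alpha2` fallback. A minimal standalone sketch of that lookup follows. It works on a plain TLD string instead of the gem's domain object, and `ALPHA2_OVERRIDES` and `alpha2_for` are hypothetical names holding a trimmed copy of the map shown in the diff.

```ruby
# Trimmed, hypothetical copy of the override table; the real map has more entries.
ALPHA2_OVERRIDES = { uk: 'gb', gov: 'us', mil: 'us' }.freeze

# Take the last label of the TLD, then apply an override if one exists.
# Labels without an override are assumed to already be a country code.
def alpha2_for(tld)
  last_label = tld.split('.').last
  ALPHA2_OVERRIDES[last_label.to_sym] || last_label
end

%w[gov.uk gov gouv.fr].each do |tld|
  puts "#{tld} => #{alpha2_for(tld)}"
end
```

The fallback works because a missing hash key returns `nil`, so `||` hands back the label itself.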
data/lib/gman/domain_list.rb
CHANGED
@@ -1,39 +1,106 @@
+# frozen_string_literal: true
+
 class Gman
   class DomainList
-
-
+    COMMENT_REGEX = %r{//[/\s]*(.*)$}i.freeze
+
+    attr_writer :data, :path, :contents
+
+    class << self
+      # The current, government domain list
+      def current
+        DomainList.new(path: Gman.list_path)
+      end
+
+      def from_file(path)
+        DomainList.new(path: path)
+      end
+
+      def from_hash(hash)
+        DomainList.new(data: hash)
+      end
 
-
+      def from_public_suffix(string)
+        DomainList.new(contents: string)
+      end
+      alias from_string from_public_suffix
+    end
+
+    def initialize(path: nil, contents: nil, data: nil)
+      @path = path
+      @contents = contents
+      @data = data.reject { |_, domains| domains.compact.empty? } if data
+    end
+
+    # Returns the raw content of the domain list as a string
+    def contents
+      @contents ||= if path
+                      File.new(path, 'r:utf-8').read
+                    else
+                      to_s
+                    end
+    end
+
+    # Returns the parsed contents of the domain list as a hash
+    # in the form of group => domains
+    def data
+      @data ||= string_to_hash(contents)
+    end
+    alias to_h data
 
-
-
+    # Returns the path to the domain list on disk
+    def path
+      @path ||= Gman.list_path
     end
 
+    # returns an instance of our custom public suffix list
+    # list behaves like PublicSuffix::List
+    # but is limited to our whitelisted domains
+    def public_suffix_list
+      @public_suffix_list ||= PublicSuffix::List.parse(contents)
+    end
+
+    # domain is on the domain list
+    def valid?(domain)
+      rule = public_suffix_list.find(domain, default: nil)
+      !(rule.nil? || rule.is_a?(PublicSuffix::Rule::Exception))
+    end
+
+    # Returns an array of strings representing the list groups
     def groups
-
+      data.keys
     end
 
+    # Return an array of strings representing all domains on the list
     def domains
-
+      data.values.flatten.compact.sort.uniq
     end
 
+    # Return the total number of domains in the list
     def count
       domains.count
     end
 
+    # Alphabetize groups and domains within each group
+    # We need to ensure exceptions appear after their corresponding rules
     def alphabetize
-      @
-      @
+      @data = data.sort_by { |k, _v| k.downcase }.to_h
+      @data.map do |_group, domains|
+        domains.sort! { |a, b| sort_with_exceptions(a, b) }
+        domains.uniq!
+      end
     end
 
+    # Write the domain list to disk
     def write
       alphabetize
-      File.write(
+      File.write(path, to_public_suffix)
     end
 
-
-
-
+    # The string representation of the domain list, in public suffix format
+    def to_s
+      current_group = output = +''
+      data.sort_by { |group, _| group.downcase }.each do |group, domains|
         if group != current_group
           output << "\n\n" unless current_group.empty? # first entry
           output << "// #{group}\n"
@@ -43,44 +110,59 @@ class Gman
       end
       output
     end
+    alias to_public_suffix to_s
 
-
-
-
+    # Given a domain, find any domain on the list that includes that domain
+    # E.g., `fcc.gov` would be the parent of `data.fcc.gov`
+    def parent_domain(domain)
+      domains.find { |c| domain =~ /\.#{Regexp.escape(c)}$/ }
     end
 
-
-
-
-
+    private
+
+    # Parse a public-suffix formatted string into a hash of groups => [domains]
+    def string_to_hash(string)
+      return unless string
+
+      lines = string_to_array(string)
+      array_to_hash(lines)
     end
 
-    def
-
+    def string_to_array(string)
+      string.gsub(/\r\n?/, "\n").split("\n")
     end
 
-
-
-
-
-
-
-
-
-        domains.each do |line|
-          if line =~ COMMENT_REGEX
-            group = COMMENT_REGEX.match(line)[1]
-          else
-            safe_push(domain_hash, group, line.downcase)
-          end
+    def array_to_hash(lines)
+      domain_hash = {}
+      group = ''
+      lines.each do |line|
+        if COMMENT_REGEX.match?(line)
+          group = COMMENT_REGEX.match(line)[1]
+        else
+          safe_push(domain_hash, group, line.downcase)
         end
-        domain_hash
       end
+      domain_hash
+    end
+
+    # Add a value to an array in a hash, creating the array if necessary
+    # hash - the hash
+    # key - the key within that hash to add the value to
+    # value - the single value to push into the array at hash[key]
+    def safe_push(hash, key, value)
+      return if value.empty?
+
+      hash[key] ||= []
+      hash[key].push value
+    end
 
-
-
-
-
+    def sort_with_exceptions(left, right)
+      if left.start_with?('!') && !right.start_with?('!')
+        1
+      elsif right.start_with?('!') && !left.start_with?('!')
+        -1
+      else
+        left <=> right
       end
     end
   end
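The `DomainList` diff above does three things that are easier to follow with data in hand: it parses public-suffix-style text into groups, it sorts each group so that `!exception` rules come after plain rules, and it finds the listed parent of a subdomain. The sketch below re-implements those three steps as free functions under stated assumptions. The sample text and the name `parse_groups` are made up for illustration, and `parent_domain` here uses `String#end_with?` in place of the regular expression in the diff.

```ruby
COMMENT = %r{//[/\s]*(.*)$}.freeze

# Parse public-suffix-style text into { group => [domains] }.
# A `// name` comment starts a new group; blank lines are skipped.
def parse_groups(text)
  group = ''
  text.gsub(/\r\n?/, "\n").split("\n").each_with_object({}) do |line, hash|
    if (match = COMMENT.match(line))
      group = match[1]
    elsif !line.empty?
      (hash[group] ||= []) << line.downcase
    end
  end
end

# Order rules alphabetically, but keep `!exception` rules after plain rules.
def sort_with_exceptions(left, right)
  left_exception = left.start_with?('!')
  right_exception = right.start_with?('!')
  return left <=> right if left_exception == right_exception

  left_exception ? 1 : -1
end

# Find a listed domain that the given host is a subdomain of.
def parent_domain(domains, host)
  domains.find { |candidate| host.end_with?(".#{candidate}") }
end

# Made-up sample input, not taken from config/domains.txt.
sample = "// Canada\ngc.ca\n!Example.gc.ca\ncanada.ca\n\n// US Federal\nfcc.gov\n"

groups = parse_groups(sample)
sorted = groups['Canada'].sort { |a, b| sort_with_exceptions(a, b) }
parent = parent_domain(groups.values.flatten, 'data.fcc.gov')

puts groups.inspect
puts sorted.inspect
puts parent
```

Keeping exceptions last matters because a plain `sort` would place `!`-prefixed entries first, since `!` sorts before letters in ASCII.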