gman 6.0.1 → 7.0.4

Files changed (73)
  1. checksums.yaml +5 -5
  2. data/.github/CODEOWNERS +3 -0
  3. data/.github/ISSUE_TEMPLATE/bug_report.md +28 -0
  4. data/.github/ISSUE_TEMPLATE/feature_request.md +21 -0
  5. data/.github/config.yml +23 -0
  6. data/.github/funding.yml +1 -0
  7. data/.github/no-response.yml +15 -0
  8. data/.github/release-drafter.yml +4 -0
  9. data/.github/settings.yml +33 -0
  10. data/.github/stale.yml +29 -0
  11. data/.gitignore +1 -0
  12. data/.rspec +2 -0
  13. data/.rubocop.yml +17 -5
  14. data/.rubocop_todo.yml +84 -0
  15. data/.ruby-version +1 -1
  16. data/Gemfile +2 -0
  17. data/bin/gman +6 -4
  18. data/bin/gman_filter +5 -7
  19. data/config/domains.txt +8446 -173
  20. data/config/vendor/academic.txt +8038 -0
  21. data/config/vendor/dotgovs.csv +5786 -5560
  22. data/docs/CODE_OF_CONDUCT.md +46 -0
  23. data/docs/CONTRIBUTING.md +92 -0
  24. data/{README.md → docs/README.md} +3 -3
  25. data/docs/SECURITY.md +3 -0
  26. data/docs/_config.yml +2 -0
  27. data/gman.gemspec +18 -17
  28. data/lib/gman.rb +25 -21
  29. data/lib/gman/country_codes.rb +17 -17
  30. data/lib/gman/domain_list.rb +123 -41
  31. data/lib/gman/identifier.rb +59 -21
  32. data/lib/gman/importer.rb +39 -40
  33. data/lib/gman/locality.rb +23 -21
  34. data/lib/gman/version.rb +3 -1
  35. data/script/add +2 -0
  36. data/script/alphabetize +2 -0
  37. data/script/cibuild +1 -1
  38. data/script/dedupe +2 -1
  39. data/script/profile +2 -1
  40. data/script/prune +5 -3
  41. data/script/reconcile-us +6 -3
  42. data/script/vendor +1 -1
  43. data/script/vendor-federal-de +3 -3
  44. data/script/vendor-municipal-de +3 -3
  45. data/script/vendor-nl +4 -1
  46. data/script/vendor-public-suffix +7 -6
  47. data/script/vendor-se +3 -3
  48. data/script/vendor-swot +43 -0
  49. data/script/vendor-us +8 -5
  50. data/spec/fixtures/domains.txt +4 -0
  51. data/{test → spec}/fixtures/obama.txt +0 -0
  52. data/spec/gman/bin_spec.rb +101 -0
  53. data/spec/gman/country_code_spec.rb +39 -0
  54. data/spec/gman/domain_list_spec.rb +110 -0
  55. data/spec/gman/domains_spec.rb +25 -0
  56. data/spec/gman/identifier_spec.rb +218 -0
  57. data/spec/gman/importer_spec.rb +236 -0
  58. data/spec/gman/locality_spec.rb +24 -0
  59. data/spec/gman_spec.rb +74 -0
  60. data/spec/spec_helper.rb +31 -0
  61. metadata +86 -73
  62. data/CONTRIBUTING.md +0 -22
  63. data/Rakefile +0 -22
  64. data/test/fixtures/domains.txt +0 -2
  65. data/test/helper.rb +0 -40
  66. data/test/test_gman.rb +0 -62
  67. data/test/test_gman_bin.rb +0 -75
  68. data/test/test_gman_country_codes.rb +0 -18
  69. data/test/test_gman_domains.rb +0 -33
  70. data/test/test_gman_filter.rb +0 -17
  71. data/test/test_gman_identifier.rb +0 -106
  72. data/test/test_gman_importer.rb +0 -250
  73. data/test/test_gman_locality.rb +0 -10
@@ -0,0 +1,46 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation.
6
+
7
+ ## Our Standards
8
+
9
+ Examples of behavior that contributes to creating a positive environment include:
10
+
11
+ * Using welcoming and inclusive language
12
+ * Being respectful of differing viewpoints and experiences
13
+ * Gracefully accepting constructive criticism
14
+ * Focusing on what is best for the community
15
+ * Showing empathy towards other community members
16
+
17
+ Examples of unacceptable behavior by participants include:
18
+
19
+ * The use of sexualized language or imagery and unwelcome sexual attention or advances
20
+ * Trolling, insulting/derogatory comments, and personal or political attacks
21
+ * Public or private harassment
22
+ * Publishing others' private information, such as a physical or electronic address, without explicit permission
23
+ * Other conduct which could reasonably be considered inappropriate in a professional setting
24
+
25
+ ## Our Responsibilities
26
+
27
+ Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
28
+
29
+ Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
30
+
31
+ ## Scope
32
+
33
+ This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
34
+
35
+ ## Enforcement
36
+
37
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at ben@balter.com. The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
38
+
39
+ Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
40
+
41
+ ## Attribution
42
+
43
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version]
44
+
45
+ [homepage]: http://contributor-covenant.org
46
+ [version]: http://contributor-covenant.org/version/1/4/
@@ -0,0 +1,92 @@
1
+ # Contributing to Gman
2
+
3
+ Hi there! We're thrilled that you'd like to contribute to Gman. Your help is essential for keeping it great.
4
+
5
+ Gman is an open source project supported by the efforts of an entire community and built one contribution at a time by users like you. We'd love for you to get involved. Whatever your level of skill or however much time you can give, your contribution is greatly appreciated. There are many ways to contribute, from writing tutorials or blog posts, improving the documentation, submitting bug reports and feature requests, helping other users by commenting on issues, or writing code which can be incorporated into Gman itself.
6
+
7
+ Following these guidelines helps to communicate that you respect the time of the developers managing and developing this open source project. In return, they should reciprocate that respect in addressing your issue, assessing changes, and helping you finalize your pull requests.
8
+
9
+
10
+
11
+ ## How to report a bug
12
+
13
+ Think you found a bug? Please check [the list of open issues](https://github.com/benbalter/gman/issues) to see if your bug has already been reported. If it hasn't, please [submit a new issue](https://github.com/benbalter/gman/issues/new).
14
+
15
+ Here are a few tips for writing *great* bug reports:
16
+
17
+ * Describe the specific problem (e.g., "widget doesn't turn clockwise" versus "getting an error")
18
+ * Include the steps to reproduce the bug, what you expected to happen, and what happened instead
19
+ * Check that you are using the latest version of the project and its dependencies
20
+ * Include what version of the project you're using, as well as any relevant dependencies
21
+ * Only include one bug per issue. If you have discovered two bugs, please file two issues
22
+ * Include screenshots or screencasts whenever possible
23
+ * Even if you don't know how to fix the bug, including a failing test may help others track it down
24
+
25
+ **If you find a security vulnerability, do not open an issue. Please email ben@balter.com instead.**
26
+
27
+ ## How to suggest a feature or enhancement
28
+
29
+ If you find yourself wishing for a feature that doesn't exist in Gman, you are probably not alone. There are bound to be others out there with similar needs. Many of the features that Gman has today have been added because our users saw the need.
30
+
31
+ Feature requests are welcome. But take a moment to find out whether your idea fits with the scope and goals of the project. It's up to you to make a strong case to convince the project's developers of the merits of this feature. Please provide as much detail and context as possible, including describing the problem you're trying to solve.
32
+
33
+ [Open an issue](https://github.com/benbalter/gman/issues/new) which describes the feature you would like to see, why you want it, how it should work, etc.
34
+
35
+ ## Domains
36
+
37
+ Domains live in [`config/domains.txt`](../config/domains.txt) as a list of TLDs and SLD+TLDs.
38
+
39
+ Right now, the only valid government top-level domains (TLDs) are `.gov` and `.mil`, which represent the US government. Secondary domains (e.g., `gov.uk` or `mil.au`) represent non-US government entities.
40
+
41
+ To add or remove a domain from the list of known government domains, simply edit the `domains.txt` file.
42
+
43
+
44
+ ## Your first contribution
45
+
46
+ We'd love for you to contribute to the project. Unsure where to begin contributing to Gman? You can start by looking through these "good first issue" and "help wanted" issues:
47
+
48
+ * [Good first issues](https://github.com/benbalter/gman/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) - issues which should only require a few lines of code and a test or two
49
+ * [Help wanted issues](https://github.com/benbalter/gman/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) - issues which may be a bit more involved, but are specifically seeking community contributions
50
+
51
+ *p.s. Feel free to ask for help; everyone is a beginner at first* :smiley_cat:
52
+
53
+ ## How to propose changes
54
+
55
+ Here are a few general guidelines for proposing changes:
56
+
57
+ * If you are changing any user-facing functionality, please be sure to update the documentation
58
+ * If you are adding a new behavior or changing an existing behavior, please be sure to update the corresponding test(s)
59
+ * Each pull request should implement **one** feature or bug fix. If you want to add or fix more than one thing, submit more than one pull request
60
+ * Do not commit changes to files that are irrelevant to your feature or bug fix
61
+ * Don't bump the version number in your pull request (it will be bumped prior to release)
62
+ * Write [a good commit message](http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html)
63
+
64
+ At a high level, [the process for proposing changes](https://guides.github.com/introduction/flow/) is:
65
+
66
+ 1. [Fork](https://github.com/benbalter/gman/fork) and clone the project
67
+ 2. Configure and install the dependencies: `script/bootstrap`
68
+ 3. Make sure the tests pass on your machine: `script/cibuild`
69
+ 4. Create a descriptively named branch: `git checkout -b my-branch-name`
70
+ 5. Make your change, add tests and documentation, and make sure the tests still pass
71
+ 6. Push to your fork and [submit a pull request](https://github.com/benbalter/gman/compare) describing your change
72
+ 7. Pat yourself on the back and wait for your pull request to be reviewed and merged
73
+
74
+ **Interested in submitting your first Pull Request?** It's easy! You can learn how from this *free* series [How to Contribute to an Open Source Project on GitHub](https://egghead.io/series/how-to-contribute-to-an-open-source-project-on-github)
75
+
76
+ ## Bootstrapping your local development environment
77
+
78
+ `script/bootstrap`
79
+
80
+ ## Running tests
81
+
82
+ `script/cibuild`
83
+
84
+ ## Code of conduct
85
+
86
+ This project is governed by [the Contributor Covenant Code of Conduct](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code.
87
+
88
+ ## Additional Resources
89
+
90
+ * [Contributing to Open Source on GitHub](https://guides.github.com/activities/contributing-to-open-source/)
91
+ * [Using Pull Requests](https://help.github.com/articles/using-pull-requests/)
92
+ * [GitHub Help](https://help.github.com)
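The "Domains" section of the contributing guide above describes `config/domains.txt` as a list of TLDs and SLD+TLDs. Elsewhere in this diff the file is treated as public-suffix-formatted text, where `//` comment lines name a group and the lines beneath it are that group's domains. The following is a minimal sketch of that reading, not the gem's implementation: the helper name `parse_domain_list` and the sample group names are assumptions made for illustration, and only the regular expression is copied from `lib/gman/domain_list.rb`.

```ruby
# Same pattern as COMMENT_REGEX in lib/gman/domain_list.rb (from this diff).
COMMENT_REGEX = %r{//[/\s]*(.*)$}i.freeze

# Hypothetical helper: parse public-suffix-style text into
# a hash of the form { group => [domains] }.
def parse_domain_list(string)
  groups = {}
  group = ''
  # Normalize line endings, then walk the file line by line.
  string.gsub(/\r\n?/, "\n").split("\n").each do |line|
    if (match = COMMENT_REGEX.match(line))
      group = match[1]                     # a comment line starts a new group
    elsif !line.empty?
      (groups[group] ||= []) << line.downcase
    end
  end
  groups
end

# Illustrative sample only; the group names are not taken from domains.txt.
SAMPLE_LIST = <<~LIST
  // United States Federal
  gov
  mil

  // United Kingdom
  gov.uk
LIST

PARSED_LIST = parse_domain_list(SAMPLE_LIST)
```

Under this reading, adding a domain amounts to adding one line beneath the appropriate group comment.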
@@ -1,6 +1,6 @@
1
- # Gman Gem
1
+ # Gman
2
2
 
3
- [![Build Status](https://travis-ci.org/benbalter/gman.png)](https://travis-ci.org/benbalter/gman) [![Gem Version](https://badge.fury.io/rb/gman.png)](http://badge.fury.io/rb/gman)
3
+ [![Build Status](https://travis-ci.org/benbalter/gman.png)](https://travis-ci.org/benbalter/gman) [![Gem Version](https://badge.fury.io/rb/gman.png)](http://badge.fury.io/rb/gman) [![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)
4
4
 
5
5
  A ruby gem to check if the owner of a given email address or website is working for THE MAN (a.k.a verifies government domains). It will also provide you with metadata about the domain, such as the country, state, city, or agency, where applicable. It does this by leveraging the power of [Naughty or Nice](https://github.com/benbalter/naughty_or_nice), the [Public Suffix List](http://publicsuffix.org/), and the associated [Ruby Gem](https://github.com/weppos/publicsuffix-ruby).
6
6
 
@@ -72,7 +72,7 @@ domain.country.name #=> "United States"
72
72
  domain.country.alpha2 #=> "US"
73
73
  domain.country.alpha3 #=> "USA"
74
74
  domain.country.currency #=> "USD"
75
- domain.conutry.calling_code #=> "+1"
75
+ domain.country.calling_code #=> "+1"
76
76
  ```
77
77
 
78
78
  ### Check if a country is on the US Sanctions list
@@ -0,0 +1,3 @@
1
+ # Security Policy
2
+
3
+ To report a security vulnerability, please email [ben@balter.com](mailto:ben@balter.com).
@@ -0,0 +1,2 @@
1
+ title: Gman
2
+ description: A ruby gem to verify if the owner of a given email address, website, or domain is a government agency.
@@ -1,14 +1,16 @@
1
+ # frozen_string_literal: true
2
+
1
3
  require File.expand_path './lib/gman/version', File.dirname(__FILE__)
2
4
 
3
5
  Gem::Specification.new do |s|
4
6
  s.name = 'gman'
5
- s.summary = <<-EOF
7
+ s.summary = <<-SUMMARY
6
8
  Check if a given domain or email address belongs to a government entity
7
- EOF
8
- s.description = <<-EOF
9
+ SUMMARY
10
+ s.description = <<-DESC
9
11
  A ruby gem to check if the owner of a given email address is working for
10
12
  THE MAN.
11
- EOF
13
+ DESC
12
14
  s.version = Gman::VERSION
13
15
  s.authors = ['Ben Balter']
14
16
  s.email = 'ben.balter@github.com'
@@ -20,24 +22,23 @@ Gem::Specification.new do |s|
20
22
  s.executables = `git ls-files -- bin/*`.split("\n").map do |f|
21
23
  File.basename(f)
22
24
  end
23
- s.require_paths = ['lib']
24
25
 
25
26
  s.require_paths = ['lib']
26
- s.required_ruby_version = '~> 2.0'
27
+ s.required_ruby_version = '~> 2.5'
27
28
 
28
- s.add_dependency('swot', '~> 1.0')
29
- s.add_dependency('iso_country_codes', '~> 0.6')
30
- s.add_dependency('naughty_or_nice', '~> 2.0')
31
29
  s.add_dependency('colored', '~> 1.2')
30
+ s.add_dependency('iso_country_codes', '~> 0.6')
31
+ s.add_dependency('naughty_or_nice', '= 2.1.1')
32
+ s.add_dependency('public_suffix', '>= 3.0')
32
33
 
33
- s.add_development_dependency('rake', '~> 10.4')
34
- s.add_development_dependency('shoulda', '~> 3.5')
35
- s.add_development_dependency('rdoc', '~> 4.2')
36
- s.add_development_dependency('bundler', '~> 1.10')
37
- s.add_development_dependency('pry', '~> 0.10')
38
- s.add_development_dependency('parallel', '~> 1.6')
39
- s.add_development_dependency('mechanize', '~> 2.7')
40
34
  s.add_development_dependency('addressable', '~> 2.3')
35
+ s.add_development_dependency('mechanize', '~> 2.7')
36
+ s.add_development_dependency('parallel', '~> 1.6')
37
+ s.add_development_dependency('pry', '~> 0.10')
38
+ s.add_development_dependency('rspec', '~> 3.5')
39
+ s.add_development_dependency('rubocop', '~> 1.0')
40
+ s.add_development_dependency('rubocop-performance', '~> 1.5')
41
+ s.add_development_dependency('rubocop-rspec', '~> 2.0')
41
42
  s.add_development_dependency('ruby-prof', '~> 0.15')
42
- s.add_development_dependency('rubocop', '~> 0.37')
43
+ s.add_development_dependency('swot', '~> 1.0')
43
44
  end
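The gemspec hunk above uses three different constraint operators (`~>`, `=`, and `>=`). As a rough illustration of what each accepts, the sketch below asks RubyGems itself through `Gem::Requirement` and `Gem::Version`. The helper name `allows?` and the version numbers probed are assumptions chosen for the example; they are not taken from the gem or its lockfile.

```ruby
require 'rubygems'

# Hypothetical helper: true when `version` satisfies the constraint string.
def allows?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# '~> 2.5' is the pessimistic operator: >= 2.5 and < 3.0.
RUBY_27_OK  = allows?('~> 2.5', '2.7.0')  # inside the range
RUBY_24_OK  = allows?('~> 2.5', '2.4.0')  # below the new floor
RUBY_30_OK  = allows?('~> 2.5', '3.0.0')  # next major version is excluded

# '= 2.1.1' pins one exact release.
PIN_EXACT   = allows?('= 2.1.1', '2.1.1')
PIN_PATCH   = allows?('= 2.1.1', '2.1.2')

# '>= 3.0' sets only a lower bound.
FLOOR_NEWER = allows?('>= 3.0', '4.0.7')
FLOOR_OLDER = allows?('>= 3.0', '2.0.5')
```

Read this way, moving `required_ruby_version` from `'~> 2.0'` to `'~> 2.5'` raises the minimum Ruby while still excluding 3.x, and the exact pin on `naughty_or_nice` trades flexibility for reproducibility.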
@@ -1,38 +1,41 @@
1
+ # frozen_string_literal: true
2
+
3
+ $LOAD_PATH.unshift(File.dirname(__FILE__))
4
+
1
5
  require 'naughty_or_nice'
2
- require 'swot'
3
6
  require 'iso_country_codes'
4
7
  require 'csv'
5
8
  require_relative 'gman/version'
6
9
  require_relative 'gman/country_codes'
7
- require_relative 'gman/locality'
8
10
  require_relative 'gman/identifier'
9
11
 
10
12
  class Gman
11
13
  include NaughtyOrNice
12
14
 
15
+ autoload :DomainList, 'gman/domain_list'
16
+ autoload :Importer, 'gman/importer'
17
+ autoload :Locality, 'gman/locality'
18
+
13
19
  class << self
14
- # returns an instance of our custom public suffix list
15
- # list behaves like PublicSuffix::List
16
- # but is limited to our whitelisted domains
17
20
  def list
18
- @list ||= PublicSuffix::List.parse(list_contents)
21
+ @list ||= DomainList.new(path: list_path)
22
+ end
23
+
24
+ def academic_list
25
+ @academic_list ||= DomainList.new(path: academic_list_path)
19
26
  end
20
27
 
21
28
  def config_path
22
- File.expand_path '../config', File.dirname(__FILE__)
29
+ @config_path ||= File.expand_path '../config', File.dirname(__FILE__)
23
30
  end
24
31
 
25
32
  # Returns the absolute path to the domain list
26
33
  def list_path
27
- if ENV['GMAN_STUB_DOMAINS']
28
- File.expand_path '../test/fixtures/domains.txt', File.dirname(__FILE__)
29
- else
30
- File.expand_path 'domains.txt', config_path
31
- end
34
+ File.expand_path 'domains.txt', config_path
32
35
  end
33
36
 
34
- def list_contents
35
- @list_contents ||= File.new(list_path, 'r:utf-8').read
37
+ def academic_list_path
38
+ File.expand_path 'vendor/academic.txt', config_path
36
39
  end
37
40
  end
38
41
 
@@ -43,25 +46,26 @@ class Gman
43
46
  @valid ||= begin
44
47
  return false unless valid_domain?
45
48
  return false if academic?
49
+
46
50
  locality? || public_suffix_valid?
47
51
  end
48
52
  end
49
53
 
54
+ def locality?
55
+ Locality.valid?(domain)
56
+ end
57
+
50
58
  private
51
59
 
52
60
  def valid_domain?
53
- domain && domain.valid? && !academic?
61
+ @valid_domain ||= !domain.nil? && !academic?
54
62
  end
55
63
 
56
64
  def academic?
57
- domain && Swot.is_academic?(domain)
65
+ @academic ||= domain && Gman.academic_list.valid?(to_s)
58
66
  end
59
67
 
60
- # domain is on the domain list and
61
- # domain is not explicitly blacklisted and
62
- # domain matches a standard public suffix list rule
63
68
  def public_suffix_valid?
64
- rule = Gman.list.find(to_s)
65
- !rule.nil? && rule.type != :exception && rule.allow?(".#{domain}")
69
+ @public_suffix_valid ||= Gman.list.valid?(to_s)
66
70
  end
67
71
  end
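Several predicates in the hunk above memoize their result with `||=`. One property of that idiom is worth keeping in mind when reading them: `||=` caches only truthy values, so a predicate that evaluates to `false` or `nil` is recomputed on every call. The sketch below uses a hypothetical `Probe` class (not part of Gman) to make the difference observable; the `defined?` guard mirrors the pattern the `country` method uses in `lib/gman/country_codes.rb`.

```ruby
# Hypothetical class for illustration; not part of Gman.
class Probe
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Stand-in for an expensive check whose answer happens to be false.
  def expensive_check
    @calls += 1
    false
  end

  # `||=` re-runs the right-hand side whenever the cached value is falsy.
  def with_or_equals
    @with_or_equals ||= expensive_check
  end

  # A `defined?` guard caches any value, including false and nil.
  def with_defined
    return @with_defined if defined? @with_defined

    @with_defined = expensive_check
  end
end

OR_EQUALS_PROBE = Probe.new
3.times { OR_EQUALS_PROBE.with_or_equals }

DEFINED_PROBE = Probe.new
3.times { DEFINED_PROBE.with_defined }
```

Both forms return the same answer; they differ only in how often the underlying check runs when that answer is falsy.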
@@ -1,19 +1,21 @@
1
+ # frozen_string_literal: true
2
+
1
3
  class Gman
2
4
  # Map last part of TLD to alpha2 country code
3
5
  ALPHA2_MAP = {
4
- ac: 'sh',
5
- uk: 'gb',
6
- su: 'ru',
7
- tp: 'tl',
8
- yu: 'rs',
9
- gov: 'us',
10
- mil: 'us',
11
- org: 'us',
12
- com: 'us',
13
- net: 'us',
14
- edu: 'us',
6
+ ac: 'sh',
7
+ uk: 'gb',
8
+ su: 'ru',
9
+ tp: 'tl',
10
+ yu: 'rs',
11
+ gov: 'us',
12
+ mil: 'us',
13
+ org: 'us',
14
+ com: 'us',
15
+ net: 'us',
16
+ edu: 'us',
15
17
  travel: 'us',
16
- info: 'us'
18
+ info: 'us'
17
19
  }.freeze
18
20
 
19
21
  # Returns the two character alpha country code represented by the domain
@@ -21,13 +23,10 @@ class Gman
21
23
  # e.g., United States = US, United Kingdom = GB
22
24
  def alpha2
23
25
  return unless domain
26
+
24
27
  @alpha2 ||= begin
25
28
  alpha2 = domain.tld.split('.').last
26
- if ALPHA2_MAP[alpha2.to_sym]
27
- ALPHA2_MAP[alpha2.to_sym]
28
- else
29
- alpha2
30
- end
29
+ ALPHA2_MAP[alpha2.to_sym] || alpha2
31
30
  end
32
31
  end
33
32
 
@@ -38,6 +37,7 @@ class Gman
38
37
  # Gman.new("foo.gov").country.currency => "USD"
39
38
  def country
40
39
  return @country if defined? @country
40
+
41
41
  @country ||= begin
42
42
  IsoCountryCodes.find(alpha2) if alpha2
43
43
  rescue IsoCountryCodes::UnknownCodeError
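The `alpha2` refactor above replaces an `if`/`else` with a single `||` fallback. A minimal standalone sketch of that lookup follows, assuming a helper `alpha2_for` that takes a TLD string directly (the real method reads `domain.tld`) and using only a subset of the map from the diff.

```ruby
# Subset of the ALPHA2_MAP shown in the diff.
ALPHA2_MAP = { uk: 'gb', gov: 'us', mil: 'us' }.freeze

# Hypothetical helper: accepts a TLD string such as "gov.uk"
# rather than a domain object.
def alpha2_for(tld)
  last = tld.split('.').last
  # Use the mapped code when one exists; otherwise the last label is
  # assumed to already be an alpha2 country code.
  ALPHA2_MAP[last.to_sym] || last
end

UK_CODE = alpha2_for('gov.uk')   # mapped: uk -> gb
US_CODE = alpha2_for('gov')      # mapped: gov -> us
FR_CODE = alpha2_for('gouv.fr')  # unmapped: falls through to "fr"
```

The fallback works because every value in the map is a non-empty string, so a hit is always truthy.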
@@ -1,39 +1,106 @@
1
+ # frozen_string_literal: true
2
+
1
3
  class Gman
2
4
  class DomainList
3
- attr_accessor :list
4
- alias to_h list
5
+ COMMENT_REGEX = %r{//[/\s]*(.*)$}i.freeze
6
+
7
+ attr_writer :data, :path, :contents
8
+
9
+ class << self
10
+ # The current, government domain list
11
+ def current
12
+ DomainList.new(path: Gman.list_path)
13
+ end
14
+
15
+ def from_file(path)
16
+ DomainList.new(path: path)
17
+ end
18
+
19
+ def from_hash(hash)
20
+ DomainList.new(data: hash)
21
+ end
5
22
 
6
- COMMENT_REGEX = %r{//[/\s]*(.*)$}i
23
+ def from_public_suffix(string)
24
+ DomainList.new(contents: string)
25
+ end
26
+ alias from_string from_public_suffix
27
+ end
28
+
29
+ def initialize(path: nil, contents: nil, data: nil)
30
+ @path = path
31
+ @contents = contents
32
+ @data = data.reject { |_, domains| domains.compact.empty? } if data
33
+ end
34
+
35
+ # Returns the raw content of the domain list as a string
36
+ def contents
37
+ @contents ||= if path
38
+ File.new(path, 'r:utf-8').read
39
+ else
40
+ to_s
41
+ end
42
+ end
43
+
44
+ # Returns the parsed contents of the domain list as a hash
45
+ # in the form for group => domains
46
+ def data
47
+ @data ||= string_to_hash(contents)
48
+ end
49
+ alias to_h data
7
50
 
8
- def initialize(list)
9
- @list = list.reject { |_group, domains| domains.compact.empty? }
51
+ # Returns the path to the domain list on disk
52
+ def path
53
+ @path ||= Gman.list_path
10
54
  end
11
55
 
56
+ # returns an instance of our custom public suffix list
57
+ # list behaves like PublicSuffix::List
58
+ # but is limited to our whitelisted domains
59
+ def public_suffix_list
60
+ @public_suffix_list ||= PublicSuffix::List.parse(contents)
61
+ end
62
+
63
+ # domain is on the domain list
64
+ def valid?(domain)
65
+ rule = public_suffix_list.find(domain, default: nil)
66
+ !(rule.nil? || rule.is_a?(PublicSuffix::Rule::Exception))
67
+ end
68
+
69
+ # Returns an array of strings representing the list groups
12
70
  def groups
13
- list.keys
71
+ data.keys
14
72
  end
15
73
 
74
+ # Return an array of strings representing all domains on the list
16
75
  def domains
17
- list.values.flatten.compact.sort.uniq
76
+ data.values.flatten.compact.sort.uniq
18
77
  end
19
78
 
79
+ # Return the total number of domains in the list
20
80
  def count
21
81
  domains.count
22
82
  end
23
83
 
84
+ # Alphabetize groups and domains within each group
85
+ # We need to ensure exceptions appear after their corresponding rules
24
86
  def alphabetize
25
- @list = @list.sort_by { |k, _v| k.downcase }.to_h
26
- @list.each { |_group, domains| domains.sort!.uniq! }
87
+ @data = data.sort_by { |k, _v| k.downcase }.to_h
88
+ @data.map do |_group, domains|
89
+ domains.sort! { |a, b| sort_with_exceptions(a, b) }
90
+ domains.uniq!
91
+ end
27
92
  end
28
93
 
94
+ # Write the domain list to disk
29
95
  def write
30
96
  alphabetize
31
- File.write(Gman.list_path, to_public_suffix)
97
+ File.write(path, to_public_suffix)
32
98
  end
33
99
 
34
- def to_public_suffix
35
- current_group = output = ''
36
- list.sort_by { |group, _domains| group.downcase }.each do |group, domains|
100
+ # The string representation of the domain list, in public suffix format
101
+ def to_s
102
+ current_group = output = +''
103
+ data.sort_by { |group, _| group.downcase }.each do |group, domains|
37
104
  if group != current_group
38
105
  output << "\n\n" unless current_group.empty? # first entry
39
106
  output << "// #{group}\n"
@@ -43,44 +110,59 @@ class Gman
43
110
  end
44
111
  output
45
112
  end
113
+ alias to_public_suffix to_s
46
114
 
47
- def self.current
48
- current = File.open(Gman.list_path).read
49
- DomainList.from_public_suffix(current)
115
+ # Given a domain, find any domain on the list that includes that domain
116
+ # E.g., `fcc.gov` would be the parent of `data.fcc.gov`
117
+ def parent_domain(domain)
118
+ domains.find { |c| domain =~ /\.#{Regexp.escape(c)}$/ }
50
119
  end
51
120
 
52
- def self.from_public_suffix(string)
53
- string = string.gsub(/\r\n?/, "\n").split("\n")
54
- hash = array_to_hash(string)
55
- DomainList.new(hash)
121
+ private
122
+
123
+ # Parse a public-suffix formatted string into a hash of groups => [domains]
124
+ def string_to_hash(string)
125
+ return unless string
126
+
127
+ lines = string_to_array(string)
128
+ array_to_hash(lines)
56
129
  end
57
130
 
58
- def parent_domain(domain)
59
- domains.find { |c| domain =~ /\.#{Regexp.escape(c)}$/ }
131
+ def string_to_array(string)
132
+ string.gsub(/\r\n?/, "\n").split("\n")
60
133
  end
61
134
 
62
- class << self
63
- private
64
-
65
- # Given an array of comments/domains in public suffix format
66
- # Converts to a hash in the form of :group => [domain1, domain2...]
67
- def array_to_hash(domains)
68
- domain_hash = {}
69
- group = ''
70
- domains.each do |line|
71
- if line =~ COMMENT_REGEX
72
- group = COMMENT_REGEX.match(line)[1]
73
- else
74
- safe_push(domain_hash, group, line.downcase)
75
- end
135
+ def array_to_hash(lines)
136
+ domain_hash = {}
137
+ group = ''
138
+ lines.each do |line|
139
+ if COMMENT_REGEX.match?(line)
140
+ group = COMMENT_REGEX.match(line)[1]
141
+ else
142
+ safe_push(domain_hash, group, line.downcase)
76
143
  end
77
- domain_hash
78
144
  end
145
+ domain_hash
146
+ end
147
+
148
+ # Add a value to an array in a hash, creating the array if necessary
149
+ # hash - the hash
150
+ # key - the key within that hash to add the value to
151
+ # value - the single value to push into the array at hash[key]
152
+ def safe_push(hash, key, value)
153
+ return if value.empty?
154
+
155
+ hash[key] ||= []
156
+ hash[key].push value
157
+ end
79
158
 
80
- def safe_push(hash, key, value)
81
- return if value.empty?
82
- hash[key] ||= []
83
- hash[key].push value
159
+ def sort_with_exceptions(left, right)
160
+ if left.start_with?('!') && !right.start_with?('!')
161
+ 1
162
+ elsif right.start_with?('!') && !left.start_with?('!')
163
+ -1
164
+ else
165
+ left <=> right
84
166
  end
85
167
  end
86
168
  end
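Two behaviors in the `DomainList` diff above are easier to follow with concrete input: the comparator that keeps `!`-prefixed exception entries after ordinary rules, and the suffix match behind `parent_domain`. The sketch below restates both as standalone methods. The sample entries are hypothetical, and `parent_domain` here takes the list as an argument, which differs from the instance method in the diff.

```ruby
# Adapted from sort_with_exceptions in the diff: entries starting with '!'
# (public suffix exceptions) sort after ordinary rules.
def sort_with_exceptions(left, right)
  left_exception = left.start_with?('!')
  right_exception = right.start_with?('!')
  return 1 if left_exception && !right_exception
  return -1 if right_exception && !left_exception

  left <=> right
end

# Hypothetical sample entries, not taken from config/domains.txt.
ENTRIES = ['!city.example.us', 'zeta.example', 'alpha.example',
           '!agency.example.us'].freeze

# A plain sort puts '!' entries first, because '!' precedes letters in ASCII.
PLAIN_SORTED = ENTRIES.sort
EXCEPTION_SORTED = ENTRIES.sort { |a, b| sort_with_exceptions(a, b) }

# Standalone variant of parent_domain: find a listed domain that the given
# domain is a subdomain of. The leading "\." requires a label boundary.
def parent_domain(domains, domain)
  domains.find { |c| domain =~ /\.#{Regexp.escape(c)}$/ }
end

PARENT = parent_domain(['fcc.gov'], 'data.fcc.gov')
```

Because the pattern requires a leading dot, a listed domain is not reported as its own parent, and `notfcc.gov` does not match `fcc.gov`.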