LittleWeasel 3.0.4 → 4.0.0

Files changed (152)
  1. checksums.yaml +5 -5
  2. data/.gitignore +3 -0
  3. data/.reek.yml +17 -0
  4. data/.rspec +4 -2
  5. data/.rubocop.yml +187 -0
  6. data/.ruby-version +1 -1
  7. data/.yardopts +2 -0
  8. data/Gemfile +3 -1
  9. data/LittleWeasel.gemspec +31 -18
  10. data/README.md +408 -42
  11. data/Rakefile +296 -3
  12. data/lib/LittleWeasel.rb +5 -184
  13. data/lib/LittleWeasel/block_results.rb +81 -0
  14. data/lib/LittleWeasel/configure.rb +98 -0
  15. data/lib/LittleWeasel/dictionary.rb +125 -0
  16. data/lib/LittleWeasel/dictionary_key.rb +48 -0
  17. data/lib/LittleWeasel/dictionary_manager.rb +85 -0
  18. data/lib/LittleWeasel/errors/dictionary_file_already_loaded_error.rb +9 -0
  19. data/lib/LittleWeasel/errors/dictionary_file_empty_error.rb +8 -0
  20. data/lib/LittleWeasel/errors/dictionary_file_not_found_error.rb +8 -0
  21. data/lib/LittleWeasel/errors/dictionary_file_too_large_error.rb +16 -0
  22. data/lib/LittleWeasel/errors/language_required_error.rb +8 -0
  23. data/lib/LittleWeasel/errors/must_override_error.rb +8 -0
  24. data/lib/LittleWeasel/filters/en_us/currency_filter.rb +19 -0
  25. data/lib/LittleWeasel/filters/en_us/numeric_filter.rb +19 -0
  26. data/lib/LittleWeasel/filters/en_us/single_character_word_filter.rb +21 -0
  27. data/lib/LittleWeasel/filters/word_filter.rb +59 -0
  28. data/lib/LittleWeasel/filters/word_filter_managable.rb +80 -0
  29. data/lib/LittleWeasel/filters/word_filter_validatable.rb +31 -0
  30. data/lib/LittleWeasel/filters/word_filterable.rb +19 -0
  31. data/lib/LittleWeasel/filters/word_filters_validatable.rb +29 -0
  32. data/lib/LittleWeasel/metadata/dictionary_metadata.rb +145 -0
  33. data/lib/LittleWeasel/metadata/invalid_words_metadata.rb +134 -0
  34. data/lib/LittleWeasel/metadata/invalid_words_service_results.rb +45 -0
  35. data/lib/LittleWeasel/metadata/metadata_observable_validatable.rb +22 -0
  36. data/lib/LittleWeasel/metadata/metadata_observerable.rb +90 -0
  37. data/lib/LittleWeasel/metadata/metadatable.rb +136 -0
  38. data/lib/LittleWeasel/modules/class_name_to_symbol.rb +26 -0
  39. data/lib/LittleWeasel/modules/configurable.rb +26 -0
  40. data/lib/LittleWeasel/modules/deep_dup.rb +11 -0
  41. data/lib/LittleWeasel/modules/dictionary_cache_keys.rb +34 -0
  42. data/lib/LittleWeasel/modules/dictionary_cache_servicable.rb +26 -0
  43. data/lib/LittleWeasel/modules/dictionary_cache_validatable.rb +20 -0
  44. data/lib/LittleWeasel/modules/dictionary_creator_servicable.rb +27 -0
  45. data/lib/LittleWeasel/modules/dictionary_file_loader.rb +67 -0
  46. data/lib/LittleWeasel/modules/dictionary_key_validatable.rb +19 -0
  47. data/lib/LittleWeasel/modules/dictionary_keyable.rb +24 -0
  48. data/lib/LittleWeasel/modules/dictionary_loader_servicable.rb +27 -0
  49. data/lib/LittleWeasel/modules/dictionary_metadata_servicable.rb +29 -0
  50. data/lib/LittleWeasel/modules/dictionary_metadata_validatable.rb +17 -0
  51. data/lib/LittleWeasel/modules/dictionary_sourceable.rb +26 -0
  52. data/lib/LittleWeasel/modules/dictionary_validatable.rb +30 -0
  53. data/lib/LittleWeasel/modules/language.rb +23 -0
  54. data/lib/LittleWeasel/modules/language_validatable.rb +16 -0
  55. data/lib/LittleWeasel/modules/locale.rb +40 -0
  56. data/lib/LittleWeasel/modules/order_validatable.rb +18 -0
  57. data/lib/LittleWeasel/modules/orderable.rb +17 -0
  58. data/lib/LittleWeasel/modules/region.rb +23 -0
  59. data/lib/LittleWeasel/modules/region_validatable.rb +16 -0
  60. data/lib/LittleWeasel/modules/tag_validatable.rb +16 -0
  61. data/lib/LittleWeasel/modules/taggable.rb +31 -0
  62. data/lib/LittleWeasel/modules/word_results_validatable.rb +28 -0
  63. data/lib/LittleWeasel/preprocessors/en_us/capitalize_preprocessor.rb +22 -0
  64. data/lib/LittleWeasel/preprocessors/preprocessed_word.rb +28 -0
  65. data/lib/LittleWeasel/preprocessors/preprocessed_word_validatable.rb +55 -0
  66. data/lib/LittleWeasel/preprocessors/preprocessed_words.rb +55 -0
  67. data/lib/LittleWeasel/preprocessors/preprocessed_words_validatable.rb +27 -0
  68. data/lib/LittleWeasel/preprocessors/word_preprocessable.rb +19 -0
  69. data/lib/LittleWeasel/preprocessors/word_preprocessor.rb +122 -0
  70. data/lib/LittleWeasel/preprocessors/word_preprocessor_managable.rb +114 -0
  71. data/lib/LittleWeasel/preprocessors/word_preprocessor_validatable.rb +40 -0
  72. data/lib/LittleWeasel/preprocessors/word_preprocessors_validatable.rb +24 -0
  73. data/lib/LittleWeasel/services/dictionary_cache_service.rb +262 -0
  74. data/lib/LittleWeasel/services/dictionary_creator_service.rb +94 -0
  75. data/lib/LittleWeasel/services/dictionary_file_loader_service.rb +37 -0
  76. data/lib/LittleWeasel/services/dictionary_killer_service.rb +35 -0
  77. data/lib/LittleWeasel/services/dictionary_loader_service.rb +59 -0
  78. data/lib/LittleWeasel/services/dictionary_metadata_service.rb +114 -0
  79. data/lib/LittleWeasel/services/invalid_words_service.rb +59 -0
  80. data/lib/LittleWeasel/version.rb +3 -1
  81. data/lib/LittleWeasel/word_results.rb +146 -0
  82. data/spec/factories/dictionary.rb +43 -0
  83. data/spec/factories/dictionary_cache_service.rb +95 -0
  84. data/spec/factories/dictionary_creator_service.rb +16 -0
  85. data/spec/factories/dictionary_file_loader_service.rb +13 -0
  86. data/spec/factories/dictionary_hash.rb +39 -0
  87. data/spec/factories/dictionary_key.rb +14 -0
  88. data/spec/factories/dictionary_killer_service.rb +14 -0
  89. data/spec/factories/dictionary_loader_service.rb +14 -0
  90. data/spec/factories/dictionary_manager.rb +10 -0
  91. data/spec/factories/dictionary_metadata.rb +16 -0
  92. data/spec/factories/dictionary_metadata_service.rb +16 -0
  93. data/spec/factories/numeric_filter.rb +12 -0
  94. data/spec/factories/preprocessed_word.rb +16 -0
  95. data/spec/factories/preprocessed_words.rb +41 -0
  96. data/spec/factories/single_character_word_filter.rb +12 -0
  97. data/spec/factories/word_results.rb +16 -0
  98. data/spec/lib/LittleWeasel/block_results_spec.rb +248 -0
  99. data/spec/lib/LittleWeasel/configure_spec.rb +74 -0
  100. data/spec/lib/LittleWeasel/dictionary_key_spec.rb +118 -0
  101. data/spec/lib/LittleWeasel/dictionary_manager_spec.rb +116 -0
  102. data/spec/lib/LittleWeasel/dictionary_spec.rb +289 -0
  103. data/spec/lib/LittleWeasel/filters/en_us/currency_filter_spec.rb +80 -0
  104. data/spec/lib/LittleWeasel/filters/en_us/numeric_filter_spec.rb +66 -0
  105. data/spec/lib/LittleWeasel/filters/en_us/single_character_word_filter_spec.rb +58 -0
  106. data/spec/lib/LittleWeasel/filters/word_filter_managable_spec.rb +180 -0
  107. data/spec/lib/LittleWeasel/filters/word_filter_spec.rb +151 -0
  108. data/spec/lib/LittleWeasel/filters/word_filter_validatable_spec.rb +94 -0
  109. data/spec/lib/LittleWeasel/filters/word_filters_validatable_spec.rb +48 -0
  110. data/spec/lib/LittleWeasel/integraton_tests/dictionary_integration_spec.rb +201 -0
  111. data/spec/lib/LittleWeasel/metadata/dictionary_creator_servicable_spec.rb +54 -0
  112. data/spec/lib/LittleWeasel/metadata/dictionary_metadata_spec.rb +209 -0
  113. data/spec/lib/LittleWeasel/metadata/invalid_words_metadata_spec.rb +155 -0
  114. data/spec/lib/LittleWeasel/metadata/metadata_observerable_spec.rb +31 -0
  115. data/spec/lib/LittleWeasel/metadata/metadatable_spec.rb +35 -0
  116. data/spec/lib/LittleWeasel/modules/class_name_to_symbol_spec.rb +21 -0
  117. data/spec/lib/LittleWeasel/modules/dictionary_file_loader_spec.rb +125 -0
  118. data/spec/lib/LittleWeasel/modules/dictionary_sourceable_spec.rb +44 -0
  119. data/spec/lib/LittleWeasel/modules/language_spec.rb +52 -0
  120. data/spec/lib/LittleWeasel/modules/locale_spec.rb +140 -0
  121. data/spec/lib/LittleWeasel/modules/region_spec.rb +52 -0
  122. data/spec/lib/LittleWeasel/preprocessors/en_us/capitalize_preprocessor_spec.rb +34 -0
  123. data/spec/lib/LittleWeasel/preprocessors/preprocessed_word_spec.rb +105 -0
  124. data/spec/lib/LittleWeasel/preprocessors/preprocessed_word_validatable_spec.rb +143 -0
  125. data/spec/lib/LittleWeasel/preprocessors/preprocessed_words_spec.rb +77 -0
  126. data/spec/lib/LittleWeasel/preprocessors/preprocessed_words_validatable_spec.rb +58 -0
  127. data/spec/lib/LittleWeasel/preprocessors/word_preprocessor_managable_spec.rb +216 -0
  128. data/spec/lib/LittleWeasel/preprocessors/word_preprocessor_spec.rb +175 -0
  129. data/spec/lib/LittleWeasel/preprocessors/word_preprocessor_validatable_spec.rb +109 -0
  130. data/spec/lib/LittleWeasel/preprocessors/word_preprocessors_validatable_spec.rb +49 -0
  131. data/spec/lib/LittleWeasel/services/dictionary_cache_service_spec.rb +444 -0
  132. data/spec/lib/LittleWeasel/services/dictionary_creator_service_spec.rb +119 -0
  133. data/spec/lib/LittleWeasel/services/dictionary_file_loader_service_spec.rb +71 -0
  134. data/spec/lib/LittleWeasel/services/dictionary_loader_service_spec.rb +50 -0
  135. data/spec/lib/LittleWeasel/services/dictionary_metadata_service_spec.rb +279 -0
  136. data/spec/lib/LittleWeasel/word_results_spec.rb +275 -0
  137. data/spec/lib/LittleWeasel/workflow/workflow_spec.rb +20 -0
  138. data/spec/spec_helper.rb +117 -6
  139. data/spec/support/factory_bot.rb +15 -0
  140. data/spec/support/file_helpers.rb +32 -0
  141. data/spec/support/files/empty-dictionary.txt +0 -0
  142. data/{lib/dictionary → spec/support/files/en-US-big.txt} +262156 -31488
  143. data/spec/support/files/en-US-tagged.txt +26 -0
  144. data/spec/support/files/en-US.txt +26 -0
  145. data/spec/support/files/en.txt +26 -0
  146. data/spec/support/files/es-ES.txt +27 -0
  147. data/spec/support/files/es.txt +27 -0
  148. data/spec/support/general_helpers.rb +68 -0
  149. data/spec/support/shared_contexts.rb +108 -0
  150. data/spec/support/shared_examples.rb +105 -0
  151. metadata +408 -65
  152. data/spec/checker/checker_spec.rb +0 -286
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: df58b4978d62918800204788d128be7549284bee
4
- data.tar.gz: 9a023735953290cea1a6e8f52f14110d86ee8d64
2
+ SHA256:
3
+ metadata.gz: d55524bb27846962bd5cd9b83bf2b141c6f34ca723721630d980963be51f1e24
4
+ data.tar.gz: 8f7efc3fb0db218dd6355f194b92352d12f901955c5388fe8f97e108cac7426b
5
5
  SHA512:
6
- metadata.gz: 67e12da5910b23fcf50a2a8d3f3fcc28c1c31d87dd3e1df05727359b1ddb362f95c917e00646d568b0d1558530a7379b1b7eb3ee6eb2a8d321d2282b093fd984
7
- data.tar.gz: 2155e97a76545a9b5608b7ae7d7cf12c8bb0b26403d1ad7435a42efabc0ae332e6c19775a63910488341222e4e7fb2764c0efc8f3383a126cd2c05c41ab53e72
6
+ metadata.gz: 4027802eab13cb465e419dc8a95a6b3d6d905b6fffcc772bb33d485239fe2260bef778a0f33b526d5bc040dff59fb2c5cfb712b4a47cd4aa29e117c603fea886
7
+ data.tar.gz: a54d138d51f8c66b42ff0a17880f6633370207223822dd6200766fd00fa70e6e19c4a29a9148183b586ed49f35eaf58d22b52d745059799f7147ec6a5c07a96d
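Note on the checksum change: the hunk above replaces the SHA1 entries with SHA256 and keeps SHA512. As a rough illustration of how such a digest could be checked locally, the sketch below compares a file's SHA256 hex digest with an expected value. `sha256_matches?` is a hypothetical helper (it is not part of RubyGems or LittleWeasel), and the throwaway file merely stands in for an unpacked `metadata.gz` or `data.tar.gz`.

```ruby
require 'digest'
require 'tempfile'

# Hypothetical helper, for illustration only: returns true when the SHA256
# hex digest of the file at +path+ equals +expected_hex+ (case-insensitive).
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end

# Demonstrate against a temporary file instead of a real gem archive.
Tempfile.create('metadata') do |file|
  file.write('example archive bytes')
  file.flush

  # In practice the expected value would be copied from checksums.yaml.
  expected = Digest::SHA256.hexdigest('example archive bytes')
  puts sha256_matches?(file.path, expected)
end
```

The same comparison would work for the SHA512 entries by swapping in `Digest::SHA512`.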
data/.gitignore CHANGED
@@ -16,3 +16,6 @@ spec/reports
16
16
  test/tmp
17
17
  test/version_tmp
18
18
  tmp
19
+ LittleWeasel.sublime-*
20
+ scratch.txt
21
+ readme.txt
data/.reek.yml ADDED
@@ -0,0 +1,17 @@
1
+ exclude_paths:
2
+ - vendor
3
+ detectors:
4
+ TooManyInstanceVariables:
5
+ exclude:
6
+ - "LittleWeasel::Configuration"
7
+ - "LittleWeasel::WordResults"
8
+ # private methods do not have to depend on instance state
9
+ # https://github.com/troessner/reek/blob/master/docs/Utility-Function.md
10
+ UtilityFunction:
11
+ public_methods_only: true
12
+ # Check for variable name that doesn't communicate its intent well enough
13
+ # https://github.com/troessner/reek/blob/master/docs/Uncommunicative-Variable-Name.md
14
+ UncommunicativeVariableName:
15
+ accept:
16
+ - /^_$/
17
+ - /^e$/
data/.rspec CHANGED
@@ -1,2 +1,4 @@
1
- --color
2
- --profile
1
+ --require spec_helper
2
+ --format d
3
+ --force-color
4
+ --warnings
data/.rubocop.yml ADDED
@@ -0,0 +1,187 @@
1
+ require:
2
+ - rubocop-performance
3
+ - rubocop-rspec
4
+
5
+ AllCops:
6
+ TargetRubyVersion: 3.0.1
7
+ NewCops: enable
8
+ Exclude:
9
+ - '.git/**/*'
10
+ - '.idea/**/*'
11
+ - 'init/*'
12
+ - 'Rakefile'
13
+ - '*.gemspec'
14
+ - 'spec/**/*'
15
+ - 'vendor/**/*'
16
+
17
+ # Align the elements of a hash literal if they span more than one line.
18
+ Layout/HashAlignment:
19
+ EnforcedLastArgumentHashStyle: always_ignore
20
+
21
+ # Alignment of parameters in multi-line method definition.
22
+ # The `with_fixed_indentation` style aligns the following lines with one
23
+ # level of indentation relative to the start of the line with the method
24
+ # definition.
25
+ #
26
+ # def my_method(a,
27
+ # b)
28
+ Layout/ParameterAlignment:
29
+ EnforcedStyle: with_fixed_indentation
30
+
31
+ # Alignment of parameters in multi-line method call.
32
+ # The `with_fixed_indentation` style aligns the following lines with one
33
+ # level of indentation relative to the start of the line with the method call.
34
+ #
35
+ # my_method(a,
36
+ # b)
37
+ Layout/ArgumentAlignment:
38
+ EnforcedStyle: with_fixed_indentation
39
+
40
+ # a = case n
41
+ # when 0
42
+ # x * 2
43
+ # else
44
+ # y / 3
45
+ # end
46
+ Layout/CaseIndentation:
47
+ EnforcedStyle: end
48
+
49
+ # Enforces a configured order of definitions within a class body
50
+ Layout/ClassStructure:
51
+ Enabled: true
52
+
53
+ # Align `end` with the matching keyword or starting expression except for
54
+ # assignments, where it should be aligned with the LHS.
55
+ Layout/EndAlignment:
56
+ EnforcedStyleAlignWith: variable
57
+ AutoCorrect: true
58
+
59
+ # The `consistent` style enforces that the first element in an array
60
+ # literal where the opening bracket and the first element are on
61
+ # separate lines is indented the same as an array literal which is not
62
+ # defined inside a method call.
63
+ Layout/FirstArrayElementIndentation:
64
+ EnforcedStyle: consistent
65
+
66
+ # The `consistent` style enforces that the first key in a hash
67
+ # literal where the opening brace and the first key are on
68
+ # separate lines is indented the same as a hash literal which is not
69
+ # defined inside a method call.
70
+ Layout/FirstHashElementIndentation:
71
+ EnforcedStyle: consistent
72
+
73
+ # Indent multi-line methods instead of aligning with periods
74
+ Layout/MultilineMethodCallIndentation:
75
+ EnforcedStyle: indented
76
+
77
+ # Allow `debug` in tasks for now
78
+ Lint/Debugger:
79
+ Exclude:
80
+ - 'RakeFile'
81
+
82
+ # A calculated magnitude based on number of assignments, branches, and
83
+ # conditions.
84
+ # NOTE: This is temporarily disabled until we can eliminate existing Rubocop
85
+ # complaints
86
+ Metrics/AbcSize:
87
+ Enabled: false
88
+
89
+ # Avoid long blocks with many lines.
90
+ Metrics/BlockLength:
91
+ Exclude:
92
+ - 'RakeFile'
93
+ - 'db/seeds.rb'
94
+ - 'spec/**/*.rb'
95
+
96
+ # Avoid classes longer than 100 lines of code.
97
+ # NOTE: This is temporarily disabled until we can eliminate existing Rubocop
98
+ # complaints
99
+ Metrics/ClassLength:
100
+ Max: 200
101
+ Exclude:
102
+ - 'spec/**/*.rb'
103
+
104
+ # A complexity metric that is strongly correlated to the number of test cases
105
+ # needed to validate a method.
106
+ Metrics/CyclomaticComplexity:
107
+ Max: 9
108
+
109
+ # Limit lines to 80 characters
110
+ Layout/LineLength:
111
+ Exclude:
112
+ - 'RakeFile'
113
+ - 'spec/**/*.rb'
114
+
115
+ # Avoid methods longer than 15 lines of code.
116
+ Metrics/MethodLength:
117
+ Max: 20
118
+ IgnoredMethods:
119
+ - swagger_path
120
+ - operation
121
+
122
+
123
+ # A complexity metric geared towards measuring complexity for a human reader.
124
+ Metrics/PerceivedComplexity:
125
+ Max: 10
126
+
127
+ Naming/FileName:
128
+ Exclude:
129
+ - 'lib/LittleWeasel.rb'
130
+
131
+ # Allow `downcase == ` instead of forcing `casecmp`
132
+ Performance/Casecmp:
133
+ Enabled: false
134
+
135
+ # Require children definitions to be nested or compact in classes and modules
136
+ Style/ClassAndModuleChildren:
137
+ Enabled: false
138
+
139
+ # Document classes and non-namespace modules.
140
+ # (Disabled for now, may revisit later)
141
+ Style/Documentation:
142
+ Enabled: false
143
+
144
+ # Checks the formatting of empty method definitions.
145
+ Style/EmptyMethod:
146
+ EnforcedStyle: expanded
147
+
148
+ # Add the frozen_string_literal comment to the top of files to help transition
149
+ # to frozen string literals by default.
150
+ Style/FrozenStringLiteralComment:
151
+ EnforcedStyle: always
152
+
153
+ # Check for conditionals that can be replaced with guard clauses
154
+ Style/GuardClause:
155
+ Enabled: false
156
+
157
+ Style/MixinUsage:
158
+ Exclude:
159
+ - 'RakeFile'
160
+
161
+ # Avoid multi-line method signatures.
162
+ Style/MultilineMethodSignature:
163
+ Enabled: true
164
+
165
+ # Don't use option hashes when you can use keyword arguments.
166
+ Style/OptionHash:
167
+ Enabled: true
168
+
169
+ # Use return instead of return nil.
170
+ Style/ReturnNil:
171
+ Enabled: true
172
+
173
+ # Allow code like `return x, y` as it's occasionally handy.
174
+ Style/RedundantReturn:
175
+ AllowMultipleReturnValues: true
176
+
177
+ # Prefer symbols instead of strings as hash keys.
178
+ Style/StringHashKeys:
179
+ Enabled: true
180
+
181
+ # Checks if configured preferred methods are used over non-preferred.
182
+ Style/StringMethods:
183
+ Enabled: true
184
+
185
+ # Checks for use of parentheses around ternary conditions.
186
+ Style/TernaryParentheses:
187
+ EnforcedStyle: require_parentheses_when_complex
data/.ruby-version CHANGED
@@ -1 +1 @@
1
- 2.0.0-p481
1
+ 3.0.1
data/.yardopts ADDED
@@ -0,0 +1,2 @@
1
+ --protected
2
+ --private
data/Gemfile CHANGED
@@ -1,4 +1,6 @@
1
+ # frozen_string_literal: true
2
+
1
3
  source 'https://rubygems.org'
2
4
 
3
5
  # Specify your gem's dependencies in LittleWeasel.gemspec
4
- gemspec
6
+ gemspec
data/LittleWeasel.gemspec CHANGED
@@ -1,28 +1,41 @@
1
- # coding: utf-8
2
- lib = File.expand_path('../lib', __FILE__)
1
+ # frozen_string_literal: true
2
+
3
+ lib = File.expand_path('lib', __dir__)
3
4
  $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
5
  require 'LittleWeasel/version'
5
6
 
6
7
  Gem::Specification.new do |spec|
7
- spec.name = "LittleWeasel"
8
+ spec.name = 'LittleWeasel'
8
9
  spec.version = LittleWeasel::VERSION
9
- spec.authors = ["Gene M. Angelo, Jr."]
10
- spec.email = ["public.gma@gmail.com"]
11
- spec.description = %q{Simple spellchecker for single, or multiple word blocks.}
12
- spec.summary = %q{Simply checks a word or group of words for validity against an english dictionary file.}
13
- spec.homepage = "http://www.geneangelo.com"
14
- spec.license = "MIT"
10
+ spec.authors = ['Gene M. Angelo, Jr.']
11
+ spec.email = ['public.gma@gmail.com']
12
+ spec.description = 'Simple spellchecker for single, or multiple word blocks.'
13
+ spec.summary = 'Simply checks a word or group of words for validity against an english dictionary file.'
14
+ spec.homepage = 'http://www.geneangelo.com'
15
+ spec.license = 'MIT'
15
16
 
16
- spec.files = `git ls-files`.split($/)
17
+ spec.files = `git ls-files`.split($INPUT_RECORD_SEPARATOR)
17
18
  spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
18
19
  spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
19
- spec.require_paths = ["lib"]
20
+ spec.require_paths = ['lib']
20
21
 
21
- spec.required_ruby_version = '~> 2.0'
22
- spec.add_runtime_dependency 'activesupport', '~> 4.1', '>= 4.1.1'
23
- spec.add_development_dependency "bundler", "~> 1.3"
24
- spec.add_development_dependency "rake", '~> 0'
25
- spec.add_development_dependency "rspec", '~> 3.0', '>= 3.0.0'
26
- spec.add_development_dependency "yard", "0.8.6.1"
27
- spec.add_development_dependency "redcarpet", '~> 2.3', '>= 2.3.0'
22
+ spec.required_ruby_version = '~> 3.0', '>= 3.0.1'
23
+ spec.add_runtime_dependency 'activesupport', '~> 6.1', '>= 6.1.3.2'
24
+ spec.add_development_dependency 'benchmark-ips', '~> 2.3'
25
+ spec.add_development_dependency 'bundler', '~> 2.2', '>= 2.2.17'
26
+ spec.add_development_dependency 'factory_bot', '~> 6.2'
27
+ spec.add_development_dependency 'pry-byebug', '~> 3.9'
28
+ spec.add_development_dependency 'rake', '~> 0'
29
+ spec.add_development_dependency 'redcarpet', '~> 3.5', '>= 3.5.1'
30
+ spec.add_development_dependency 'reek', '~> 6.0', '>= 6.0.4'
31
+ spec.add_development_dependency 'rspec', '~> 3.10'
32
+ # This version of rubocop is returning errors.
33
+ # spec.add_development_dependency 'rubocop', '~> 1.14'
34
+ spec.add_development_dependency 'rubocop', '~> 1.9.1'
35
+ spec.add_development_dependency 'rubocop-performance', '~> 1.11', '>= 1.11.3'
36
+ spec.add_development_dependency 'rubocop-rspec', '~> 2.3'
37
+ spec.add_development_dependency 'simplecov', '~> 0.21.2'
38
+ # Needed for yard
39
+ spec.add_development_dependency 'webrick', '~> 1.7'
40
+ spec.add_development_dependency 'yard', '~> 0.9.26'
28
41
  end
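Note on the version constraints: the gemspec above now requires Ruby `'~> 3.0', '>= 3.0.1'` and pins rubocop to `'~> 1.9.1'`. The sketch below uses RubyGems' own `Gem::Requirement` to show which versions such constraints admit. The `allowed?` helper and the sample version numbers are assumptions chosen for illustration; they are not part of the gem.

```ruby
require 'rubygems'

# Hypothetical helper, for illustration only: true when +version+ satisfies
# every constraint string in +constraints+.
def allowed?(constraints, version)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

# Constraints copied from the gemspec diff above.
ruby_constraint    = ['~> 3.0', '>= 3.0.1']
rubocop_constraint = ['~> 1.9.1']

# Sample versions are arbitrary and only probe the boundaries.
%w[3.0.0 3.0.1 3.1.2 4.0.0].each do |version|
  puts "ruby #{version}: #{allowed?(ruby_constraint, version)}"
end

%w[1.9.1 1.9.5 1.10.0 1.14.0].each do |version|
  puts "rubocop #{version}: #{allowed?(rubocop_constraint, version)}"
end
```

Because `~> 3.0` allows anything from 3.0 up to (but not including) 4.0, the extra `>= 3.0.1` only excludes 3.0.0. Likewise, `~> 1.9.1` keeps rubocop within the 1.9.x series, which is consistent with the commented-out `'~> 1.14'` line.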
data/README.md CHANGED
@@ -1,42 +1,408 @@
1
- # LittleWeasel
2
-
3
- Simple spell checker using an english dictionary.
4
- Forked from https://github.com/stewartmckee/spell_checker
5
-
6
- ## Installation
7
-
8
- Add this line to your application's Gemfile:
9
-
10
- gem 'LittleWeasel'
11
-
12
- And then execute:
13
-
14
- $ bundle
15
-
16
- Or install it yourself as:
17
-
18
- $ gem install LittleWeasel
19
-
20
- ## Usage
21
-
22
- require 'LittleWeasel'
23
-
24
- LittleWeasel::Checker.instance.exists?('word', options|nil) # true if exists in the dictionary, false otherwise.
25
-
26
- LittleWeasel::Checker.instance.exists?('Multiple words', options|nil) # true if exists in the dictionary, false otherwise.
27
-
28
- ## Contributing
29
-
30
- Not taking contributions just yet.
31
-
32
- ## License
33
-
34
- (The MIT License)
35
-
36
- Copyright © 2013-2014 Gene M. Angelo, Jr.
37
-
38
- Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
39
-
40
- The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
41
-
42
- THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
1
+ [![GitHub version](http://badge.fury.io/gh/gangelo%2FLittleWeasel.svg)](https://badge.fury.io/gh/gangelo%2FLittleWeasel)
2
+ [![Gem Version](https://badge.fury.io/rb/LittleWeasel.svg)](https://badge.fury.io/rb/LittleWeasel)
3
+
4
+ [![](http://ruby-gem-downloads-badge.herokuapp.com/LittleWeasel?type=total)](http://www.rubydoc.info/gems/LittleWeasel/)
5
+ [![Documentation](http://img.shields.io/badge/docs-rdoc.info-blue.svg)](http://www.rubydoc.info/gems/LittleWeasel/)
6
+
7
+ [![Report Issues](https://img.shields.io/badge/report-issues-red.svg)](https://github.com/gangelo/simple_command_dispatcher/issues)
8
+
9
+ [![License](http://img.shields.io/badge/license-MIT-yellowgreen.svg)](#license)
10
+
11
+ # LittleWeasel
12
+
13
+ ## Table of Contents
14
+ - [About LittleWeasel](#about-littleweasel)
15
+ * [Usage](#usage)
16
+ * [Creating Dictionaries](#creating-dictionaries)
17
+ + [Creating a Dictionary from Memory](#creating-a-dictionary-from-memory)
18
+ + [Creating a Dictionary from a File on Disk](#creating-a-dictionary-from-a-file-on-disk)
19
+ + [Basic Word Search Example](#basic-word-search-example)
20
+ - [Using the Dictionary#word_results API](#using-the-dictionary-word-results-api)
21
+ - [Using the Dictionary#block_results API](#using-the-dictionary-block-results-api)
22
+ + [Word Search using Word Filters and Word Preprocessors Example](#word-search-using-word-filters-and-word-preprocessors-example)
23
+ + [Word filters and word preprocessors working together example...](#word-filters-and-word-preprocessors-working-together-example)
24
+ * [Rake Tasks](#rake-tasks)
25
+ * [Installation](#installation)
26
+ * [Contributing](#contributing)
27
+ * [License](#license)
28
+
29
+ # About LittleWeasel
30
+
31
+ **LittleWeasel** is _more_ than just a spell checker for words (and word blocks, i.e. groups of words); LittleWeasel provides information about a particular word or words through its API. LittleWeasel allows you to apply preprocessing to words through any number of word preprocessors _before_ they are checked against the dictionary(ies) you provide. In addition to this, you may provide any number of word filters that allow you to consider the validity of each word being checked, regardless of whether or not it's literally found in the dictionary. LittleWeasel will tell you exactly what word preprocessors were applied to a given word, even showing you the transformation of the original word as it passes through each preprocessor; it will also inform you of each matching word filter along the way, so you can make a decision about every word being validated.
32
+
33
+ LittleWeasel provides other features as well:
34
+
35
+ * LittleWeasel allows you to provide any number of "dictionaries" which may be in the form of a collection of words in a file on disk _or_ an Array of words you provide, so that dictionaries may be created _dynamically_.
36
+ * Dictionaries are identified by a unique "dictionary key"; that is, a key based on locale (<language>-<REGION>, e.g. en-US) and/or optional "tag" (en-US-<tag>, e.g. en-US-slang).
37
+ * Dictionaries created from files on disk are cached; their words and metadata are shared across dictionary instances that share the same dictionary key.
38
+ * Dictionaries can have observable, metadata objects attached to them which are notified when a word or word block is being evaluated; therefore, metadata about the dictionary, words, etc. can be gathered and used. For example, LittleWeasel uses a LittleWeasel::Metadata::InvalidWordsMetadata metadata object that caches and keeps track of the total bytes of invalid words searched against the dictionary. If the total bytes of invalid words exceeds what is set in the configuration, caching of invalid words ceases. You can create your own metadata objects to gather and use your own metadata.
39
+
40
+ ## Usage
41
+
42
+ At its most basic level, there are three (3) steps to using LittleWeasel:
43
+ 1. Create a **LittleWeasel::Dictionary** object.
44
+ 2. Consume the **LittleWeasel::Dictionary#word_results** and/or **LittleWeasel::Dictionary#block_results** APIs to obtain a **LittleWeasel::WordResults** [^1] object for a particular word or word block.
45
+ 3. Interrogate the **LittleWeasel::WordResults** [^1] object returned from either of the aforementioned APIs.
46
+
47
+ Some of the more advanced LittleWeasel features include the use of **word preprocessors**, **word filters** and **dictionary metadata modules**; for these, read on.
48
+
49
+ [^1]: The _LittleWeasel::WordResults_ object returned from these APIs provides information related to the given word or words that have passed through their respective processes (i.e. word preprocessing, word filtering and dictionary checks).
50
+
51
+ ## Creating Dictionaries
52
+
53
+ ### Creating a Dictionary from Memory
54
+ ```ruby
55
+ # Create a Dictionary Manager.
56
+ dictionary_manager = LittleWeasel::DictionaryManager.new
57
+
58
+ # Create our unique key for the dictionary.
59
+ en_us_names_key = LittleWeasel::DictionaryKey.new(language: :en, region: :us, tag: :names)
60
+
61
+ # Create a dictionary of names from memory.
62
+ en_us_names_dictionary = dictionary_manager.create_dictionary_from_memory(
63
+ dictionary_key: en_us_names_key, dictionary_words: %w(Abel Bartholomew Cain Deborah Elijah))
64
+ ```
65
+
66
+ ### Creating a Dictionary from a File on Disk
67
+ ```ruby
68
+ # Create a Dictionary Manager.
69
+ dictionary_manager = LittleWeasel::DictionaryManager.new
70
+
71
+ # Create our unique key for the dictionary.
72
+ en_us_key = LittleWeasel::DictionaryKey.new(language: :en, region: :us)
73
+
74
+ # Create a dictionary from a file on disk. The below assumes the
75
+ # dictionary file name matches the dictionary key (e.g. en-US).
76
+ en_us_dictionary = dictionary_manager.create_dictionary_from_file(
77
+ dictionary_key: en_us_key, file: "dictionaries/#{en_us_key}.txt")
78
+ ```
79
+
80
+ ### Basic Word Search Example
81
+
82
+ #### Using the Dictionary#word_results API
83
+
84
+ Continued from [Creating a Dictionary from Memory](#creating-a-dictionary-from-memory) example.
85
+
86
+ ```ruby
87
+ # Get word results for 'Abel'. true is returned because 'Abel' is found in the dictionary.
88
+ en_us_names_dictionary.word_results('Abel').word_valid?
89
+ #=> true
90
+
91
+ # Get word results for 'elijah'. false is returned because while 'Elijah' is found in the dictionary, 'elijah' is NOT (case sensitive).
92
+ en_us_names_dictionary.word_results('elijah').word_valid?
93
+ #=> false
94
+ ```
95
+
96
+ #### Using the Dictionary#block_results API
97
+
98
+ Continued from [Creating a Dictionary from a File on Disk](#creating-a-dictionary-from-a-file-on-disk) example.
99
+
100
+ ```ruby
101
+ word_block = "This is a word-block of 8 words and 2 numbers."
102
+
103
+ # Add a word filter so that numbers are considered valid.
104
+ en_us_dictionary.add_filters word_filters: [
105
+ LittleWeasel::Filters::EnUs::NumericFilter.new
106
+ ]
107
+
108
+ block_results = en_us_dictionary.block_results word_block
109
+ # Returns a LittleWeasel::BlockResults object.
110
+ # The below is formatted for readability...
111
+ # Results of calling #block_results with:
112
+ # "This is a word-block of 8 words and 2 numbers."...
113
+ block_results #=>
114
+ preprocessed_words_or_original_words #=>
115
+ ["This", "is", "a", "word-block", "of", "8", "words", "and", "2", "numbers"]
116
+
117
+ word_results[0] #=>
118
+ # The word before any word preprocessors have been applied.
119
+ original_word #=> "This"
120
+
121
+ # The word after all word preprocessors have been applied against
122
+ # the word.
123
+ preprocessed_word #=> nil
124
+
125
+ # Indicates whether or not the word was found in the literal
126
+ # dictionary (#word_valid?) OR if the word (after word preprocessing)
127
+ # was matched against a word filter (#filter_match?).
128
+ success? #=> false
129
+
130
+ # Indicates whether or not word (after word preprocessing) was found
131
+ # in the literal dictionary.
132
+ word_valid? #=> false
133
+
134
+ # Indicates whether or not the word is cached, either as a word found
135
+ # in the literal dictionary OR as an invalid word. The latter will
136
+ # only take place if LittleWeasel::Configuration#max_invalid_words_bytesize
137
+ # is greater than 0.
138
+ word_cached? #=> false
139
+
140
+ # Indicates whether or not #preprocessed_word is present due to
141
+ # word having passed through one or more word preprocessors. This
142
+ # will only return true if word preprocessors are available to the
143
+ # dictionary, turned on
144
+ # (LittleWeasel::Preprocessors::WordPreprocessor#preprocessor_on?)
145
+ # AND the word meets the criteria for word preprocessing for one or
146
+ # more word preprocessors (LittleWeasel::Preprocessors::WordPreprocessor#preprocess?).
147
+ preprocessed_word? #=> false
148
+
149
+ # Returns #preprocessed_word if word preprocessing has been applied
150
+ # or original_word if word preprocessing has NOT been applied.
151
+ preprocessed_word_or_original_word #=> "This"
152
+
153
+ # Indicates whether or not word has been matched by at least 1
154
+ # word filter.
155
+ filter_match? #=> false
156
+
157
+ # Indicates the word filters that were matched against
158
+ # word (LittleWeasel::Filters::WordFilter#filter_match?). If
159
+ # word did not match any word filters, an empty Array is returned.
160
+ filters_matched #=> []
161
+
162
+ # Indicates the word preprocessors that were applied against
163
+ # word. If no word preprocessors were applied to word, an empty
164
+ # Array is returned.
165
+ preprocessed_words #=> []
166
+
167
+   word_results[1] #=>
+     original_word #=> "is"
+     preprocessed_word #=> nil
+     success? #=> true
+     word_valid? #=> true
+     word_cached? #=> true
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "is"
+     filter_match? #=> false
+     filters_matched #=> []
+     preprocessed_words #=> []
+
+   word_results[2] #=>
+     original_word #=> "a"
+     preprocessed_word #=> nil
+     success? #=> true
+     word_valid? #=> true
+     word_cached? #=> true
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "a"
+     filter_match? #=> false
+     filters_matched #=> []
+     preprocessed_words #=> []
+
+   word_results[3] #=>
+     original_word #=> "word-block"
+     preprocessed_word #=> nil
+     success? #=> false
+     word_valid? #=> false
+     word_cached? #=> false
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "word-block"
+     filter_match? #=> false
+     filters_matched #=> []
+     preprocessed_words #=> []
+
+   word_results[4] #=>
+     original_word #=> "of"
+     preprocessed_word #=> nil
+     success? #=> false
+     word_valid? #=> false
+     word_cached? #=> false
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "of"
+     filter_match? #=> false
+     filters_matched #=> []
+     preprocessed_words #=> []
+
+   word_results[5] #=>
+     original_word #=> "8"
+     preprocessed_word #=> nil
+     success? #=> true
+     word_valid? #=> false
+     word_cached? #=> false
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "8"
+     filter_match? #=> true
+     filters_matched #=> [:numeric_filter]
+     preprocessed_words #=> []
+
+   word_results[6] #=>
+     original_word #=> "words"
+     preprocessed_word #=> nil
+     success? #=> true
+     word_valid? #=> true
+     word_cached? #=> true
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "words"
+     filter_match? #=> false
+     filters_matched #=> []
+     preprocessed_words #=> []
+
+   word_results[7] #=>
+     original_word #=> "and"
+     preprocessed_word #=> nil
+     success? #=> true
+     word_valid? #=> true
+     word_cached? #=> true
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "and"
+     filter_match? #=> false
+     filters_matched #=> []
+     preprocessed_words #=> []
+
+   word_results[8] #=>
+     original_word #=> "2"
+     preprocessed_word #=> nil
+     success? #=> true
+     word_valid? #=> true
+     word_cached? #=> true
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "2"
+     filter_match? #=> true
+     filters_matched #=> [:numeric_filter]
+     preprocessed_words #=> []
+
+   word_results[9] #=>
+     original_word #=> "numbers"
+     preprocessed_word #=> nil
+     success? #=> false
+     word_valid? #=> false
+     word_cached? #=> false
+     preprocessed_word? #=> false
+     preprocessed_word_or_original_word #=> "numbers"
+     filter_match? #=> false
+     filters_matched #=> []
+     preprocessed_words #=> []
+ ```
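The per-word attributes listed above lend themselves to a quick summary of which words passed and why. The following is a minimal sketch that does not use the gem: `WordResult` is a hypothetical stand-in `Struct` that models only a few of the attributes shown above, and the sample values are copied from three entries of the listing.

```ruby
# Hypothetical stand-in for one per-word result; only the attributes used
# below are modeled, and this is NOT LittleWeasel's own class.
WordResult = Struct.new(:original_word, :word_valid, :filters_matched, keyword_init: true) do
  def word_valid?
    word_valid
  end

  def filter_match?
    !filters_matched.empty?
  end

  # Mirrors the rule described in the listing: dictionary hit OR filter match.
  def success?
    word_valid? || filter_match?
  end
end

word_results = [
  WordResult.new(original_word: 'is',         word_valid: true,  filters_matched: []),
  WordResult.new(original_word: 'word-block', word_valid: false, filters_matched: []),
  WordResult.new(original_word: '8',          word_valid: false, filters_matched: [:numeric_filter])
]

# Split the results into words that passed and words that failed.
passed, failed = word_results.partition(&:success?)

passed_words = passed.map(&:original_word)
failed_words = failed.map(&:original_word)

# Words that passed only because a word filter matched them.
filter_only = passed.reject(&:word_valid?).map(&:original_word)
```

With real results, the same `partition` and `reject` calls should work on any collection of objects that respond to `success?`, `word_valid?` and `original_word`.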
+
+ ### Word Search using Word Filters and Word Preprocessors Example
+
+ Continued from the [Creating a Dictionary from Memory](#creating-a-dictionary-from-memory) example.
+
+ Note: The use of _word filters_ and _word preprocessors_ below applies equally to both the
+ **Dictionary#word_results** and **Dictionary#block_results** APIs.
+
+ ```ruby
+ # Set up any word preprocessors and/or word filters we want to use...
+
+ # Word preprocessors transform words before they are passed through any word filters
+ # and before they are checked against the literal dictionary words.
+ en_us_names_dictionary.add_preprocessors word_preprocessors: [LittleWeasel::Preprocessors::EnUs::CapitalizePreprocessor.new]
+
+ # Word filters check words for validity before they are checked against the literal dictionary.
+ # In other words, word filters allow you to consider words valid (if they match the filter) despite not being found in the literal dictionary.
+ en_us_names_dictionary.add_filters word_filters: [LittleWeasel::Filters::EnUs::SingleCharacterWordFilter.new]
+
+ # Check some words against the dictionary; a LittleWeasel::WordResults object is returned...
+
+ # Try to find a name...
+ word_results = en_us_names_dictionary.word_results 'elijah'
+
+ # Returns true if the word is found in the literal dictionary (#word_valid?) or
+ # if the word matched any word filters (#filter_match?). true is returned because
+ # 'elijah' ('Elijah' after preprocessing) was found in the literal dictionary,
+ # despite not having matched any word filters.
+ word_results.success?
+ #=> true
+
+ # Returns true because 'elijah' ('Elijah' after preprocessing) was found in the
+ # literal dictionary.
+ word_results.word_valid?
+ #=> true
+
+ # Returns true because the word had word preprocessing applied to it.
+ word_results.preprocessed_word?
+ #=> true
+
+ # Returns false because the word (after preprocessing) did not match any word filters.
+ word_results.filter_match?
+ #=> false
+
+ # Returns the original word, before any word preprocessors were applied.
+ word_results.original_word
+ #=> 'elijah'
+
+ # The resulting word after all word preprocessors have been applied.
+ word_results.preprocessed_word
+ #=> 'Elijah'
+ ```
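To make the idea of a word preprocessor more concrete, here is a small standalone sketch. It does not use the gem: `ToyCapitalizePreprocessor` and its `call` method are hypothetical, and the on/off switch and the "does this word qualify" check are only loosely modeled on the `#preprocessor_on?` and `#preprocess?` names mentioned earlier in this README. The qualifying criterion (the word is not already capitalized) is an assumption made for illustration.

```ruby
# Hypothetical preprocessor; it is not LittleWeasel's implementation.
class ToyCapitalizePreprocessor
  def initialize(on: true)
    @on = on
  end

  def preprocessor_on?
    @on
  end

  # Assumed criterion: only words that are not already capitalized qualify.
  def preprocess?(word)
    word != word.capitalize
  end

  # Returns [preprocessed_word, applied?]. preprocessed_word stays nil when
  # the preprocessor is off or the word does not qualify.
  def call(word)
    return [nil, false] unless preprocessor_on? && preprocess?(word)

    [word.capitalize, true]
  end
end

preprocessor = ToyCapitalizePreprocessor.new

lowercase_result   = preprocessor.call('elijah')
capitalized_result = preprocessor.call('Elijah')
disabled_result    = ToyCapitalizePreprocessor.new(on: false).call('elijah')
```

In this toy model a `nil` preprocessed word corresponds to the `preprocessed_word #=> nil` and `preprocessed_word? #=> false` pairs seen in the results above.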
+
+ ### Word Filters and Word Preprocessors Working Together Example
+
+ ```ruby
+ # This builds on the preceding example...
+
+ # Search for a word that does not exist in the literal dictionary, but matches a filter...
+
+ # Because of the LittleWeasel::Filters::EnUs::SingleCharacterWordFilter word filter,
+ # "i" ("I" after word preprocessing) will be considered valid, even though it's not
+ # literally found in the dictionary.
+ word_results = en_us_names_dictionary.word_results 'i'
+
+ # true is returned because 'i' ('I' after preprocessing) was matched against the
+ # LittleWeasel::Filters::EnUs::SingleCharacterWordFilter word filter, despite not
+ # having been found in the literal dictionary.
+ word_results.success?
+ #=> true
+
+ # Returns false because 'i' ('I' after preprocessing) was not found in the
+ # literal dictionary.
+ word_results.word_valid?
+ #=> false
+
+ # Returns true because 'i' ('I' after preprocessing) had word preprocessing applied to it.
+ word_results.preprocessed_word?
+ #=> true
+
+ # Returns true because 'i' ('I' after preprocessing) matched the LittleWeasel::Filters::EnUs::SingleCharacterWordFilter word filter.
+ word_results.filter_match?
+ #=> true
+
+ # Returns the original word, before any word preprocessors were applied.
+ word_results.original_word
+ #=> 'i'
+
+ # The resulting word after all word preprocessors have been applied.
+ word_results.preprocessed_word
+ #=> 'I'
+ ```
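The order of operations in the two examples above (preprocess first, then check word filters and the literal dictionary) can be modeled in a few lines of plain Ruby. This is a toy sketch under stated assumptions, not the gem's implementation: `DICTIONARY`, `preprocess`, `single_character?` and `lookup` are hypothetical names, the names in the dictionary are made up, and the filter is simplified to "any single-character word matches".

```ruby
require 'set'

# Hypothetical in-memory dictionary of names.
DICTIONARY = Set.new(%w[Abel Elijah Zoe]).freeze

# Stand-in preprocessor: capitalize the word.
def preprocess(word)
  word.capitalize
end

# Stand-in word filter (simplified assumption): any single-character
# word is considered a match.
def single_character?(word)
  word.length == 1
end

# Preprocess the word, then evaluate the filter and the literal dictionary
# against the preprocessed word. Success means either check passed.
def lookup(word)
  preprocessed = preprocess(word)
  filter_match = single_character?(preprocessed)
  word_valid   = DICTIONARY.include?(preprocessed)

  {
    original_word: word,
    preprocessed_word: preprocessed,
    filter_match: filter_match,
    word_valid: word_valid,
    success: filter_match || word_valid
  }
end

elijah  = lookup('elijah') # valid via the dictionary
i       = lookup('i')      # valid via the filter only
unknown = lookup('qwerty') # neither check passes
```

Keeping `filter_match` and `word_valid` as separate flags, rather than collapsing them into one boolean, is what lets a caller tell a dictionary hit from a filter match, as the examples above do.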
+
+ ## Rake Tasks
+
+ Below are some rake tasks that can be used as examples:
+
+ Rake Task | Description
+ ---------- | -------------
+ word_results:basic | Creates a **LittleWeasel::Dictionary** from a file source (file on disk) and calls the **LittleWeasel::Dictionary#word_results** API.
+ word_results:from_memory | Creates a **LittleWeasel::Dictionary** from a memory source (Array of words) and calls the **LittleWeasel::Dictionary#word_results** API.
+ word_results:advanced | Creates a **LittleWeasel::Dictionary** from a memory source (Array of words) and calls the **LittleWeasel::Dictionary#word_results** API. Demonstrates the use of **word preprocessors** and **word filters**.
+ word_results:word_filters | Creates a **LittleWeasel::Dictionary** from a memory source (Array of words) and calls the **LittleWeasel::Dictionary#word_results** API. Demonstrates some of the **word filters** that come prepackaged with LittleWeasel (**LittleWeasel::Filters::EnUs::NumericFilter**, **LittleWeasel::Filters::EnUs::CurrencyFilter** and **LittleWeasel::Filters::EnUs::SingleCharacterWordFilter**).
+ block_results:basic | Creates a **LittleWeasel::Dictionary** from a file source (file on disk) and calls the **LittleWeasel::Dictionary#block_results** API. Demonstrates how to use the **#block_results** API.
+
+ ## Installation
+
+ Add this line to your application's Gemfile:
+
+     gem 'LittleWeasel'
+
+ And then execute:
+
+     $ bundle
+
+ Or install it yourself as:
+
+     $ gem install LittleWeasel
+
+ ## Contributing
+
+ Not taking contributions just yet.
+
+ ## License
+
+ (The MIT License)
+
+ Copyright © 2013-2021 Gene M. Angelo, Jr.
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the ‘Software’), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.