words_counted 0.1.5 → 1.0.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
- SHA1:
- metadata.gz: cba04e2004b13b0ee7b99e46cdf6549f6aebe2f6
- data.tar.gz: 885d494f7f2b2af40f59ed08aaca1db7ec89a54b
+ SHA256:
+ metadata.gz: a248654f9f76e28bde0f54993a5c5c87504acffed42b1531acc9de7f385f0696
+ data.tar.gz: c057a7ecb20d7989651b6667f39d16820734e63dd751a0182406f268ecf0f347
  SHA512:
- metadata.gz: e2009cd4b401da2b43047699a073a3f541654384d831d73c0d436016eb88325e29c179a59961c6d1d8d48a865f34a2da78e014a28a5e0cf4ccf714cafa7a6bb5
- data.tar.gz: f46e0031db714c0985ef4b2dee5d1f294c9ab0bdb629157110af0b26b76280bfe440207b4f6920156681cc91ded0246e3e66b6dcf26717208cc73ebbe4e86821
+ metadata.gz: 2c4a5028624393434586c7570e8a6c98785c6cedfc3a6f5c07b7fa9b8aba2880ddf847be8779f623df8e36becb8e148aeaabfae822dcc4f0c9b1db414f8c7916
+ data.tar.gz: e115d757c34480e9e7425db94f6c78a035b4464c69946aa31cbb45ea28f963dc1088a1617269b506001669753b2725abf4f0b708303ced59aa5c59cb1658096c
data/.gitignore CHANGED
@@ -15,3 +15,4 @@ spec/reports
  test/tmp
  test/version_tmp
  tmp
+ .idea/
data/.hound.yml ADDED
@@ -0,0 +1,2 @@
+ ruby:
+   config_file: .ruby-style.yml
data/.ruby-style.yml ADDED
@@ -0,0 +1,2 @@
+ Metrics/LineLength:
+   Max: 120
data/.ruby-version ADDED
@@ -0,0 +1 @@
+ 3.0.1
data/.travis.yml ADDED
@@ -0,0 +1,9 @@
+ language: ruby
+
+ rvm:
+   - 3.0.0
+   - 3.0.1
+   - ruby-head
+
+ gemfile:
+   - Gemfile
data/.yardopts CHANGED
@@ -1,3 +1,4 @@
- --title 'Word Counter for Ruby'
+ --title 'Ruby natural language processor'
  --private
- --markup markdown
+ --markup markdown
+ --hide-api private
data/CHANGELOG.md CHANGED
@@ -1,3 +1,32 @@
+ ## Version 1.0.3
+
+ 1. Adds support for Ruby 3.0.0.
+ 2. Improves documentation and adds newer configs for Travis CI and Hound.
+
+ ## Version 1.0
+
+ This version brings lots of improvements to code organisation. The tokeniser has been extracted into its own class. All methods in `Counter` have either been renamed or deprecated. Deprecated methods and their tests have been moved into their own modules. Using them will trigger warnings with the upgrade instructions outlined below.
+
+ 1. Extracted tokenisation behaviour from `Counter` into a `Tokeniser` class.
+ 2. Deprecated all methods that have `word` in their name. Most are renamed such that `word` became `token`. They will be removed in version 1.1.
+     - Deprecated `word_count` in favour of `token_count`
+     - Deprecated `unique_word_count` in favour of `unique_token_count`
+     - Deprecated `word_occurrences` and `sorted_word_occurrences` in favour of `token_frequency`
+     - Deprecated `word_lengths` and `sorted_word_lengths` in favour of `token_lengths`
+     - Deprecated `word_density` in favour of `token_density`
+     - Deprecated `most_occurring_words` in favour of `most_frequent_tokens`
+     - Deprecated `longest_words` in favour of `longest_tokens`
+     - Deprecated `average_chars_per_word` in favour of `average_chars_per_token`
+     - Deprecated `count`. Use `Array#count` instead.
+ 3. `token_lengths`, which replaces `word_lengths`, returns a sorted two-dimensional array instead of a hash. It behaves exactly like `sorted_word_lengths`, which has been deprecated. Use `token_lengths.to_h` for the old behaviour.
+ 4. `token_frequency`, which replaces `word_occurrences`, returns a sorted two-dimensional array instead of a hash. It behaves like `sorted_word_occurrences`, which has been deprecated. Use `token_frequency.to_h` for the old behaviour.
+ 5. `token_density`, which replaces `word_density`, returns a decimal with a precision of 2, not a percent. Use `token_density * 100` for the old behaviour.
+ 6. Added a refinement to Hash under `lib/refinements/hash_refinements.rb` to quickly sort by descending value.
+ 7. Extracted all deprecated methods into their own module, and their tests into their own spec file.
+ 8. Added a base `words_counted_spec.rb` and moved the `.from_file` test to the new file.
+ 9. Added Travis continuous integration.
+ 10. Added documentation to the code.
+
  ## Version 0.1.5

  1. Removed `to_f` from the dividend in `average_chars_per_word` and `word_densities`. The divisor is a float, and dividing by a float returns a float.
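The 1.0 upgrade notes above are mostly a change of shape: hashes become sorted `[token, value]` pairs, and percentages become decimals. A plain-Ruby sketch of the migration, using made-up sample data rather than the gem itself:

```ruby
# `token_lengths` (formerly `word_lengths`) now returns a sorted
# two-dimensional array; `to_h` recovers the old hash shape.
token_lengths = [["looking", 7], ["gutter", 6], ["stars", 5]]
word_lengths  = token_lengths.to_h
# => {"looking"=>7, "gutter"=>6, "stars"=>5}

# `token_density` (formerly `word_density`) now returns decimals
# rounded to two places; multiply each pair's value by 100 to get
# the old percentage-style figures back.
token_density = [["are", 0.13], ["the", 0.13], ["we", 0.07]]
word_density  = token_density.map { |token, d| [token, (d * 100).round(2)] }
# => [["are", 13.0], ["the", 13.0], ["we", 7.0]]
```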
data/README.md CHANGED
@@ -1,36 +1,35 @@
  # WordsCounted

- WordsCounted is a highly customisable Ruby text analyser. Consult the features for more information.
+ > We are all in the gutter, but some of us are looking at the stars.
+ >
+ > -- Oscar Wilde
+
+ WordsCounted is a Ruby natural language processor that lets you implement powerful tokenisation strategies with a very flexible tokeniser class.
+
+ **Are you using WordsCounted to do something interesting?** Please [tell me about it][8].

  <a href="http://badge.fury.io/rb/words_counted">
  <img src="https://badge.fury.io/rb/words_counted@2x.png" alt="Gem Version" height="18">
  </a>

+ [RubyDoc documentation][7].
+
  ### Demo

- Visit [the gem's website][4] for a demo.
+ Visit [this website][4] for one example of what you can do with WordsCounted.

  ### Features

- * Get the following data from any string or readable file:
- * Word count
- * Unique word count
- * Word density
- * Character count
- * Average characters per word
- * A hash map of words and the number of times they occur
- * A hash map of words and their lengths
- * The longest word(s) and its length
- * The most occurring word(s) and its number of occurrences.
- * Count invididual strings for occurrences.
- * A flexible way to exclude words (or anything) from the count. You can pass a **string**, a **regexp**, an **array**, or a **lambda**.
- * Customisable criteria. Pass your own regexp rules to split strings if you prefer. The default regexp has two features:
- * Filters special characters but respects hyphens and apostrophes.
- * Plays nicely with diacritics (UTF and unicode characters): "São Paulo" is treated as `["São", "Paulo"]` and not `["S", "", "o", "Paulo"]`.
+ * Out of the box, get the following data from any string, readable file, or URL:
+     * Token count and unique token count
+     * Token densities, frequencies, and lengths
+     * Char count and average chars per token
+     * The longest tokens and their lengths
+     * The most frequent tokens and their frequencies
+ * A flexible way to exclude tokens from the tokeniser. You can pass a **string**, **regexp**, **symbol**, **lambda**, or an **array** of any combination of those types for powerful tokenisation strategies.
+ * Pass your own regexp rules to the tokeniser if you prefer. The default regexp filters special characters but keeps hyphens and apostrophes. It also plays nicely with diacritics (UTF and unicode characters): *Bayrūt* is treated as `["Bayrūt"]` and not `["Bayr", "ū", "t"]`, for example.
  * Opens and reads files. Pass in a file path or a url instead of a string.

- See usage instructions for more details.
-
  ## Installation

  Add this line to your application's Gemfile:
@@ -58,62 +57,70 @@ counter = WordsCounted.count(
  counter = WordsCounted.from_file("path/or/url/to/my/file.txt")
  ```

+ `.count` and `.from_file` are convenience methods that take an input, tokenise it, and return an instance of `WordsCounted::Counter` initialized with the tokens. The `WordsCounted::Tokeniser` and `WordsCounted::Counter` classes can also be used on their own, however.
+
  ## API

- ### Class methods
+ ### WordsCounted

- #### `count(string, options = {})`
+ **`WordsCounted.count(input, options = {})`**

- Initializes an analyser object.
+ Tokenises the input and initializes a `WordsCounted::Counter` object with the resulting tokens.

  ```ruby
  counter = WordsCounted.count("Hello Beirut!")
  ```

- Accepts two options: `exclude` and `regexp`. See [Excluding words from the analyser][5] and [Passing in a custom regexp][6] respectively.
+ Accepts two options: `exclude` and `regexp`. See [Excluding tokens from the analyser][5] and [Passing in a custom regexp][6] respectively.

- #### `from_file(path, options = {})`
+ **`WordsCounted.from_file(path, options = {})`**

- Initializes an analyser object from a file path.
+ Reads and tokenises a file, and initializes a `WordsCounted::Counter` object with the resulting tokens.

  ```ruby
- counter = WordsCounted.count("hello_beirut.txt")
+ counter = WordsCounted.from_file("hello_beirut.txt")
  ```

- Accepts the same options as `count()`.
+ Accepts the same options as `.count`.
+
+ ### Tokeniser

- ### Instance methods
+ The tokeniser allows you to tokenise text in a variety of ways. You can pass in your own rules for tokenisation, and apply a powerful filter with any combination of rules, as long as they can be boiled down to a lambda.

- #### `.word_count`
+ Out of the box, the tokeniser includes only alpha chars. Hyphenated tokens and tokens with apostrophes are considered a single token.

- Returns the word count of a given string. The word count includes only alpha characters. Hyphenated and words with apostrophes are considered a single word. You can pass in your own regular expression if this is not desired behaviour.
+ **`#tokenise([pattern: TOKEN_REGEXP, exclude: nil])`**

  ```ruby
- counter.word_count #=> 15
+ tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise
+
+ # With `exclude`
+ tokeniser = WordsCounted::Tokeniser.new("Hello Beirut!").tokenise(exclude: "hello")
+
+ # With `pattern`
+ tokeniser = WordsCounted::Tokeniser.new("I <3 Beirut!").tokenise(pattern: /[a-z]/i)
  ```

- #### `.word_occurrences`
+ See [Excluding tokens from the analyser][5] and [Passing in a custom regexp][6] for more information.

- Returns an unsorted hash map of words and their number of occurrences. Uppercase and lowercase words are counted as the same word.
+ ### Counter

- ```ruby
- counter.word_occurrences
+ The `WordsCounted::Counter` class allows you to collect various statistics from an array of tokens.

- {
- "we" => 1,
- "are" => 2,
- "all" => 1,
- # ...
- "stars" => 1
- }
+ **`#token_count`**
+
+ Returns the token count of a given string.
+
+ ```ruby
+ counter.token_count #=> 15
  ```

- #### `.sorted_word_occurrences`
+ **`#token_frequency`**

- Returns a two dimensional array of words and their number of occurrences sorted in descending order. Uppercase and lowercase words are counted as the same word.
+ Returns a sorted (unstable) two-dimensional array where each element is a token and its frequency. The array is sorted by frequency in descending order.

  ```ruby
- counter.sorted_word_occurrences
+ counter.token_frequency

  [
  ["the", 2],
@@ -124,38 +131,22 @@ counter.sorted_word_occurrences
  ]
  ```

- #### `.most_occurring_words`
-
- Returns a two dimensional array of the most occurring word and its number of occurrences. In case there is a tie all tied words are returned.
-
- ```ruby
- counter.most_occurring_words
-
- [ ["are", 2], ["the", 2] ]
- ```
-
- #### `.word_lengths`
+ **`#most_frequent_tokens`**

- Returns an unsorted hash of words and their lengths.
+ Returns a hash where each key-value pair is a token and its frequency.

  ```ruby
- counter.word_lengths
+ counter.most_frequent_tokens

- {
- "We" => 2,
- "are" => 3,
- "all" => 3,
- # ...
- "stars" => 5
- }
+ { "are" => 2, "the" => 2 }
  ```

- #### `.sorted_word_lengths`
+ **`#token_lengths`**

- Returns a two dimensional array of words and their lengths sorted in descending order.
+ Returns a sorted (unstable) two-dimensional array where each element contains a token and its length. The array is sorted by length in descending order.

  ```ruby
- counter.sorted_word_lengths
+ counter.token_lengths

  [
  ["looking", 7],
@@ -166,133 +157,121 @@ counter.sorted_word_lengths
  ]
  ```

- #### `.longest_word`
-
- Returns a two dimensional array of the longest word and its length. In case there is a tie all tied words are returned.
+ **`#longest_tokens`**

- ```ruby
- counter.longest_words
-
- [ ["looking", 7] ]
- ```
-
- #### `.words`
+ Returns a hash where each key-value pair is a token and its length.

- Returns an array of words resulting from the string passed into the initialize method.

  ```ruby
- counter.words
- #=> ["We", "are", "all", "in", "the", "gutter", "but", "some", "of", "us", "are", "looking", "at", "the", "stars"]
+ counter.longest_tokens
+
+ { "looking" => 7 }
  ```

- #### `.word_density([ precision = 2 ])`
+ **`#token_density([ precision: 2 ])`**

- Returns a two-dimensional array of words and their density to a precision of two. It accepts a precision argument which defaults to two.
+ Returns a sorted (unstable) two-dimensional array where each element contains a token and its density as a float, rounded to a precision of two. The array is sorted by density in descending order. It accepts a `precision` argument, which must be a float.

  ```ruby
- counter.word_density
+ counter.token_density

  [
- ["are", 13.33],
- ["the", 13.33],
- ["but", 6.67 ],
+ ["are", 0.13],
+ ["the", 0.13],
+ ["but", 0.07 ],
  # ...
- ["we", 6.67 ]
+ ["we", 0.07 ]
  ]
  ```

- #### `.char_count`
+ **`#char_count`**

- Returns the string's character count.
+ Returns the char count of tokens.

  ```ruby
- counter.char_count #=> 76
+ counter.char_count #=> 76
  ```

- #### `.average_chars_per_word([ precision = 2 ])`
+ **`#average_chars_per_token([ precision: 2 ])`**

- Returns the average character count per word. Accepts a precision argument which defaults to two.
+ Returns the average char count per token, rounded to two decimal places. Accepts a `precision` argument, which defaults to two and must be a float.

  ```ruby
- counter.average_chars_per_word #=> 4
+ counter.average_chars_per_token #=> 4
  ```

- #### `.unique_word_count`
+ **`#uniq_token_count`**

- Returns the count of unique words in the string. This is case insensitive.
+ Returns the number of unique tokens.

  ```ruby
- counter.unique_word_count #=> 13
+ counter.uniq_token_count #=> 13
  ```

- #### `.count(word)`
+ ## Excluding tokens from the tokeniser

- Counts the occurrence of a word in the string.
+ You can exclude anything you want from the input by passing the `exclude` option. The `exclude` option accepts a variety of filters and is extremely flexible.

- ```ruby
- counter.count("are") #=> 2
- ```
+ 1. A *space-delimited* string. The filter will normalise the string.
+ 2. A regular expression.
+ 3. A lambda.
+ 4. A symbol that names a predicate method. For example, `:odd?`.
+ 5. An array of any combination of the above.

- ## Excluding words from the analyser
-
- You can exclude anything you want from the string you want to analyse by passing in the `exclude` option. The exclude option accepts a variety of filters.
-
- 1. A *space-delimited* list of candidates. The filter will remove both uppercase and lowercase variants of the candidate when applicable. Useful for excluding *the*, *a*, and so on.
- 2. An array of string candidates. For example: `['a', 'the']`.
- 3. A regular expression.
- 4. A lambda.
-
- #### Using a string
  ```ruby
- WordsCounted.count(
- "Magnificent! That was magnificent, Trevor.", exclude: "was magnificent"
+ tokeniser =
+   WordsCounted::Tokeniser.new(
+     "Magnificent! That was magnificent, Trevor."
+   )
+
+ # Using a string
+ tokeniser.tokenise(exclude: "was magnificent")
+ # => ["that", "trevor"]
+
+ # Using a regular expression
+ tokeniser.tokenise(exclude: /trevor/)
+ # => ["magnificent", "that", "was", "magnificent"]
+
+ # Using a lambda
+ tokeniser.tokenise(exclude: ->(t) { t.length < 4 })
+ # => ["magnificent", "that", "magnificent", "trevor"]
+
+ # Using a symbol
+ tokeniser = WordsCounted::Tokeniser.new("Hello! محمد")
+ tokeniser.tokenise(exclude: :ascii_only?)
+ # => ["محمد"]
+
+ # Using an array
+ tokeniser = WordsCounted::Tokeniser.new(
+   "Hello! اسماءنا هي محمد، كارولينا، سامي، وداني"
  )
- counter.words
- #=> ["That", "Trevor"]
- ```
-
- #### Using an array
- ```ruby
- WordsCounted.count("1 2 3 4 5 6", regexp: /[0-9]/, exclude: ['1', '2', '3'])
- counter.words
- #=> ["4", "5", "6"]
- ```
-
- #### Using a regular expression
- ```ruby
- WordsCounted.count("Hello Beirut", exclude: /Beirut/)
- counter.words
- #=> ["Hello"]
- ```
-
- #### Using a lambda
- ```ruby
- WordsCounted.count("1 2 3 4 5 6", regexp: /[0-9]/, exclude: ->(w) { w.to_i.even? })
- counter.words
- #=> ["1", "3", "5"]
+ tokeniser.tokenise(
+   exclude: [:ascii_only?, /محمد/, ->(t) { t.length > 6 }, "و"]
+ )
+ # => ["هي", "سامي", "وداني"]
  ```

- ## Passing in a Custom Regexp
+ ## Passing in a custom regexp

- Defining words is tricky. The default regexp accounts for letters, hyphenated words, and apostrophes. This means *twenty-one* is treated as one word. So is *Mohamad's*.
+ The default regexp accounts for letters, hyphenated tokens, and apostrophes. This means *twenty-one* is treated as one token. So is *Mohamad's*.

  ```ruby
  /[\p{Alpha}\-']+/
  ```

- But maybe you don't want to count words?&ndash;Well, analyse anything you want. What you analyse is only limited by your knowledge of regular expressions. Pass your own criteria as a Ruby regular expression to split your string as desired.
+ You can pass your own criteria as a Ruby regular expression to split your string as desired.

- For example, if you wanted to include numbers in your analysis, you can override the regular expression:
+ For example, if you wanted to include numbers, you can override the regular expression:

  ```ruby
- counter = WordsCounted.count("Numbers 1, 2, and 3", regexp: /[\p{Alnum}\-']+/)
- counter.words
- #=> ["Numbers", "1", "2", "and", "3"]
+ counter = WordsCounted.count("Numbers 1, 2, and 3", pattern: /[\p{Alnum}\-']+/)
+ counter.tokens
+ #=> ["numbers", "1", "2", "and", "3"]
  ```

- ## Opening and Reading Files
+ ## Opening and reading files

- Use the `from_file` method to open files. `from_file` accepts the same options as `count`. The file path can be a URL.
+ Use the `from_file` method to open files. `from_file` accepts the same options as `.count`. The file path can be a URL.

  ```ruby
  counter = WordsCounted.from_file("url/or/path/to/file.text")
@@ -300,41 +279,31 @@ counter = WordsCounted.from_file("url/or/path/to/file.text")

  ## Gotchas

- A hyphen used in leu of an *em* or *en* dash will form part of the word. This affects the `word_occurences` algorithm.
+ A hyphen used in lieu of an *em* or *en* dash will form part of the token. This affects the tokeniser algorithm.

  ```ruby
  counter = WordsCounted.count("How do you do?-you are well, I see.")
- counter.word_occurrences
-
- {
- "how" => 1,
- "do" => 2,
- "you" => 1,
- "-you" => 1, # WTF, mate!
- "are" => 1,
- "very" => 1,
- "well" => 1,
- "i" => 1,
- "see" => 1
- }
- ```
+ counter.token_frequency

- In this example `-you` and `you` are counted as separate words. Writers should use the correct dash element, but this is not always true.
+ [
+   ["do", 2],
+   ["how", 1],
+   ["you", 1],
+   ["-you", 1], # WTF, mate!
+   ["are", 1],
+   # ...
+ ]
+ ```

- Another gotcha is that the default criteria does not include numbers in its analysis. Remember that you can pass your own regular expression if the default behaviour does not fit your needs.
+ In this example, `-you` and `you` are separate tokens. Also, the tokeniser does not include numbers by default. Remember that you can pass your own regular expression if the default behaviour does not fit your needs.

  ### A note on case sensitivity

- The program will downcase all incoming strings for consistency.
+ The program will normalise (downcase) all incoming strings for consistency and for filtering.

- ## Road Map
+ ## Roadmap

- 1. Add ability to open URLs.
- 2. Add paragraph, sentence, average words per sentence, and average sentence chars counters.
-
- #### Ability to read URLs
-
- Something like...
+ ### Ability to open URLs

  ```ruby
  def self.from_url
@@ -342,21 +311,9 @@ def self.from_url
  end
  ```

- ## But wait... wait a minute...
-
- #### Isn't it better to write this in JavaScript?
-
- ![Picard face-palm](http://stream1.gifsoup.com/view3/1290449/picard-facepalm-o.gif "Picard face-palm")
-
- ## About
-
- Originally I wrote this program for a code challenge on Treehouse. You can find the original implementation on [Code Review][1].
-
  ## Contributors

- Thanks to Dave Yarwood for helping me improve my code. Some of my code is based on his recommendations. You can find the original program implementation, as well as Dave's code review, on [Code Review][1].
-
- Thanks to [Wayne Conrad][2] for providing [an excellent code review][3], and improving the filter feature to well beyond what I can come up with.
+ See [contributors][3]. Not listed there is [Dave Yarwood][1].

  ## Contributing

@@ -366,10 +323,10 @@ Thanks to [Wayne Conrad][2] for providing [an excellent code review][3], and imp
  4. Push to the branch (`git push origin my-new-feature`)
  5. Create new Pull Request

-
- [1]: http://codereview.stackexchange.com/questions/46105/a-ruby-string-analyser
- [2]: https://github.com/wconrad
- [3]: http://codereview.stackexchange.com/a/49476/1563
+ [2]: http://www.rubydoc.info/gems/words_counted
+ [3]: https://github.com/abitdodgy/words_counted/graphs/contributors
  [4]: http://rubywordcount.com
- [5]: https://github.com/abitdodgy/words_counted#excluding-words-from-the-analyser
+ [5]: https://github.com/abitdodgy/words_counted#excluding-tokens-from-the-analyser
  [6]: https://github.com/abitdodgy/words_counted#passing-in-a-custom-regexp
+ [7]: http://www.rubydoc.info/gems/words_counted/
+ [8]: https://github.com/abitdodgy/words_counted/issues/new
data/lib/refinements/hash_refinements.rb ADDED
@@ -0,0 +1,14 @@
+ # -*- encoding : utf-8 -*-
+ module Refinements
+   module HashRefinements
+     refine Hash do
+       # A convenience method to sort a hash into an
+       # array of tuples by descending value.
+       #
+       # @return [Array<Array>] A sorted (unstable) array of candidates
+       def sort_by_value_desc
+         sort_by(&:last).reverse
+       end
+     end
+   end
+   end
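The refinement above only takes effect once it is activated with `using`. A self-contained sketch of how it might be used (the sample hash is made up; distinct values are chosen because the sort is unstable on ties):

```ruby
# Redeclare the refinement from the diff above so this snippet runs
# on its own, then activate it lexically with `using`.
module Refinements
  module HashRefinements
    refine Hash do
      # Sort a hash into an array of [key, value] tuples by descending value.
      def sort_by_value_desc
        sort_by(&:last).reverse
      end
    end
  end
end

# Refinements are scoped: `using` enables this one for the rest of the file.
using Refinements::HashRefinements

frequencies = { "we" => 1, "are" => 2, "looking" => 3 }
frequencies.sort_by_value_desc
# => [["looking", 3], ["are", 2], ["we", 1]]
```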