words_counted 0.0.2 → 0.0.3
- checksums.yaml +4 -4
- data/.yardopts +2 -0
- data/README.md +85 -18
- data/lib/words_counted/counter.rb +1 -1
- data/lib/words_counted/version.rb +1 -1
- metadata +2 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5d127f1d7ea83482f4efa3b8d25ff2154d6d8006
+  data.tar.gz: 4e537850f5749aadf25f7ac435fbc141601b3425
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 95adf80c2551fb0dd92fa260aad8b0b0eb62ddb11ced4b4a3afb70493782810c24919c84ce1b89b646c07ea34db7f94e3cd0515d10709d956139b87bd441c2be
+  data.tar.gz: 2494524dc28161f2ca8d7ada896c7522ddb7266b9aac5ca50288b67298f8a0b1eb1460e027d60478c23c2fd220078ce02c98e2f0f977496723ae59d5aac4cf30
data/.yardopts
ADDED
data/README.md
CHANGED
@@ -1,15 +1,15 @@
 # Words Counted
 
-This Ruby gem is a word counter that includes some handy utility methods.
+This Ruby gem is a word counter that includes some handy utility methods. It lets you send in a string of text and count the number of words, get the words sorted by number of occurrences, get the highest occurring words, and a few more things.
 
 ### Features
 
 1. Count the number of words in a string.
-2. Get a hash of
-3. Get a hash of
+2. Get a hash map of words and the number of times they occur.
+3. Get a hash map of words and their lengths.
 4. Get the most occurring word(s) and its number of occurrences.
 5. Get the longest word(s) and its length.
-6. Ability to filter words.
+6. Ability to filter out words from the count. Useful if you don't want to count `a`, `the`, etc.
 7. Filters special characters but respects hyphens and apostrophes.
 
 See usage instructions for details on each feature.
@@ -30,7 +30,7 @@ Or install it yourself as:
 
 ## Usage
 
-Create an instance of `Counter` and pass in a string and an optional filter.
+Create an instance of `Counter` and pass in a string and an optional filter string.
 
 ```ruby
 counter = WordsCounted::Counter.new(
@@ -38,50 +38,117 @@ counter = WordsCounted::Counter.new(
 )
 ```
 
-
+#### `.word_count`
 
 Returns the word count of a given string. The word count includes only alphabetic characters. Hyphenated words and words with apostrophes are each considered a single word.
 
 ```ruby
 counter.word_count #=> 15
 ```
-### `.word_occurrences`
 
-
+#### `.word_occurrences`
+
+Returns a hash map of words and their number of occurrences. Uppercase and lowercase words are counted as the same word.
 
 ```ruby
-counter.word_occurrences
+counter.word_occurrences
+#=> {
+  "we"=>1,
+  "are"=>2,
+  "all"=>1,
+  "in"=>1,
+  "the"=>2,
+  "gutter"=>1,
+  "but"=>1,
+  "some"=>1,
+  "of"=>1,
+  "us"=>1,
+  "looking"=>1,
+  "at"=>1,
+  "stars"=>1
+}
 ```
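The counting behaviour described in this part of the README can be sketched in plain Ruby. This is a minimal illustration and not the gem's own implementation: `WORD_PATTERN`, `tokenize`, and `count_occurrences` are hypothetical names, and the sample sentence is inferred from the example output because the argument to `Counter.new(` is not shown in the diff.

```ruby
# Hypothetical pattern (an assumption, not taken from the gem):
# a word is a run of letters, hyphens, and apostrophes.
WORD_PATTERN = /[\p{Alpha}\-']+/

# Split a string into words, dropping punctuation and whitespace.
def tokenize(text)
  text.scan(WORD_PATTERN)
end

# Count words case-insensitively by downcasing each one before tallying.
def count_occurrences(words)
  words.each_with_object(Hash.new(0)) do |word, counts|
    counts[word.downcase] += 1
  end
end

words = tokenize("We are all in the gutter, but some of us are looking at the stars.")
occurrences = count_occurrences(words)

puts words.size          # 15
puts occurrences["are"]  # 2
```

Downcasing before counting is what lets `We` and `we` share one entry, matching the "counted as the same word" note in the README.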
 
-
+#### `.most_occurring_words`
 
-Returns a two dimensional array of the
+Returns a two-dimensional array of the most occurring word and its number of occurrences. In case of a tie, all tied words are returned.
 
 ```ruby
-counter.most_occurring_words
+counter.most_occurring_words
+#=>
+[
+  ["are", 2],
+  ["the", 2]
+]
 ```
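One way to return every tied word is to find the highest count first and then keep all entries that reach it. The sketch below assumes a plain occurrences hash like the one in the README example; `most_occurring` is a hypothetical helper, not the gem's method.

```ruby
# Return [word, count] pairs for every word that shares the highest count.
def most_occurring(occurrences)
  return [] if occurrences.empty?

  highest = occurrences.values.max
  # Hash#select keeps insertion order, so ties come back in first-seen order.
  occurrences.select { |_word, count| count == highest }.to_a
end

occurrences = { "we" => 1, "are" => 2, "all" => 1, "the" => 2, "stars" => 1 }
top = most_occurring(occurrences)

p top  # [["are", 2], ["the", 2]]
```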
 
-
+#### `.word_lengths`
 
 Returns a hash of words and their lengths.
 
 ```ruby
-counter.word_lengths
+counter.word_lengths
+#=> {
+  "We"=>2,
+  "are"=>3,
+  "all"=>3,
+  "in"=>2,
+  "the"=>3,
+  "gutter"=>6,
+  "but"=>3,
+  "some"=>4,
+  "of"=>2,
+  "us"=>2,
+  "looking"=>7,
+  "at"=>2,
+  "stars"=>5
+}
 ```
 
-
+#### `.longest_word`
 
 Returns a two dimensional array of the longest word and its length. In case there is a tie all tied words are returned.
 
+```ruby
+counter.longest_words
+#=> [
+  ["looking", 7]
+]
+```
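The length map and the longest-word lookup can be sketched as follows. The helper names `word_lengths_for` and `longest_entries` are hypothetical, and grouping by length is only one possible way to collect ties; the gem may do this differently.

```ruby
# Map each word to its length. Case is preserved, as in the README output ("We"=>2).
def word_lengths_for(words)
  words.each_with_object({}) { |word, lengths| lengths[word] = word.length }
end

# Group the [word, length] pairs by length, then take the group with the largest length.
def longest_entries(lengths)
  return [] if lengths.empty?

  grouped = lengths.group_by { |_word, length| length }
  _length, entries = grouped.max_by { |length, _entries| length }
  entries
end

words = %w[We are all in the gutter but some of us are looking at the stars]
lengths = word_lengths_for(words)
longest = longest_entries(lengths)

p longest  # [["looking", 7]]
```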
+
 ## Filtering
 
-You can pass in a space-delimited word list to filter words that you don't want to count. Filter words should be lowercase
+You can pass in a space-delimited word list to filter words that you don't want to count. Filter words should be *lowercase*. The filter will remove both uppercase and lowercase variants of the word.
+
+```ruby
+WordsCounted::Counter.new("Magnificent! That was magnificent, Trevor.", "was magnificent")
+#<WordsCounted::Counter:0x007fd4949f99d8 @words=["That", "Trevor"]>
+```
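The filtering rule (lowercase filter words, case-insensitive removal) could look like the sketch below. `reject_filtered` is a hypothetical helper written for illustration; it assumes the text has already been split into words.

```ruby
# Drop every word whose lowercase form appears in the space-delimited filter string.
def reject_filtered(words, filter = "")
  excluded = filter.split
  words.reject { |word| excluded.include?(word.downcase) }
end

words = %w[Magnificent That was magnificent Trevor]
kept = reject_filtered(words, "was magnificent")

p kept  # ["That", "Trevor"]
```

Comparing against `word.downcase` is why the filter words themselves need to be lowercase: an uppercase filter word would never match.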
+
+## Gotchas
+
+A hyphen used in lieu of an *em* or *en* dash will form part of the word and throw off the `word_occurrences` algorithm.
 
 ```ruby
-WordsCounted::Counter.new("
-
+counter = WordsCounted::Counter.new("How do you do?-you are well, I see.")
+#<WordsCounted::Counter:0x007fd494252518 @words=["How", "do", "you", "do", "-you", "are", "well", "I", "see"]>
+
+counter.word_occurrences
+#=> {
+  "how"=>1,
+  "do"=>2,
+  "you"=>1,
+  "-you"=>1, # WTF, mate!
+  "are"=>1,
+  "very"=>1,
+  "well"=>1,
+  "i"=>1,
+  "see"=>1
+}
 ```
 
+In this example, `-you` and `you` are counted as separate words. Writers should use the correct dash element, but this is not always the case.
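A possible workaround, not something the gem provides, is to clean the text before counting: replace any hyphen that is not sitting between two letters with a space. `normalize_dashes` is a hypothetical helper, and the word pattern used for the check is an assumption as well.

```ruby
# Replace hyphens that are not flanked by letters on both sides with a space,
# so "do?-you" becomes "do? you" while "well-known" is left untouched.
def normalize_dashes(text)
  text.gsub(/(?<!\p{Alpha})-|-(?!\p{Alpha})/, " ")
end

cleaned = normalize_dashes("How do you do?-you are well, I see.")
# Assumed word pattern: runs of letters, hyphens, and apostrophes.
words = cleaned.scan(/[\p{Alpha}\-']+/)

p words.count("you")  # 2
```

This only handles hyphens next to punctuation or whitespace. A hyphen typed directly between two words, as in `do-you`, still looks like a hyphenated word and would need a different rule.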
+
 ## About
 
 Originally I wrote this program for a code challenge. My initial implementation was decent, but it could have been better. Thanks to [Dave Yarwood](http://codereview.stackexchange.com/a/47515/1563) for helping me improve my code. Some of this code is based on his recommendations. You can find the original implementation as well as the code review on [Code Review](http://codereview.stackexchange.com/questions/46105/a-ruby-string-analyser).
data/lib/words_counted/counter.rb
CHANGED
@@ -1,6 +1,6 @@
 module WordsCounted
   class Counter
-    # @!
+    # @!words [Array] an array of words resulting from the string passed to the initializer.
     attr_reader :words
 
     # This is the criteria for defining words.
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: words_counted
 version: !ruby/object:Gem::Version
-  version: 0.0.
+  version: 0.0.3
 platform: ruby
 authors:
 - Mohamad El-Husseini
@@ -75,6 +75,7 @@ extra_rdoc_files: []
 files:
 - .gitignore
 - .rspec
+- .yardopts
 - Gemfile
 - LICENSE.txt
 - README.md