words_counted 0.0.2 → 0.0.3
- checksums.yaml +4 -4
- data/.yardopts +2 -0
- data/README.md +85 -18
- data/lib/words_counted/counter.rb +1 -1
- data/lib/words_counted/version.rb +1 -1
- metadata +2 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5d127f1d7ea83482f4efa3b8d25ff2154d6d8006
+  data.tar.gz: 4e537850f5749aadf25f7ac435fbc141601b3425
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 95adf80c2551fb0dd92fa260aad8b0b0eb62ddb11ced4b4a3afb70493782810c24919c84ce1b89b646c07ea34db7f94e3cd0515d10709d956139b87bd441c2be
+  data.tar.gz: 2494524dc28161f2ca8d7ada896c7522ddb7266b9aac5ca50288b67298f8a0b1eb1460e027d60478c23c2fd220078ce02c98e2f0f977496723ae59d5aac4cf30
data/.yardopts
ADDED
data/README.md
CHANGED
@@ -1,15 +1,15 @@
 # Words Counted
 
-This Ruby gem is a word counter that includes some handy utility methods.
+This Ruby gem is a word counter that includes some handy utility methods. It lets you send in a string of text and count the number of words, get the words sorted by number of occurrences, get the highest occurring words, and a few more things.
 
 ### Features
 
 1. Count the number of words in a string.
-2. Get a hash of
-3. Get a hash of
+2. Get a hash map of words and the number of times they occur.
+3. Get a hash map of words and their lengths.
 4. Get the most occurring word(s) and its number of occurrences.
 5. Get the longest word(s) and its length.
-6. Ability to filter words.
+6. Ability to filter out words from the count. Useful if you don't want to count `a`, `the`, etc.
 7. Filters special characters but respects hyphens and apostrophes.
 
 See usage instructions for details on each feature.
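For reference, the counting features listed in this README hunk can be approximated in plain Ruby. This is a minimal sketch and not the gem's implementation: the helper names `occurrences`, `lengths`, and `top_ranked` are hypothetical, and the word list is an assumed tokenization of the sentence behind the README's sample output.

```ruby
# Hypothetical helpers for illustration; these are not the gem's methods.

# Count occurrences, treating uppercase and lowercase variants as one word.
def occurrences(words)
  words.each_with_object(Hash.new(0)) { |word, counts| counts[word.downcase] += 1 }
end

# Map each word (original casing kept) to its length.
def lengths(words)
  words.each_with_object({}) { |word, result| result[word] = word.length }
end

# Return every [key, value] pair that shares the highest value, so ties are kept.
def top_ranked(hash)
  return [] if hash.empty?

  max = hash.values.max
  hash.select { |_key, value| value == max }.to_a
end

# Assumed word list; the real gem derives it from the input string.
words = %w[We are all in the gutter but some of us are looking at the stars]

puts words.size                   # prints 15
p top_ranked(occurrences(words))  # prints [["are", 2], ["the", 2]]
p top_ranked(lengths(words))      # prints [["looking", 7]]
```

Returning every pair that shares the maximum, rather than a single pair, is what keeps tied words such as `are` and `the` together in the result.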
@@ -30,7 +30,7 @@ Or install it yourself as:
 
 ## Usage
 
-Create an instance of `Counter` and pass in a string and an optional filter.
+Create an instance of `Counter` and pass in a string and an optional filter string.
 
 ```ruby
 counter = WordsCounted::Counter.new(
@@ -38,50 +38,117 @@ counter = WordsCounted::Counter.new(
 )
 ```
 
-
+#### `.word_count`
 
 Returns the word count of a given string. The word count includes only alphabetic characters. Hyphenated words and words with apostrophes are each counted as a single word.
 
 ```ruby
 counter.word_count #=> 15
 ```
-### `.word_occurrences`
 
-
+#### `.word_occurrences`
+
+Returns a hash map of words and their number of occurrences. Uppercase and lowercase words are counted as the same word.
 
 ```ruby
-counter.word_occurrences
+counter.word_occurrences
+#=> {
+  "we"=>1,
+  "are"=>2,
+  "all"=>1,
+  "in"=>1,
+  "the"=>2,
+  "gutter"=>1,
+  "but"=>1,
+  "some"=>1,
+  "of"=>1,
+  "us"=>1,
+  "looking"=>1,
+  "at"=>1,
+  "stars"=>1
+}
 ```
 
-
+#### `.most_occurring_words`
 
-Returns a two dimensional array of the
+Returns a two-dimensional array of the most occurring word and its number of occurrences. In case of a tie, all tied words are returned.
 
 ```ruby
-counter.most_occurring_words
+counter.most_occurring_words
+#=>
+[
+  ["are", 2],
+  ["the", 2]
+]
 ```
 
-
+#### `.word_lengths`
 
 Returns a hash of words and their lengths.
 
 ```ruby
-counter.word_lengths
+counter.word_lengths
+#=> {
+  "We"=>2,
+  "are"=>3,
+  "all"=>3,
+  "in"=>2,
+  "the"=>3,
+  "gutter"=>6,
+  "but"=>3,
+  "some"=>4,
+  "of"=>2,
+  "us"=>2,
+  "looking"=>7,
+  "at"=>2,
+  "stars"=>5
+}
 ```
 
-
+#### `.longest_words`
 
 Returns a two-dimensional array of the longest word and its length. In case of a tie, all tied words are returned.
 
+```ruby
+counter.longest_words
+#=> [
+  ["looking", 7]
+]
+```
+
 ## Filtering
 
-You can pass in a space-delimited word list to filter words that you don't want to count. Filter words should be lowercase
+You can pass in a space-delimited word list to filter words that you don't want to count. Filter words should be *lowercase*. The filter will remove both uppercase and lowercase variants of the word.
+
+```ruby
+WordsCounted::Counter.new("Magnificent! That was magnificent, Trevor.", "was magnificent")
+#<WordsCounted::Counter:0x007fd4949f99d8 @words=["That", "Trevor"]>
+```
+
+## Gotchas
+
+A hyphen used in lieu of an *em* or *en* dash will form part of the word and throw off the `word_occurrences` algorithm.
 
 ```ruby
-WordsCounted::Counter.new("
-
+counter = WordsCounted::Counter.new("How do you do?-you are very well, I see.")
+#<WordsCounted::Counter:0x007fd494252518 @words=["How", "do", "you", "do", "-you", "are", "very", "well", "I", "see"]>
+
+counter.word_occurrences
+#=> {
+  "how"=>1,
+  "do"=>2,
+  "you"=>1,
+  "-you"=>1, # WTF, mate!
+  "are"=>1,
+  "very"=>1,
+  "well"=>1,
+  "i"=>1,
+  "see"=>1
+}
 ```
 
+In this example, `-you` and `you` are counted as separate words. Writers should use the correct dash character, but they do not always do so.
+
 ## About
 
 Originally I wrote this program for a code challenge. My initial implementation was decent, but it could have been better. Thanks to [Dave Yarwood](http://codereview.stackexchange.com/a/47515/1563) for helping me improve my code. Some of this code is based on his recommendations. You can find the original implementation as well as the code review on [Code Review](http://codereview.stackexchange.com/questions/46105/a-ruby-string-analyser).
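The filtering rule and the hyphen gotcha described in the README hunk above can be illustrated with a short sketch. The gem's own filtering code is not shown in this diff, so `apply_filter` and `trim_hyphens` are hypothetical helpers; `trim_hyphens` is one possible workaround for stray hyphens, not something the gem is known to do.

```ruby
# Hypothetical helper: drop every word whose lowercase form is in the filter.
# The filter is a space-delimited string of lowercase words, as the README describes.
def apply_filter(words, filter)
  excluded = filter.split
  words.reject { |word| excluded.include?(word.downcase) }
end

# Hypothetical workaround for the hyphen gotcha: strip hyphens at the edges
# of a word while leaving hyphens inside a word untouched.
def trim_hyphens(words)
  words.map { |word| word.gsub(/\A-+|-+\z/, "") }.reject(&:empty?)
end

words = %w[Magnificent That was magnificent Trevor]
p apply_filter(words, "was magnificent")  # prints ["That", "Trevor"]

p trim_hyphens(%w[do -you well-known])    # prints ["do", "you", "well-known"]
```

Comparing against `word.downcase` is what lets a lowercase filter entry remove both `Magnificent` and `magnificent`.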
data/lib/words_counted/counter.rb
CHANGED
@@ -1,6 +1,6 @@
 module WordsCounted
   class Counter
-    # @!
+    # @!words [Array] an array of words resulting from the string passed to the initializer.
     attr_reader :words
 
     # This is the criteria for defining words.
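The word criteria itself is cut off by this hunk. As a rough sketch of what the README describes (special characters are dropped, hyphens and apostrophes are kept), a tokenizer could look like the following. The pattern and the `tokenize` helper are assumptions for illustration, not the gem's actual definition.

```ruby
# Hypothetical tokenizer: a word is a run of letters, hyphens, and apostrophes.
# The gem's real pattern is not visible in this diff.
def tokenize(text)
  text.scan(/[\p{Alpha}\-']+/)
end

p tokenize("Don't over-think it: 3 words?")
# prints ["Don't", "over-think", "it", "words"]

p tokenize("How do you do?-you are well, I see.")
# prints ["How", "do", "you", "do", "-you", "are", "well", "I", "see"]
```

The second call shows how a pattern of this shape produces the `-you` token from the README's Gotchas section: the hyphen directly before `you` is a valid word character, so it attaches to the word.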
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: words_counted
 version: !ruby/object:Gem::Version
-  version: 0.0.
+  version: 0.0.3
 platform: ruby
 authors:
 - Mohamad El-Husseini
@@ -75,6 +75,7 @@ extra_rdoc_files: []
 files:
 - .gitignore
 - .rspec
+- .yardopts
 - Gemfile
 - LICENSE.txt
 - README.md