goldmine 1.1.4 → 1.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: c686bfb600cadea453033f457a751ce8538b00af
-  data.tar.gz: 1216c9ddd41197c6f25c1017039b0d2428a20105
+  metadata.gz: eac055065f84b7550241a15c980bf74e0fc892a6
+  data.tar.gz: 02d1831f66469c098299f819d0e3698959bb725f
 SHA512:
-  metadata.gz: 3baf5482a83183114999ed3d705ff8c97b6bc5b0cb9a3ae02e64f0eb9af378ea8fc4a8e39e5003f6da9eb8fae275407bb541f6eb51880e886205a72843f7532e
-  data.tar.gz: acf22665e1acf226ef31274021da2ceb406ef3a268555abb7f597fccc3b08ba7290f81ef9789b684073455b4f9c316a0d889208375c240da57e34600c0a978fd
+  metadata.gz: c4a8519848f4776ac73febc4850919c5449f302471a482673c8e7e47427237759e52444759544596ceb778642ebf0ce2bd866364868cee04fe7a38dbfae443cb
+  data.tar.gz: e1243ab291afcafa019b6b287f98a20b5054f080fc31764322cffe07b604c59d0b23893da822c27401c8727c3b9f64fabe65d9e1acb167f57a5ed82b814abfec
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    goldmine (1.1.4)
+    goldmine (1.2.0)
 
 GEM
   remote: https://rubygems.org/
data/README.md CHANGED
@@ -21,6 +21,7 @@ Think of __Goldmine__ as `Enumerable#group_by` on steroids.
 ---
 
 The [demo project](http://hopsoft.github.io/goldmine/) demonstrates some of Goldmine's uses.
+`TODO: update the demo project`
 
 ---
 
@@ -109,7 +110,7 @@ list.pivot { |record| record[:favorite_colors] }
 }
 ```
 
-# Stacked pivots
+## Stacked pivots
 
 ```ruby
 list = [
@@ -142,31 +143,72 @@ end
 }
 ```
 
-# Returning pivots in tabular format
+## Rollups
 
-This feature is useful when you need to do things like export to CSV or build user interfaces.
+Sometimes it's useful to roll pivots into a summary.
 
 ```ruby
-# using the stacked pivot example above
-mined.to_a
+list = [1,2,3,4,5,6,7,8,9]
+list = Goldmine::ArrayMiner.new(list)
+pivoted = list.pivot(:less_than_5) { |i| i < 5 }.pivot(:even) { |i| i % 2 == 0 }
+pivoted.rollup { |values| values.size }
+# result:
+{
+  { :less_than_5 => true, :even => false} => 2,
+  { :less_than_5 => true, :even => true} => 2,
+  { :less_than_5 => false, :even => false} => 3,
+  { :less_than_5 => false, :even => true} => 2
+}
+```
+
+## Tabular data
+
+Tabular data provides a more streamlined summary view of a pivot.
+
+```ruby
+list = [1,2,3,4,5,6,7,8,9]
+list = Goldmine::ArrayMiner.new(list)
+pivoted = list.pivot(:less_than_5) { |i| i < 5 }.pivot(:even) { |i| i % 2 == 0 }
+pivoted.to_tabular
 # result:
 [
-  ["Name has an 'e'", ">= 21 years old", "Percent of Total", "Count"],
-  [false, true, 0.4, 2],
-  [true, true, 0.4, 2],
-  [true, false, 0.2, 1]
+  ["less_than_5", "even", "percent", "count"],
+  [true, false, 0.22, 2],
+  [true, true, 0.22, 2],
+  [false, false, 0.33, 3],
+  [false, true, 0.22, 2]
 ]
 ```
 
-The first entry is the header row.
-Subsequent entries are data rows.
-The last value in each data row indicates the number of matches.
+## CSV table
 
-Need to sort the rows? Just pass a `sort_by` block.
+CSV tables are a formalized version of tabular data.
+They simplify the complexity of working with tabular data.
 
 ```ruby
-# sort on "total" i.e. 3rd value in the row
-mined.to_a do |row|
-  row[2]
+list = [1,2,3,4,5,6,7,8,9]
+list = Goldmine::ArrayMiner.new(list)
+pivoted = list.pivot(:less_than_5) { |i| i < 5 }.pivot(:even) { |i| i % 2 == 0 }
+csv = pivoted.to_csv
+
+csv.headers # => ["less_than_5", "even", "percent", "count"]
+
+csv.each do |row|
+  puts row["less_than_5"]
+  puts row["even"]
 end
+
+csv.to_csv
+# result:
+"less_than_5,even,percent,count\ntrue,false,0.22,2\ntrue,true,0.22,2\nfalse,false,0.33,3\nfalse,true,0.22,2\n"
 ```
+
+## Summary
+
+Goldmine allows you to combine the power of pivots, rollups, tabular data,
+& csv to construct deep insights into your data with minimal effort.
+
+One of our common use cases is to query a database using ActiveRecord,
+pivot the results, convert to csv, sort, pivot again,
+then rollup the results to create data visualizations in the form of charts & graphs.
+
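
The pivot, rollup, and tabular steps that the README hunk above documents can be approximated in plain Ruby, which may help when reading the diff without the gem installed. This is a minimal sketch, not the gem's code: `Enumerable#group_by` with a Hash key stands in for Goldmine's named pivots, and the `rollup` and `to_tabular` helpers below are hypothetical stand-ins written for illustration.

```ruby
# Stand-in for two named pivots: group by a Hash of pivot name => value.
list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
pivoted = list.group_by { |i| { less_than_5: i < 5, even: i.even? } }

# Hypothetical helper: replace each group with the block's summary of it.
def rollup(pivoted)
  pivoted.each_with_object({}) do |(key, values), memo|
    memo[key] = yield(values)
  end
end

# Hypothetical helper: a header row, then one row per group holding the
# pivot values, the share of the total (two decimals), and the count.
# Assumes at least one group whose key is a Hash.
def to_tabular(pivoted, total)
  header = pivoted.keys.first.keys.map(&:to_s) + ["percent", "count"]
  counts = rollup(pivoted) { |values| values.size }
  rows = counts.map do |key, count|
    key.values + [(count / total.to_f).round(2), count]
  end
  [header] + rows
end

counts = rollup(pivoted) { |values| values.size }
table = to_tabular(pivoted, list.size)
```

With the nine-element list, `counts` and `table` should carry the same group counts and percentages as the README's rollup and tabular results.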
@@ -3,9 +3,11 @@ require "array_miner"
 require "hash_miner"
 
 module Goldmine
-  def self.miner(object)
-    return ArrayMiner.new(object) if object.is_a?(Array)
-    return HashMiner.new(object) if object.is_a?(Hash)
-    nil
+  class << self
+    def miner(object)
+      return ArrayMiner.new(object) if object.is_a?(Array)
+      return HashMiner.new(object) if object.is_a?(Hash)
+      nil
+    end
   end
 end
@@ -4,8 +4,9 @@ module Goldmine
   class ArrayMiner < SimpleDelegator
     attr_reader :source_data
 
-    def initialize(array=[])
-      super @source_data = array
+    def initialize(array=[], source_data: [])
+      @source_data = source_data
+      super array
     end
 
     # Pivots the Array into a Hash of mined data.
@@ -47,7 +48,7 @@ module Goldmine
     # @yield [Object] Yields once for each item in the Array
     # @return [Hash] The pivoted Hash of data.
     def pivot(name=nil, &block)
-      reduce(HashMiner.new(source_data: source_data)) do |memo, item|
+      reduce(HashMiner.new(source_data: self)) do |memo, item|
         value = yield(item)
 
         if value.is_a?(Array)
@@ -1,11 +1,12 @@
 require "delegate"
+require "csv"
 
 module Goldmine
   class HashMiner < SimpleDelegator
     attr_reader :source_data
 
-    def initialize(hash={}, source_data: nil)
-      @source_data = source_data || hash
+    def initialize(hash={}, source_data: [])
+      @source_data = source_data
       super hash
     end
 
@@ -28,8 +29,8 @@ module Goldmine
     #
     # @note This method should not be called directly. Call Array#pivot instead.
     #
-    # @param [String] name The named of the pivot.
-    # @yield [Object] Yields once for each item in the Array
+    # @param name [String] The named of the pivot.
+    # @yield [Object] Yields once for each item in the Array.
     # @return [Hash] The pivoted Hash of data.
     def pivot(name=nil, &block)
       return self unless goldmine
@@ -51,10 +52,51 @@ module Goldmine
       end
     end
 
+    # Returns a new "rolled up" Hash based on the return value of the yield.
+    #
+    # @yield [Object] Yields once for each pivoted group.
+    # @return [Hash] The rollup Hash of data.
+    def rollup
+      each_with_object({}) do |pair, memo|
+        memo[pair.first] = yield(pair.last)
+      end
+    end
+
+    # Returns a tabular representation of the pivot.
+    # Useful for building CSVs & data visualizations.
+    #
+    # @param percent_column_name [String] The name of the percent column (percent of total)
+    # @param count_column_name [String] The name of the count column (number of objects)
+    # @return [Array] The tabular representation of the data.
+    def to_tabular(percent_column_name: "percent", count_column_name: "count")
+      [].tap do |rows|
+        rows << tabular_header_from_key(first.first) + [percent_column_name, count_column_name]
+        rolled = rollup { |row| row.size }
+        rolled.each do |key, value|
+          tabular_row_from_key(key).tap do |row|
+            rows << row + [calculate_percentage(value, source_data.size), value]
+          end
+        end
+      end
+    end
+
+    # Returns an in memory CSV table representation of the pivot.
+    # Useful for working with data & building data visualizations.
+    #
+    # @param percent_column_name [String] The name of the percent column (percent of total)
+    # @param count_column_name [String] The name of the count column (number of objects)
+    # @return [CSV::Table] The CSV representation of the data.
+    def to_csv(percent_column_name: "percent", count_column_name: "count")
+      tabular = to_tabular(percent_column_name: percent_column_name, count_column_name: count_column_name)
+      header = tabular.shift
+      rows = tabular.map { |row| CSV::Row.new(header, row) }
+      CSV::Table.new rows
+    end
+
     # Assigns a key/value pair to the Hash.
-    # @param [String] name The name of a pivot (can be null).
-    # @param [Object] key The key to use.
-    # @param [Object] value The value to assign
+    # @param name [String] The name of a pivot (can be null).
+    # @param key [Object] The key to use.
+    # @param value [Object] The value to assign
     # @return [Object] The result of the assignment.
     def assign_mined(name, key, value)
       goldmine_key = goldmine_key(name, key)
@@ -63,35 +105,31 @@ module Goldmine
     end
 
     # Creates a key for a pivot-name/key combo.
-    # @param [String] name The name of a pivot (can be null).
-    # @param [Object] key The key to use.
+    # @param name [String] The name of a pivot (can be null).
+    # @param key [Object] The key to use.
     # @return [Object] The constructed key.
     def goldmine_key(name, key)
       goldmine_key = { name => key } if name
       goldmine_key ||= key
     end
 
-    # Returns the pivot keys.
-    # @return [Array]
-    def pivoted_keys
-      first.first.keys
+    private
+
+    def calculate_percentage(count, total)
+      return 0.0 unless total > 0
+      sprintf("%.2f", count / total.to_f).to_f
     end
 
-    # Returns pivoted data as a tabular Array that can be used to build CSVs or user interfaces.
-    # @return [Array] Tabular pivot data
-    # @yield [Array] sort_by block for sorting the Array
-    def to_a(&block)
-      rows = map do |pair|
-        [].tap do |row|
-          row.concat pair.first.values
-          row << sprintf("%.2f", (pair.last.size / source_data.size.to_f)).to_f
-          row << pair.last.size
-        end
-      end
-      rows = rows.sort_by(&block) if block_given?
-      header = [pivoted_keys.map(&:to_s), "Percent of Total", "Count"].flatten
-      rows.insert 0, header
-      rows
+    def tabular_header_from_key(key)
+      return key.keys.map(&:to_s) if key.is_a?(Hash)
+      key = [key] unless key.is_a?(Array)
+      (0..key.size-1).map { |i| "column#{i}" }
+    end
+
+    def tabular_row_from_key(key)
+      return key.dup if key.is_a?(Array)
+      return [key] unless key.is_a?(Hash)
+      key.values.dup
     end
 
   end
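
The new `to_csv` method in the hunks above wraps tabular rows in the standard library's `CSV::Row` and `CSV::Table`. The same wrapping can be tried on its own; the sketch below assumes a hand-written `tabular` array in the shape shown in the README rather than output from the gem.

```ruby
require "csv"

# Tabular rows in the README's shape: header first, then data rows.
tabular = [
  ["less_than_5", "even", "percent", "count"],
  [true, false, 0.22, 2],
  [true, true, 0.22, 2],
  [false, false, 0.33, 3],
  [false, true, 0.22, 2]
]

# Split off the header without mutating the input array,
# then pair every data row with the header names.
header, *data = tabular
rows = data.map { |fields| CSV::Row.new(header, fields) }
table = CSV::Table.new(rows)

first_row = table[0]   # rows can be read by header name
csv_text = table.to_csv # header line followed by one line per row
```

The diff's version calls `tabular.shift`, which is safe there because `to_tabular` has just built a fresh array; the destructuring used in this sketch is only a choice that leaves a caller-supplied array untouched.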
@@ -1,3 +1,3 @@
 module Goldmine
-  VERSION = "1.1.4"
+  VERSION = "1.2.0"
 end
@@ -30,6 +30,44 @@ class TestGoldmine < PryTest::Test
     assert data == expected
   end
 
+  test "simple pivot rollup" do
+    list = [1,2,3,4,5,6,7,8,9]
+    list = Goldmine::ArrayMiner.new(list)
+    data = list.pivot { |i| i < 5 }
+    rolled = data.rollup { |row| row.size }
+
+    expected = {
+      true => 4,
+      false => 5
+    }
+
+    assert rolled == expected
+  end
+
+  test "simple pivot to_tabular" do
+    list = [1,2,3,4,5,6,7,8,9]
+    list = Goldmine::ArrayMiner.new(list)
+    data = list.pivot { |i| i < 5 }
+
+    expected = [
+      ["column0", "percent", "count"],
+      [true, 0.44, 4],
+      [false, 0.56, 5]
+    ]
+
+    assert data.to_tabular == expected
+  end
+
+  test "simple pivot to_csv" do
+    list = [1,2,3,4,5,6,7,8,9]
+    list = Goldmine::ArrayMiner.new(list)
+    data = list.pivot { |i| i < 5 }
+    csv = data.to_csv
+
+    assert csv.headers == ["column0", "percent", "count"]
+    assert csv.to_a == [["column0", "percent", "count"], [true, 0.44, 4], [false, 0.56, 5]]
+  end
+
   test "named pivot" do
     list = [1,2,3,4,5,6,7,8,9]
     list = Goldmine::ArrayMiner.new(list)
@@ -43,56 +81,32 @@ class TestGoldmine < PryTest::Test
     assert data == expected
   end
 
-  test "pivoted_keys" do
+  test "named pivot rollup" do
     list = [1,2,3,4,5,6,7,8,9]
     list = Goldmine::ArrayMiner.new(list)
     data = list.pivot("less than 5") { |i| i < 5 }
-    expected = ["less than 5"]
-    assert data.pivoted_keys == expected
-  end
-
-  test "to_a tabular data" do
-    list = [
-      { :name => "Sally", :age => 21 },
-      { :name => "John", :age => 28 },
-      { :name => "Stephen", :age => 37 },
-      { :name => "Emily", :age => 32 },
-      { :name => "Joe", :age => 18 }
-    ]
-    list = Goldmine::ArrayMiner.new(list)
-    mined = list.pivot("Name has an 'e'") do |record|
-      !!record[:name].match(/e/i)
-    end
-    mined = mined.pivot(">= 21 years old") do |record|
-      record[:age] >= 21
-    end
+    rolled = data.rollup { |row| row.size }
 
-    expected = [["Name has an 'e'", ">= 21 years old", "Percent of Total", "Count"], [true, false, 0.2, 1], [false, true, 0.4, 2], [true, true, 0.4, 2]]
-
-    # block is sort_by
-    tabular_data = mined.to_a do |row|
-      [row[2], row[0] ? 1 : 0, row[1] ? 1 : 0]
-    end
+    expected = {
+      { "less than 5" => true } => 4,
+      { "less than 5" => false } => 5
+    }
 
-    assert tabular_data == expected
+    assert rolled == expected
   end
 
-  test "source_data" do
-    list = [
-      { :name => "Sally", :age => 21 },
-      { :name => "John", :age => 28 },
-      { :name => "Stephen", :age => 37 },
-      { :name => "Emily", :age => 32 },
-      { :name => "Joe", :age => 18 }
-    ]
+  test "named pivot to_tabular" do
+    list = [1,2,3,4,5,6,7,8,9]
     list = Goldmine::ArrayMiner.new(list)
-    mined = list.pivot("Name has an 'e'") do |record|
-      !!record[:name].match(/e/i)
-    end
-    mined = mined.pivot(">= 21 years old") do |record|
-      record[:age] >= 21
-    end
-    assert mined.source_data == list
+    data = list.pivot("less than 5") { |i| i < 5 }
+
+    expected = [
+      ["less than 5", "percent", "count"],
+      [true, 0.44, 4],
+      [false, 0.56, 5]
+    ]
+
+    assert data.to_tabular == expected
   end
 
   test "pivot of list values" do
@@ -164,6 +178,38 @@ class TestGoldmine < PryTest::Test
     assert data == expected
   end
 
+  test "chained pivots rollup" do
+    list = [1,2,3,4,5,6,7,8,9]
+    list = Goldmine::ArrayMiner.new(list)
+    data = list.pivot { |i| i < 5 }.pivot { |i| i % 2 == 0 }
+    rolled = data.rollup { |row| row.size }
+
+    expected = {
+      [true, false] => 2,
+      [true, true] => 2,
+      [false, false] => 3,
+      [false, true] => 2
+    }
+
+    assert rolled == expected
+  end
+
+  test "chained pivots to_tabular" do
+    list = [1,2,3,4,5,6,7,8,9]
+    list = Goldmine::ArrayMiner.new(list)
+    data = list.pivot { |i| i < 5 }.pivot { |i| i % 2 == 0 }
+
+    expected = [
+      ["column0", "column1", "percent", "count"],
+      [true, false, 0.22, 2],
+      [true, true, 0.22, 2],
+      [false, false, 0.33, 3],
+      [false, true, 0.22, 2]
+    ]
+
+    assert data.to_tabular == expected
+  end
+
   test "deep chained pivots" do
     list = [1,2,3,4,5,6,7,8,9]
     list = Goldmine::ArrayMiner.new(list)
@@ -207,7 +253,6 @@ class TestGoldmine < PryTest::Test
     }
 
     assert data == expected
-    assert data.source_data == list
   end
 
   test "named chained pivots" do
@@ -225,4 +270,53 @@ class TestGoldmine < PryTest::Test
     assert data == expected
   end
 
+  test "named chained pivots rollup" do
+    list = [1,2,3,4,5,6,7,8,9]
+    list = Goldmine::ArrayMiner.new(list)
+    data = list.pivot("less than 5") { |i| i < 5 }.pivot("divisible by 2") { |i| i % 2 == 0 }
+    rolled = data.rollup { |row| row.size }
+
+    expected = {
+      { "less than 5" => true, "divisible by 2" => false } => 2,
+      { "less than 5" => true, "divisible by 2" => true } => 2,
+      { "less than 5" => false, "divisible by 2" => false } => 3,
+      { "less than 5" => false, "divisible by 2" => true } => 2
+    }
+
+    assert rolled == expected
+  end
+
+  test "named chained pivots to tabular" do
+    list = [1,2,3,4,5,6,7,8,9]
+    list = Goldmine::ArrayMiner.new(list)
+    data = list.pivot("less than 5") { |i| i < 5 }.pivot("divisible by 2") { |i| i % 2 == 0 }
+
+    expected = [
+      ["less than 5", "divisible by 2", "percent", "count"],
+      [true, false, 0.22, 2],
+      [true, true, 0.22, 2],
+      [false, false, 0.33, 3],
+      [false, true, 0.22, 2]
+    ]
+
+    assert data.to_tabular == expected
+  end
+
+  test "named chained pivots to csv" do
+    list = [1,2,3,4,5,6,7,8,9]
+    list = Goldmine::ArrayMiner.new(list)
+    data = list.pivot("less than 5") { |i| i < 5 }.pivot("divisible by 2") { |i| i % 2 == 0 }
+    csv = data.to_csv
+
+    assert csv.to_a == data.to_tabular
+
+    expected = ["less than 5", "divisible by 2", "percent", "count"]
+    assert csv.headers == expected
+
+    row = csv.first
+    assert row["less than 5"] == true
+    assert row["divisible by 2"] == false
+    assert row["percent"] == 0.22
+    assert row ["count"] == 2
+  end
 end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: goldmine
 version: !ruby/object:Gem::Version
-  version: 1.1.4
+  version: 1.2.0
 platform: ruby
 authors:
 - Nathan Hopkins
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2015-05-28 00:00:00.000000000 Z
+date: 2015-06-01 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rake