calculate-all 0.1.1 → 0.2.0
- checksums.yaml +4 -4
- data/Gemfile +2 -2
- data/README.md +133 -41
- data/Rakefile +4 -4
- data/bin/console +3 -3
- data/lib/calculate-all.rb +41 -46
- data/lib/calculate-all/helpers.rb +32 -0
- data/lib/calculate-all/querying.rb +5 -0
- data/lib/calculate-all/version.rb +1 -1
- metadata +4 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: beb910feb72d190c1d3cb8b5795accadfcb8871c
|
4
|
+
data.tar.gz: e80af03a1220682d3c333642df00812793daa7ac
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 16cce395e20d2caf89a3db8735dd3c488a24e13e29edfa8bc7bc3e5ff85dae1122673a0d6acc4162e61f81eeb0261ae5229a9f8c9d394d5d3ac8903a935262d6
|
7
|
+
data.tar.gz: e528558f62c977b231f896ab8b14d501f02788e103405f2dab019e50eab340db20e69e159ff0b6e8d5c710d988faa562e331987dfdd90994bfea16422fa71d06
|
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -4,64 +4,156 @@ Provides `#calculate_all` method on your Active Record models, scopes and relati
|
|
4
4
|
It's a little addition to Active Record's `#count`, `#maximum`, `#minimum`, `#average` and `#sum`.
|
5
5
|
It lets you fetch all of the above, and the results of any other aggregate functions, in one request, with respect to grouping.
|
6
6
|
|
7
|
-
Tested only with Postgres and
|
7
|
+
Tested only with Postgres and MySQL right now. It relies on the automatic value type-casting of the underlying driver.
|
8
8
|
|
9
9
|
## Usage
|
10
10
|
|
11
11
|
```ruby
|
12
|
-
|
13
|
-
|
14
|
-
|
12
|
+
stats = Order.group(:department_id).group(:payment_method).calculate_all(
|
13
|
+
:count,
|
14
|
+
:count_distinct_user_id,
|
15
|
+
:price_max,
|
16
|
+
:price_min,
|
17
|
+
:price_avg,
|
18
|
+
price_median: 'percentile_cont(0.5) within group (order by price desc)'
|
19
|
+
)
|
20
|
+
#
|
21
|
+
# (2.2ms) SELECT department_id, payment_method, percentile_cont(0.5) within group (order by price desc),
|
22
|
+
# COUNT(*), COUNT(DISTINCT user_id), MAX(price), MIN(price), AVG(price) FROM "orders" GROUP BY "department_id", "payment_method"
|
23
|
+
#
|
24
|
+
# => {
|
25
|
+
# [1, "cash"] => {
|
26
|
+
# count: 10,
|
27
|
+
# count_distinct_user_id: 5,
|
28
|
+
# price_max: 500,
|
29
|
+
# price_min: 100,
|
30
|
+
# price_avg: #<BigDecimal:7ff5932ff3d8,'0.3E3',9(27)>,
|
31
|
+
# price_median: #<BigDecimal:7ff5932ff3c2,'0.4E3',9(27)>
|
32
|
+
# },
|
33
|
+
# [1, "card"] => {
|
34
|
+
# ...
|
35
|
+
# }
|
36
|
+
# }
|
15
37
|
```
|
16
38
|
|
39
|
+
## Rationale
|
40
|
+
|
41
|
+
Active Record makes it really easy to use the most common DB aggregate functions: COUNT(), MAX(), MIN(), AVG(), SUM().
|
42
|
+
But there's a whole world of wonderful other functions in
|
43
|
+
[Postgres](http://www.postgresql.org/docs/9.5/static/functions-aggregate.html) which I can't recommend enough
|
44
|
+
if you are going to do any work with statistics and BI on your data, though MySQL has something
|
45
|
+
[too](http://dev.mysql.com/doc/refman/5.7/en/group-by-functions.html).
|
46
|
+
|
47
|
+
Also, in many cases you'll need several metrics at once, and the database often has to perform a full scan on
|
48
|
+
the table for each metric, but it can just as well calculate them all in one scan and one request.
|
49
|
+
|
50
|
+
`#calculate_all` to the rescue!
|
51
|
+
|
52
|
+
## Arguments
|
53
|
+
|
54
|
+
`#calculate_all` accepts a list of expression aliases and/or an expression mapping.
|
55
|
+
It can be a single string of SQL,
|
56
|
+
|
57
|
+
```ruby
|
58
|
+
Model.calculate_all('SUM(price) / COUNT(DISTINCT user_id)')
|
59
|
+
```
|
60
|
+
|
61
|
+
a hash of expressions with arbitrary symbol keys
|
62
|
+
|
63
|
+
```ruby
|
64
|
+
Model.calculate_all(total: 'COUNT(*)', average_spendings: 'SUM(price) / COUNT(DISTINCT user_id)')
|
65
|
+
```
|
66
|
+
or a list of one or more symbols without expressions, in which case `#calculate_all` tries to guess
|
67
|
+
what you wanted from it.
|
68
|
+
|
69
|
+
```ruby
|
70
|
+
Model.calculate_all(:count, :average_price, :sum_price)
|
71
|
+
```
|
72
|
+
|
73
|
+
It's not so smart right now, but here's a cheatsheet:
|
74
|
+
|
75
|
+
| symbol | would fetch
|
76
|
+
|------------------------------------------------------------------------|------------
|
77
|
+
| `:count` | `COUNT(*)`
|
78
|
+
| `:count_column1`, `:column1_count` | `COUNT(column1)` (doesn't count NULLs in that column)
|
79
|
+
| `:count_distinct_column1`, `:column1_distinct_count` | `COUNT(DISTINCT column1)`
|
80
|
+
| `:max_column1`, `:column1_max`, `:maximum_column1`, `:column1_maximum` | `MAX(column1)`
|
81
|
+
| `:min_column1`, `:column1_min`, `:minimum_column1`, `:column1_minimum` | `MIN(column1)`
|
82
|
+
| `:avg_column1`, `:column1_avg`, `:average_column1`, `:column1_average` | `AVG(column1)`
|
83
|
+
| `:sum_column1`, `:column1_sum` | `SUM(column1)`
|
84
|
+
|
85
|
+
## Result
|
86
|
+
|
17
87
|
`#calculate_all` tries to mimic the magic of Active Record's `#group`, `#count` and `#pluck`
|
18
88
|
so the result type depends on the arguments and on the groupings.
|
19
89
|
|
20
|
-
### Container
|
21
|
-
|
22
90
|
If you have no `group()` on the underlying scope, `#calculate_all` will return just one result.
|
91
|
+
|
92
|
+
```ruby
|
93
|
+
# same as Order.distinct.count(:user_id), so probably a useless example,
|
94
|
+
# but you can have any expression with aggregate functions there.
|
95
|
+
Order.calculate_all('COUNT(DISTINCT user_id)')
|
96
|
+
# => 50
|
97
|
+
```
|
98
|
+
|
23
99
|
If you have one group, it will return a hash of results, with simple keys.
|
100
|
+
|
101
|
+
```ruby
|
102
|
+
# again, Order.group(:department_id).distinct.count(:user_id) would do the same
|
103
|
+
Order.group(:department_id).calculate_all(:count_distinct_user_id)
|
104
|
+
# => {
|
105
|
+
# 1 => 20,
|
106
|
+
# 2 => 10,
|
107
|
+
# ...
|
108
|
+
# }
|
109
|
+
```
|
110
|
+
|
24
111
|
If you have two or more groupings, each result will have an array as a key.
|
25
112
|
|
26
|
-
|
113
|
+
```ruby
|
114
|
+
Order.group(:department_id).group(:payment_method).calculate_all(:count_distinct_user_id)
|
115
|
+
# => {
|
116
|
+
# [1, "cash"] => 5,
|
117
|
+
# [1, "card"] => 15,
|
118
|
+
# [2, "cash"] => 1,
|
119
|
+
# ...
|
120
|
+
# }
|
121
|
+
```
|
27
122
|
|
28
123
|
If you provide just one argument to `#calculate_all`, its calculated value will be returned as is.
|
29
124
|
Otherwise results will be returned as hash(es) with symbol keys.
|
30
125
|
|
31
|
-
so, `
|
32
|
-
but `Model.group(:foo1, :foo2).calculate_all(expr1: 'count(expr1)', expr2: 'count(expr2)')` will return
|
33
|
-
something like this:
|
126
|
+
so, `Order.calculate_all(:count)` will return just a single integer, but
|
34
127
|
|
35
128
|
```ruby
|
36
|
-
|
37
|
-
|
38
|
-
|
39
|
-
|
40
|
-
|
129
|
+
Order.group(:department_id).group(:payment_method).calculate_all(:min_price, expr1: 'count(distinct user_id)')
|
130
|
+
# => {
|
131
|
+
# [1, 'cash'] => {min_price: 100, expr1: 5},
|
132
|
+
# [1, 'card'] => {min_price: 150, expr1: 15},
|
133
|
+
# ...
|
134
|
+
# }
|
41
135
|
```
|
42
136
|
|
43
|
-
### Conversion, formatting, value objects
|
44
|
-
|
45
137
|
You can pass a block to `calculate_all`. Rows will be passed to it and the returned value will be used instead of
|
46
138
|
the row in the result hash (or returned as is if there's no grouping).
|
47
139
|
|
48
140
|
```ruby
|
49
|
-
|
50
|
-
|
51
|
-
|
52
|
-
|
53
|
-
|
54
|
-
|
55
|
-
|
56
|
-
|
57
|
-
|
58
|
-
|
59
|
-
|
60
|
-
|
61
|
-
|
62
|
-
|
63
|
-
|
64
|
-
|
141
|
+
Order.group(:country_id).calculate_all(:count, :avg_price) { |count:, avg_price:|
|
142
|
+
"#{count} orders, #{avg_price.to_i} dollars average"
|
143
|
+
}
|
144
|
+
# => {
|
145
|
+
# 1 => "5 orders, 120 dollars average",
|
146
|
+
# 2 => "10 orders, 200 dollars average"
|
147
|
+
# }
|
148
|
+
|
149
|
+
Order.group(:country_id).calculate_all(:avg_price) { |avg_price| avg_price.to_i }
|
150
|
+
# => {
|
151
|
+
# 1 => 120,
|
152
|
+
# 2 => 200
|
153
|
+
# }
|
154
|
+
|
155
|
+
Order.calculate_all(:count, :max_price, &OpenStruct.method(:new))
|
156
|
+
# => #<OpenStruct max_price=500, count=15>
|
65
157
|
```
|
66
158
|
|
67
159
|
## groupdate compatibility
|
@@ -69,14 +161,14 @@ row in result hash (or returned as is if there's no grouping)
|
|
69
161
|
calculate-all should work with [groupdate](https://github.com/ankane/groupdate) too:
|
70
162
|
|
71
163
|
```ruby
|
72
|
-
|
73
|
-
|
74
|
-
|
75
|
-
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
164
|
+
Order.group_by_year(:created_at, last: 5, default_value: {}).calculate_all(:price_min, :price_max)
|
165
|
+
# => {
|
166
|
+
#   Sun, 01 Jan 2012 => {},
|
167
|
+
#   Tue, 01 Jan 2013 => {},
|
168
|
+
#   Wed, 01 Jan 2014 => {},
|
169
|
+
#   Thu, 01 Jan 2015 => {},
|
170
|
+
#   Fri, 01 Jan 2016 => {:price_min=>100, :price_max=>500}
|
171
|
+
# }
|
80
172
|
```
|
81
173
|
|
82
174
|
## Installation
|
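The README's Rationale argues that several metrics can be fetched in one statement instead of one query per metric. As a rough illustration, the sketch below assembles such a statement as a plain string. `build_aggregate_sql` is a hypothetical helper written for this note, not part of the gem, and it does no quoting or escaping; the real method delegates to Active Record's `pluck`.

```ruby
# Hypothetical helper (not the gem's API): composes the single SELECT that
# replaces one query per metric. Group columns come first, then the
# aggregate expressions, mirroring the column order shown in the README log.
def build_aggregate_sql(table, group_columns, expressions)
  select_list = (group_columns + expressions.values).join(', ')
  sql = "SELECT #{select_list} FROM #{table}"
  sql += " GROUP BY #{group_columns.join(', ')}" unless group_columns.empty?
  sql
end

sql = build_aggregate_sql(
  'orders',
  %w[department_id payment_method],
  { count: 'COUNT(*)', price_max: 'MAX(price)', price_avg: 'AVG(price)' }
)
puts sql
```

Each extra metric only adds one expression to the select list, so the database can compute all of them during the same pass over the grouped rows.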
data/Rakefile
CHANGED
@@ -1,9 +1,9 @@
|
|
1
|
-
require
|
2
|
-
require
|
1
|
+
require 'bundler/gem_tasks'
|
2
|
+
require 'rake/testtask'
|
3
3
|
|
4
4
|
Rake::TestTask.new(:test) do |t|
|
5
|
-
t.libs <<
|
6
|
-
t.libs <<
|
5
|
+
t.libs << 'test'
|
6
|
+
t.libs << 'lib'
|
7
7
|
t.test_files = FileList['test/**/*_test.rb']
|
8
8
|
end
|
9
9
|
|
data/bin/console
CHANGED
@@ -1,10 +1,10 @@
|
|
1
1
|
#!/usr/bin/env ruby
|
2
2
|
|
3
|
-
require
|
4
|
-
require
|
3
|
+
require 'bundler/setup'
|
4
|
+
require 'calculate-all'
|
5
5
|
|
6
6
|
require './test/test_helper'
|
7
7
|
require 'logger'
|
8
8
|
ActiveRecord::Base.logger = Logger.new(STDERR)
|
9
|
-
require
|
9
|
+
require 'pry'
|
10
10
|
Pry.start
|
data/lib/calculate-all.rb
CHANGED
@@ -1,89 +1,84 @@
|
|
1
|
-
require
|
2
|
-
require
|
1
|
+
require 'calculate-all/version'
|
2
|
+
require 'calculate-all/helpers'
|
3
|
+
require 'calculate-all/querying'
|
3
4
|
|
4
5
|
module CalculateAll
|
6
|
+
# Method to aggregate function results in one request
|
5
7
|
def calculate_all(*function_aliases, **functions, &block)
|
8
|
+
|
9
|
+
# If only one function_alias is given, the result can be just a single value
|
10
|
+
# So return [{ cash: 3 }] instead of [{ cash: { count: 3 }}]
|
6
11
|
if function_aliases.size == 1 && functions == {}
|
7
12
|
return_plain_values = true
|
8
13
|
end
|
14
|
+
|
15
|
+
# Convert the function_aliases to actual SQL
|
9
16
|
functions.merge!(
|
10
17
|
CalculateAll::Helpers.decode_function_aliases(function_aliases)
|
11
18
|
)
|
19
|
+
|
20
|
+
# Check if any functions are given
|
12
21
|
if functions == {}
|
13
|
-
raise ArgumentError,
|
14
|
-
end
|
15
|
-
# groupdate compatibility
|
16
|
-
group_values = self.group_values
|
17
|
-
if !group_values.is_a?(Array) && group_values.respond_to?(:relation)
|
18
|
-
group_values = group_values.relation
|
22
|
+
raise ArgumentError, 'provide at least one function to calculate'
|
19
23
|
end
|
24
|
+
|
25
|
+
# If function is called without a group, the pluck method will still return
|
26
|
+
# an array but it is an array with the final results instead of each group
|
27
|
+
# The plain_rows boolean states how the results should be used
|
20
28
|
if functions.size == 1 && group_values.size == 0
|
21
29
|
plain_rows = true
|
22
30
|
end
|
23
31
|
|
32
|
+
# Final output hash
|
24
33
|
results = {}
|
25
34
|
|
26
|
-
|
35
|
+
# Fetch all the requested calculations from the database
|
36
|
+
# Note the map(&:to_s). It is required since groupdate returns a
|
37
|
+
# Groupdate::OrderHack instead of a string for the group_values which is not
|
38
|
+
# accepted by ActiveRecord's pluck method.
|
39
|
+
pluck(*group_values.map(&:to_s), *functions.values).each do |row|
|
40
|
+
|
41
|
+
# If no grouping, make sure it is still a results array
|
27
42
|
row = [row] if plain_rows
|
43
|
+
|
44
|
+
# If only one value, return a single value, else return a hash
|
28
45
|
if return_plain_values
|
29
46
|
value = row.last
|
30
47
|
else
|
31
48
|
value = functions.keys.zip(row.last(functions.size)).to_h
|
32
49
|
end
|
33
50
|
|
51
|
+
# Call the block for each group
|
34
52
|
value = block.call(value) if block
|
35
53
|
|
36
54
|
# Return unwrapped hash directly for scope without any .group()
|
37
55
|
return value if group_values.empty?
|
38
56
|
|
57
|
+
# If only one group is provided, the resulting key is just the group's value
|
58
|
+
# if multiple group methods are provided, the key will be an array.
|
39
59
|
if group_values.size == 1
|
40
60
|
key = row.first
|
41
61
|
else
|
42
62
|
key = row.first(group_values.size)
|
43
63
|
end
|
64
|
+
|
65
|
+
# Set the value in the output hash
|
44
66
|
results[key] = value
|
45
67
|
end
|
46
68
|
|
69
|
+
# Return the output hash
|
47
70
|
results
|
48
71
|
end
|
49
|
-
|
50
|
-
module Helpers
|
51
|
-
module_function
|
52
|
-
def decode_function_aliases(aliases)
|
53
|
-
aliases.map do |key|
|
54
|
-
function =
|
55
|
-
case key
|
56
|
-
when String
|
57
|
-
key
|
58
|
-
when :count
|
59
|
-
"COUNT(*)"
|
60
|
-
when /^(.*)_distinct_count$/, /^count_distinct_(.*)$/
|
61
|
-
"COUNT(DISTINCT #{$1})"
|
62
|
-
when /^(.*)_(count|sum|max|min|avg)$/
|
63
|
-
"#{$2.upcase}(#{$1})"
|
64
|
-
when /^(count|sum|max|min|avg)_(.*)$$/
|
65
|
-
"#{$1.upcase}(#{$2})"
|
66
|
-
when /^(.*)_average$/, /^average_(.*)$/
|
67
|
-
"AVG(#{$1})"
|
68
|
-
when /^(.*)_maximum$/, /^maximum_(.*)$/
|
69
|
-
"MAX(#{$1})"
|
70
|
-
when /^(.*)_minimum$/, /^minimum_(.*)$/
|
71
|
-
"MIN(#{$1})"
|
72
|
-
else
|
73
|
-
raise ArgumentError, "Can't recognize function alias #{key}"
|
74
|
-
end
|
75
|
-
[key, function]
|
76
|
-
end.to_h
|
77
|
-
end
|
78
|
-
end
|
79
|
-
module Querying
|
80
|
-
delegate :calculate_all, to: :all
|
81
|
-
end
|
82
72
|
end
|
83
73
|
|
84
|
-
#
|
85
|
-
|
86
|
-
# including in module instead, for groupdate compatibility
|
87
|
-
ActiveRecord::Calculations.include CalculateAll
|
74
|
+
# Make the calculate_all method available for all ActiveRecord::Relation instances
|
75
|
+
ActiveRecord::Relation.include CalculateAll
|
88
76
|
|
77
|
+
# Make the calculate_all method available for all ActiveRecord::Base classes
|
78
|
+
# You can for example call Order.calculate_all(:count, :sum_cents)
|
89
79
|
ActiveRecord::Base.extend CalculateAll::Querying
|
80
|
+
|
81
|
+
# A hack for groupdate since it checks if the calculate_all method is defined
|
82
|
+
# on the ActiveRecord::Calculations module. It is never called but it is just
|
83
|
+
# needed for the check.
|
84
|
+
ActiveRecord::Calculations.include CalculateAll::Querying
|
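The row-shaping rules in `calculate_all` above (plain value versus hash, scalar key versus array key) can be tried without a database. The following is a simplified stand-in under stated assumptions: `shape_results` and its parameters are names invented for this note, and `rows` plays the role of what `pluck` would return.

```ruby
# Simplified stand-in for the loop in calculate_all; not the gem's code.
#   rows        - what pluck would return (bare values when one column is selected)
#   group_count - number of group() columns on the scope
#   keys        - result keys, in the same order as the selected expressions
#   plain_value - true when a single alias was given and no keyword expressions
def shape_results(rows, group_count, keys, plain_value: false)
  single_column = group_count.zero? && keys.size == 1
  results = {}
  rows.each do |row|
    row = [row] if single_column # normalise a bare value to a one-element row
    value = plain_value ? row.last : keys.zip(row.last(keys.size)).to_h
    return value if group_count.zero? # no grouping: one unwrapped result
    key = group_count == 1 ? row.first : row.first(group_count)
    results[key] = value
  end
  results
end

ungrouped  = shape_results([50], 0, [:count], plain_value: true)
one_group  = shape_results([[1, 20], [2, 10]], 1, [:count], plain_value: true)
two_groups = shape_results([[1, 'cash', 100, 5]], 2, %i[min_price expr1])
```

Here `ungrouped` is a bare integer, `one_group` is keyed by the single group value, and `two_groups` is keyed by an array of group values with a hash of metrics as the value.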
data/lib/calculate-all/helpers.rb
ADDED
@@ -0,0 +1,32 @@
|
|
1
|
+
module CalculateAll
|
2
|
+
module Helpers
|
3
|
+
module_function
|
4
|
+
# Method to convert function aliases like :count to SQL commands like 'COUNT(*)'
|
5
|
+
def decode_function_aliases(aliases)
|
6
|
+
aliases.map do |key|
|
7
|
+
function =
|
8
|
+
case key
|
9
|
+
when String
|
10
|
+
key
|
11
|
+
when :count
|
12
|
+
'COUNT(*)'
|
13
|
+
when /^(.*)_distinct_count$/, /^count_distinct_(.*)$/
|
14
|
+
"COUNT(DISTINCT #{$1})"
|
15
|
+
when /^(.*)_(count|sum|max|min|avg)$/
|
16
|
+
"#{$2.upcase}(#{$1})"
|
17
|
+
when /^(count|sum|max|min|avg)_(.*)$/
|
18
|
+
"#{$1.upcase}(#{$2})"
|
19
|
+
when /^(.*)_average$/, /^average_(.*)$/
|
20
|
+
"AVG(#{$1})"
|
21
|
+
when /^(.*)_maximum$/, /^maximum_(.*)$/
|
22
|
+
"MAX(#{$1})"
|
23
|
+
when /^(.*)_minimum$/, /^minimum_(.*)$/
|
24
|
+
"MIN(#{$1})"
|
25
|
+
else
|
26
|
+
raise ArgumentError, "Can't recognize function alias #{key}"
|
27
|
+
end
|
28
|
+
[key, function]
|
29
|
+
end.to_h
|
30
|
+
end
|
31
|
+
end
|
32
|
+
end
|
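To check the alias cheatsheet against concrete inputs, here is a small table-driven sketch of the same mapping. It is an illustrative re-implementation, not the gem's code: `decode_alias` and `SQL_FUNCTIONS` are names made up for this note, it only handles symbol aliases (raw SQL strings are not passed through as in the original), and it anchors with `\A`/`\z` rather than `^`/`$`.

```ruby
# Illustrative re-implementation of the alias-to-SQL mapping (assumed names).
SQL_FUNCTIONS = {
  'count' => 'COUNT', 'sum' => 'SUM',
  'max' => 'MAX', 'maximum' => 'MAX',
  'min' => 'MIN', 'minimum' => 'MIN',
  'avg' => 'AVG', 'average' => 'AVG'
}.freeze

def decode_alias(key)
  name = key.to_s
  names = SQL_FUNCTIONS.keys.join('|')
  case name
  when 'count'
    'COUNT(*)'
  when /\A(.+)_distinct_count\z/, /\Acount_distinct_(.+)\z/
    "COUNT(DISTINCT #{Regexp.last_match(1)})"
  when /\A(#{names})_(.+)\z/ # function name as prefix, e.g. :max_price
    "#{SQL_FUNCTIONS[Regexp.last_match(1)]}(#{Regexp.last_match(2)})"
  when /\A(.+)_(#{names})\z/ # function name as suffix, e.g. :price_max
    "#{SQL_FUNCTIONS[Regexp.last_match(2)]}(#{Regexp.last_match(1)})"
  else
    raise ArgumentError, "Can't recognize function alias #{key}"
  end
end

%i[count count_distinct_user_id price_max maximum_price average_price].each do |a|
  puts "#{a.inspect} => #{decode_alias(a)}"
end
```

The distinct-count patterns are tested before the generic ones, as in the original, so that `:count_distinct_user_id` is not read as `COUNT(distinct_user_id)`.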
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: calculate-all
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- codesnik
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2017-01-24 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: activerecord
|
@@ -126,6 +126,8 @@ files:
|
|
126
126
|
- bin/setup
|
127
127
|
- calculate-all.gemspec
|
128
128
|
- lib/calculate-all.rb
|
129
|
+
- lib/calculate-all/helpers.rb
|
130
|
+
- lib/calculate-all/querying.rb
|
129
131
|
- lib/calculate-all/version.rb
|
130
132
|
homepage: http://github.com/codesnik/calculate-all
|
131
133
|
licenses:
|