activerecord-summarize 0.3.1 → 0.5.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -0
- data/Gemfile.lock +2 -2
- data/README.md +15 -11
- data/bin/console +3 -2
- data/docs/use_case_moderator_dashboard.md +4 -4
- data/lib/activerecord/summarize/version.rb +1 -1
- data/lib/activerecord/summarize.rb +43 -22
- data/lib/chainable_result.rb +15 -4
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: d3349ef226e79ac7b8182798fbe3f0966a94bd29a5ec31202fe14c74e681bc68
|
4
|
+
data.tar.gz: f257baecb4562c791d0d7648f45ec2549e1ae6faff2cfcd386d95a5e68fb233a
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: fdf0dece89a7d1db578414a1682adb956e8a973c7d575b7d589d0645ef906bdba2f9d849a1833fd1cc10a7b6f1e105c2c36b54532624005d67d8501e1e03aa13
|
7
|
+
data.tar.gz: a21a8e232e86706283954d4690b99c824c80b3521337c2dba10dc3f415f295844670f6d0c4ac94d7a0a8fa58c5fe408a07a64b3d31b65f20086a1d6920409382
|
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,21 @@
|
|
1
|
+
## [0.5.0] - 2023-05-14
|
2
|
+
|
3
|
+
- **FEATURE:** Your `summarize` blocks won't need to accept the proc second argument as often, because `ChainableResult` methods will also resolve their arguments. E.g., `query.summarize {|q| @mult = q.sum(:a) * q.sum(:b) }` now works, where previously you would have needed to write `query.summarize {|q,with| @mult = with[q.sum(:a),q.sum(:b)] {|a,b| a * b } }`.
|
4
|
+
|
5
|
+
- **IMPROVEMENT:** The conventional name of the proc provided as an optional second argument to `summarize` blocks is now `with_resolved` instead of `with`. Interactively teaching `activerecord-summarize` to some people showed that this was an improvement in clarity. The local name of the proc has always been under your control (it's your block!), so this doesn't affect anything besides documentation and tests, but if for some reason you accessed the proc at its internal name of `ChainableResult::WITH`, that will still work, too, even though we now refer to it as `ChainableResult::WITH_RESOLVED`.
|
6
|
+
|
7
|
+
## [0.4.0] - 2023-02-27
|
8
|
+
|
9
|
+
- **FEATURE:** Support for top-level .group(:belongs_to_association), returning hash with models as keys.
|
10
|
+
|
11
|
+
I didn't realize this until a few months ago, but in ActiveRecord, if `Foo belongs_to :bar`, you can do `Foo.group(:bar).count` and get back a hash with `Bar` records as keys and counts as values. (ActiveRecord executes two total queries to implement this, one for the counts grouped by the `bar_id` foreign key, then another to retrieve the `Bar` models.)
|
12
|
+
|
13
|
+
Now the same behavior works with `summarize`: you can still retrieve any number of counts and/or sums about `Foo`—including some with additional filters and even sub-grouping—in a single query, and then we'll execute one additional query to retrieve the records for the `Bar` model keys.
|
14
|
+
|
15
|
+
- **IMPROVEMENT:** `bin/console` is now much more useful for developing `activerecord-summarize`
|
16
|
+
|
17
|
+
- **IMPROVEMENT:** Added some tests for queries joining HABTM associations and (of course, supporting the new feature) `belongs_to` associations. `summarize` preceded by joins is already stable and documented, but it didn't have tests before.
|
18
|
+
|
1
19
|
## [0.3.1] - 2022-06-23
|
2
20
|
|
3
21
|
- **BUGFIX:** `with` didn't work correctly with a single argument. Embarrassingly, both the time-traveling version of `with` and the trivial/fake one provided when `noop: true` is set had single argument bugs, and they were different bugs.
|
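The 0.5.0 entry above says `ChainableResult` methods now resolve their arguments. The following is a minimal sketch of that idea in plain Ruby, not the gem's implementation: `ToyFuture` and its `resolve` helper are hypothetical names invented for illustration, and plain numbers stand in for `q.sum(:a)` and `q.sum(:b)`.

```ruby
# Toy stand-in for a deferred calculation result; NOT the gem's ChainableResult.
class ToyFuture
  def initialize(&compute)
    @compute = compute
  end

  # Run the deferred computation and return the plain value.
  def value
    @compute.call
  end

  # Unwrap a ToyFuture argument; pass anything else through unchanged.
  def self.resolve(item)
    item.is_a?(ToyFuture) ? item.value : item
  end

  # Each operator returns a new ToyFuture and resolves its argument only when
  # the value is finally requested, so the argument may itself be deferred.
  %i[+ - * /].each do |op|
    define_method(op) do |other|
      ToyFuture.new { value.public_send(op, ToyFuture.resolve(other)) }
    end
  end
end

sum_a = ToyFuture.new { 6 } # imagine: q.sum(:a)
sum_b = ToyFuture.new { 7 } # imagine: q.sum(:b)

product = sum_a * sum_b     # still deferred at this point
puts product.value          # 42

# A plain Integer on the left still fails in this toy, because Integer#*
# knows nothing about ToyFuture and the class defines no `coerce`.
begin
  3 * sum_a
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```

In this sketch the receiver has to be the deferred object, which mirrors the README's note that `@sales * 3` works while `3 * @sales` does not.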
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -59,7 +59,7 @@ Purchase.complete.left_joins(:region).summarize do |purchases|
|
|
59
59
|
end
|
60
60
|
```
|
61
61
|
|
62
|
-
Until the `summarize` block ends, the return value of your calculations are `ChainableResult::Future` instances, a bit like a Promise with a more convenient API. You can call any method you like on a `ChainableResult`, and you'll get back another `ChainableResult`, and they'll all turn out alright in the end—provided you called methods that would have worked if you had run that calculation without `summarize`. OTOH, using a `ChainableResult` as an argument to
|
62
|
+
Until the `summarize` block ends, the return value of your calculations are `ChainableResult::Future` instances, a bit like a Promise with a more convenient API. You can call any method you like on a `ChainableResult`, and you'll get back another `ChainableResult`, and they'll all turn out alright in the end—provided you called methods that would have worked if you had run that calculation without `summarize`. OTOH, using a `ChainableResult` as an argument to a method of a non-`ChainableResult` generally will not work.
|
63
63
|
|
64
64
|
```ruby
|
65
65
|
Purchase.last_quarter.complete.summarize do |purchases|
|
@@ -68,24 +68,28 @@ Purchase.last_quarter.complete.summarize do |purchases|
|
|
68
68
|
@vc_projection = @sales * 3
|
69
69
|
# And this won't:
|
70
70
|
@vc_projection = 3 * @sales
|
71
|
+
# But this will work since v0.5.0...
|
72
|
+
@units_sold = purchases.sum(:units)
|
73
|
+
# ...because methods of `ChainableResult` now resolve their argument(s)
|
74
|
+
@avg_unit_price = @sales / @units_sold
|
71
75
|
end
|
72
76
|
```
|
73
77
|
|
74
|
-
If, within a `summarize` block, you want to combine data from more than one `ChainableResult`, you
|
78
|
+
If, within a `summarize` block, you want to combine data from more than one `ChainableResult`, you may need to use the otherwise-optional second argument yielded to the block, a `proc` I like to name `with_resolved`. Pass it all the results you want to combine and a block that combines them and returns the new result:
|
75
79
|
|
76
80
|
```ruby
|
77
|
-
Purchase.complete.left_joins(:promotion).summarize do |purchases,
|
81
|
+
Purchase.complete.left_joins(:promotion).summarize do |purchases, with_resolved|
|
78
82
|
@all_revenue = purchases.sum(:amount)
|
79
83
|
promotions = purchases.where.not(promotions: {id: nil})
|
80
84
|
@promotion_sales = promotions.count
|
81
85
|
@promotion_discounts = promotions.sum("promotions.discount_amount")
|
82
|
-
@avg_discount =
|
86
|
+
@avg_discount = with_resolved[@promotion_sales, @promotion_discounts] do |sales, discounts|
|
83
87
|
sales.zero? ? 0 : discounts / sales
|
84
88
|
end
|
85
89
|
end
|
86
90
|
```
|
87
91
|
|
88
|
-
Treat a `
|
92
|
+
Treat a `with_resolved` block as a pure function: i.e., return the value you care about, and don't set or change any other state within the block. Behavior in any other case is undefined.
|
89
93
|
|
90
94
|
## Escape hatch
|
91
95
|
|
@@ -93,7 +97,7 @@ The query generated by `summarize` is often much faster than equivalent queries
|
|
93
97
|
|
94
98
|
By design, every operation performed with `summarize` is correct and corresponds to normal `ActiveRecord` behavior, and any operations that can't be done correctly this way or aren't yet will raise exceptions. But only imperfect humans have worked on this gem, so you might also wonder if `summarize` is producing correct results.
|
95
99
|
|
96
|
-
Fortunately, you can easily check both with `summarize(noop: true)`, which causes `summarize` to yield the original relation it was called on and a trivial `
|
100
|
+
Fortunately, you can easily check both with `summarize(noop: true)`, which causes `summarize` to yield the original relation it was called on and a trivial `with_resolved` proc. The block will be executed as though `summarize` were not involved, with each calculation executing separately and immediately returning numbers or hashes.
|
97
101
|
|
98
102
|
If you do find any case where you get different results with `summarize(noop: true)`, I'd be grateful if you filed an issue.
|
99
103
|
|
@@ -117,10 +121,10 @@ When the parent relation already has `.group` applied, `pure: true` is implied a
|
|
117
121
|
Build even more complex queries by using `summarize` on a relation that already has `.group` applied. Results are grouped just like a standard `.group(*expressions).count`, but instead of single numbers, the values are whatever set of calculations you return from the block, including further `.group(*more).calculate(:sum|:count,*args)` calculations, in whatever `Array` or `Hash` shape you arrange them. For example:
|
118
122
|
|
119
123
|
```ruby
|
120
|
-
puts Purchase.last_year.complete.group(:region_id).summarize do |purchases,
|
124
|
+
puts Purchase.last_year.complete.group(:region_id).summarize do |purchases,with_resolved|
|
121
125
|
total = purchases.count
|
122
126
|
by_quarter = purchases.group(CREATED_TO_YEAR_SQL, CREATED_TO_QUARTER_SQL).count.sort.to_h
|
123
|
-
target =
|
127
|
+
target = with_resolved[total / 4, by_quarter.values.max] {|avg_q, best_q| [avg_q * 1.25, best_q].max.round }
|
124
128
|
{last_year: total, quarters: by_quarter, unit_target: target}
|
125
129
|
end
|
126
130
|
# Output:
|
@@ -150,13 +154,13 @@ When the relation already has `group` applied, for correct results, `summarize`
|
|
150
154
|
|
151
155
|
```ruby
|
152
156
|
# A trivial example:
|
153
|
-
Purchase.complete.group(:
|
157
|
+
Purchase.complete.group(:region).summarize {|purchases| purchases.sum(:amount) }
|
154
158
|
|
155
159
|
# ...is exactly equivalent to:
|
156
|
-
Purchase.complete.group(:
|
160
|
+
Purchase.complete.group(:region).sum(:amount)
|
157
161
|
|
158
162
|
# But if there were three regions, what should the value of @target be in this case?
|
159
|
-
region_targets = Purchase.last_quarter.complete.group(:
|
163
|
+
region_targets = Purchase.last_quarter.complete.group(:region).summarize do |purchases|
|
160
164
|
@target = purchases.sum(:amount) * 1.25
|
161
165
|
end
|
162
166
|
```
|
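The README text above describes `with_resolved` as a proc that takes results plus a combining block, and says `noop: true` yields a trivial version of it. A rough sketch of such a trivial proc follows, modeled on the `(results.size == 1 ? results.first : results).then(&block)` line visible in the `chainable_result.rb` diff. The local name `sync_with_resolved` and the sample numbers are assumptions for illustration only.

```ruby
# Hypothetical local stand-in for the trivial proc yielded under `noop: true`.
# With plain values there is nothing to wait for, so it hands them to the block.
sync_with_resolved = lambda do |*results, &block|
  (results.size == 1 ? results.first : results).then(&block)
end

promotion_sales = 4        # imagine: promotions.count
promotion_discounts = 50.0 # imagine: promotions.sum("promotions.discount_amount")

avg_discount = sync_with_resolved.call(promotion_sales, promotion_discounts) do |sales, discounts|
  # Pure function: compute and return a value, touch no outside state.
  sales.zero? ? 0 : discounts / sales
end
puts avg_discount # 12.5

# A single argument is passed through as-is rather than wrapped in an array.
doubled = sync_with_resolved.call(21) { |n| n * 2 }
puts doubled # 42
```

Keeping the block pure matters more in the real, deferred case, where the README states that behavior is otherwise undefined.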
data/bin/console
CHANGED
@@ -2,10 +2,11 @@
|
|
2
2
|
# frozen_string_literal: true
|
3
3
|
|
4
4
|
require "bundler/setup"
|
5
|
+
require "active_record"
|
5
6
|
require "activerecord/summarize"
|
7
|
+
require_relative "../test/test_data" # Test fixtures so there's something to play with
|
6
8
|
|
7
|
-
# You can
|
8
|
-
# with your gem easier. You can also use a different console, if you like.
|
9
|
+
# You can use a different console, if you like.
|
9
10
|
|
10
11
|
# (If you use this, don't forget to add pry to your Gemfile!)
|
11
12
|
# require "pry"
|
data/docs/use_case_moderator_dashboard.md
CHANGED
@@ -69,19 +69,19 @@ def dashboard
|
|
69
69
|
# If you forget, `daily_posts.popular.count` will raise `Unsummarizable` with a helpful message.
|
70
70
|
all_posts = Post.where(subreddit: @subreddits.select(:id)).where(created_at: 30.days.ago..)
|
71
71
|
.left_joins(:popularity_threshold_setting).order(:created_at)
|
72
|
-
@subreddit_stats = all_posts.group(:subreddit_id).summarize do |posts,
|
72
|
+
@subreddit_stats = all_posts.group(:subreddit_id).summarize do |posts, with_resolved|
|
73
73
|
daily_posts = posts.group("posts.created_at::date")
|
74
74
|
dow_not_buried = posts.where(karma: 0..).group("EXTRACT(DOW FROM posts.created_at)")
|
75
75
|
{
|
76
76
|
posts_created: posts.count,
|
77
77
|
buried_posts: posts.where(karma: ...0).count,
|
78
|
-
daily_popular_rate:
|
78
|
+
daily_popular_rate: with_resolved[
|
79
79
|
daily_posts.popular.count,
|
80
80
|
daily_posts.count
|
81
81
|
] do |popular, total|
|
82
82
|
total.map { |date, count| [date, (popular[date]||0).to_f / count] }.to_h
|
83
83
|
end,
|
84
|
-
dow_avg_comments:
|
84
|
+
dow_avg_comments: with_resolved[
|
85
85
|
dow_not_buried.sum(:comments_count),
|
86
86
|
dow_not_buried.count
|
87
87
|
] do |comments, posts|
|
@@ -94,4 +94,4 @@ end
|
|
94
94
|
|
95
95
|
Since `summarize` runs a single query that visits each relevant `posts` row just once, adding additional calculations is pretty close to free.
|
96
96
|
|
97
|
-
Even with the mental overhead of needing to join outside the block and use `
|
97
|
+
Even with the mental overhead of needing to join outside the block and use `with_resolved` to combine calculations (see [README](../README.md) for details), I think this is still easy to read, write, and reason about, and it beats the heck out of walls of SQL. What do you think?
|
data/lib/activerecord/summarize.rb
CHANGED
@@ -7,7 +7,7 @@ module ActiveRecord::Summarize
|
|
7
7
|
class Unsummarizable < StandardError; end
|
8
8
|
|
9
9
|
class Summarize
|
10
|
-
attr_reader :current_result_row, :pure, :noop, :from_where
|
10
|
+
attr_reader :current_result_row, :base_groups, :base_association, :pure, :noop, :from_where
|
11
11
|
alias_method :pure?, :pure
|
12
12
|
alias_method :noop?, :noop
|
13
13
|
|
@@ -29,7 +29,18 @@ module ActiveRecord::Summarize
|
|
29
29
|
def initialize(relation, pure: nil, noop: false)
|
30
30
|
@relation = relation
|
31
31
|
@noop = noop
|
32
|
-
|
32
|
+
@base_groups, @base_association = relation.group_values.dup.then do |group_fields|
|
33
|
+
# Based upon a bit from ActiveRecord::Calculations.execute_grouped_calculation,
|
34
|
+
# if the base relation is grouped only by a belongs_to association, group by
|
35
|
+
# the association's foreign key.
|
36
|
+
if group_fields.size == 1 && group_fields.first.respond_to?(:to_sym)
|
37
|
+
association = relation.klass._reflect_on_association(group_fields.first)
|
38
|
+
# Like ActiveRecord's group(:association).count behavior, this only works with belongs_to associations
|
39
|
+
next [Array(association.foreign_key), association] if association&.belongs_to?
|
40
|
+
end
|
41
|
+
[group_fields, nil]
|
42
|
+
end
|
43
|
+
has_base_groups = base_groups.any?
|
33
44
|
raise Unsummarizable, "`summarize` must be pure when called on a grouped relation" if pure == false && has_base_groups
|
34
45
|
raise ArgumentError, "`summarize(noop: true)` is impossible on a grouped relation" if noop && has_base_groups
|
35
46
|
@pure = has_base_groups || !!pure
|
@@ -37,8 +48,8 @@ module ActiveRecord::Summarize
|
|
37
48
|
end
|
38
49
|
|
39
50
|
def process(&block)
|
40
|
-
# For noop, just yield the original relation and a transparent `
|
41
|
-
return yield(@relation, ChainableResult::
|
51
|
+
# For noop, just yield the original relation and a transparent `with_resolved` proc.
|
52
|
+
return yield(@relation, ChainableResult::SYNC_WITH_RESOLVED) if noop?
|
42
53
|
# Within the block, the relation and its future clones intercept calls to
|
43
54
|
# `count` and `sum`, registering them and returning a ChainableResult via
|
44
55
|
# summarize.add_calculation.
|
@@ -49,28 +60,42 @@ module ActiveRecord::Summarize
|
|
49
60
|
include InstanceMethods
|
50
61
|
end
|
51
62
|
end,
|
52
|
-
ChainableResult::
|
63
|
+
ChainableResult::WITH_RESOLVED
|
53
64
|
))
|
54
65
|
ChainableResult.with_cache(!pure?) do
|
55
66
|
# `resolve` builds the single query that answers all collected calculations,
|
56
|
-
# executes it, and aggregates the results by the values of
|
57
|
-
#
|
58
|
-
#
|
67
|
+
# executes it, and aggregates the results by the values of `base_groups`.
|
68
|
+
# In the common case of no `base_groups`, the resolve returns:
|
69
|
+
# `{[]=>[*final_value_for_each_calculation]}`
|
59
70
|
result = resolve.transform_values! do |row|
|
60
71
|
# Each row (in the common case, only one) is used to resolve any
|
61
72
|
# ChainableResults returned by the block. These may be a one-to-one mapping,
|
62
|
-
# or the block return may have combined some results via `with
|
73
|
+
# or the block return may have combined some results via `with`, chained
|
63
74
|
# additional methods on results, etc..
|
64
75
|
@current_result_row = row
|
65
76
|
future_block_result.value
|
66
77
|
end.then do |result|
|
67
|
-
#
|
68
|
-
|
69
|
-
|
70
|
-
|
71
|
-
|
72
|
-
|
73
|
-
|
78
|
+
# Now unpack/fix-up the result keys to match shape of Relation.count or Relation.group(*cols).count return values
|
79
|
+
if base_groups.empty?
|
80
|
+
# Change ungrouped result from `{[]=>v}` to `v`, like Relation.count
|
81
|
+
result.values.first
|
82
|
+
elsif base_association
|
83
|
+
# Change grouped-by-one-belongs_to-association result from `{[id1]=>v1,[id2]=>v2,...}` to
|
84
|
+
# `{<AssociatedModel id:id1>=>v1,<AssociatedModel id:id2>=>v2,...}` like Relation.group(:association).count
|
85
|
+
|
86
|
+
# Loosely based on a bit from ActiveRecord::Calculations.execute_grouped_calculation,
|
87
|
+
# retrieve the records for the group association and replace the keys of our final result.
|
88
|
+
key_class = base_association.klass.base_class
|
89
|
+
key_records = key_class
|
90
|
+
.where(key_class.primary_key => result.keys.flatten)
|
91
|
+
.index_by(&:id)
|
92
|
+
result.transform_keys! { |k| key_records[k[0]] }
|
93
|
+
elsif base_groups.size == 1
|
94
|
+
# Change grouped-by-one-column result from `{[k1]=>v1,[k2]=>v2,...}` to `{k1=>v1,k2=>v2,...}`, like Relation.group(:column).count
|
95
|
+
result.transform_keys! { |k| k[0] }
|
96
|
+
else
|
97
|
+
# Multiple-column base grouping (though perhaps relatively rare) requires no change.
|
98
|
+
result
|
74
99
|
end
|
75
100
|
end
|
76
101
|
if !pure?
|
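The key fix-up branches in the hunk above can be tried out with plain hashes. This is only a sketch under stated assumptions: `unpack_result`, its keyword arguments, and the `Region` struct are hypothetical, and `records_by_id` stands in for the extra query that loads the associated records.

```ruby
# Hypothetical helper mirroring the branch structure above with plain hashes.
Region = Struct.new(:id, :name)

def unpack_result(result, group_count:, records_by_id: nil)
  if group_count.zero?
    result.values.first                               # {[] => v} becomes v
  elsif records_by_id
    result.transform_keys { |k| records_by_id[k[0]] } # {[id] => v} becomes {record => v}
  elsif group_count == 1
    result.transform_keys { |k| k[0] }                # {[k] => v} becomes {k => v}
  else
    result                                            # multi-column keys stay arrays
  end
end

east = Region.new(1, "East")
west = Region.new(2, "West")

p unpack_result({[] => [10, 2.5]}, group_count: 0)
p unpack_result({[1] => 10, [2] => 20}, group_count: 1)
p unpack_result({[1] => 10, [2] => 20}, group_count: 1,
                records_by_id: {1 => east, 2 => west})
p unpack_result({[2022, 1] => 5, [2022, 2] => 7}, group_count: 2)
```

Unlike the diff, which mutates with `transform_keys!`, the sketch returns new hashes so each call can be inspected on its own.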
@@ -166,7 +191,7 @@ module ActiveRecord::Summarize
|
|
166
191
|
base_group_columns = (0...base_groups.size)
|
167
192
|
data
|
168
193
|
.group_by { |row| row[base_group_columns] }
|
169
|
-
.tap { |h| h[[]] = [] if h.empty? && base_groups.
|
194
|
+
.tap { |h| h[[]] = [] if h.empty? && base_groups.empty? }
|
170
195
|
.transform_values! do |rows|
|
171
196
|
values = starting_values.map(&:dup) # map(&:dup) since some are hashes and we don't want to mutate starting_values
|
172
197
|
rows.each do |row|
|
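The grouping step in the hunk above, including the `h[[]] = []` guard for an empty ungrouped result, can be illustrated with arrays of plain rows. This is a simplified sketch, not the gem's aggregation: `sum_by_base_groups` and the row layout (base group columns first, one summed value last) are assumptions.

```ruby
# Hypothetical rows look like [*base_group_values, amount].
def sum_by_base_groups(rows, base_group_count)
  base_group_columns = (0...base_group_count)
  grouped = rows.group_by { |row| row[base_group_columns] }
  # With no base groups and no rows, still produce one empty group so the
  # ungrouped case yields a starting value instead of nothing at all.
  grouped[[]] = [] if grouped.empty? && base_group_count.zero?
  grouped.transform_values { |group| group.sum { |row| row.last } }
end

p sum_by_base_groups([[1, 100], [1, 50], [2, 30]], 1) # grouped by first column
p sum_by_base_groups([[100], [50]], 0)                # ungrouped: single [] key
p sum_by_base_groups([], 0)                           # ungrouped and empty
p sum_by_base_groups([], 1)                           # grouped and empty
```

The last two calls show why the guard checks both conditions: an empty ungrouped query should still report a zero, while an empty grouped query has no keys to report.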
@@ -201,14 +226,10 @@ module ActiveRecord::Summarize
|
|
201
226
|
end
|
202
227
|
end
|
203
228
|
|
204
|
-
def base_groups
|
205
|
-
@relation.group_values.dup
|
206
|
-
end
|
207
|
-
|
208
229
|
def all_groups
|
209
230
|
# keep all base groups, even if they did something silly like group by
|
210
231
|
# the same key twice, but otherwise don't repeat any groups
|
211
|
-
groups = base_groups
|
232
|
+
groups = base_groups.dup
|
212
233
|
groups_set = Set.new(groups)
|
213
234
|
@calculations.map { |f| f.relation.group_values }.flatten.each do |k|
|
214
235
|
next if groups_set.include? k
|
data/lib/chainable_result.rb
CHANGED
@@ -12,9 +12,19 @@ class ChainableResult
|
|
12
12
|
if use_cache?
|
13
13
|
return @value if @cached
|
14
14
|
@cached = true
|
15
|
-
@value = resolve_source.send(
|
15
|
+
@value = resolve_source.send(
|
16
|
+
@method,
|
17
|
+
*@args.map(&RESOLVE_ITEM),
|
18
|
+
**@opts.transform_values(&RESOLVE_ITEM),
|
19
|
+
&@block
|
20
|
+
)
|
16
21
|
else
|
17
|
-
resolve_source.send(
|
22
|
+
resolve_source.send(
|
23
|
+
@method,
|
24
|
+
*@args.map(&RESOLVE_ITEM),
|
25
|
+
**@opts.transform_values(&RESOLVE_ITEM),
|
26
|
+
&@block
|
27
|
+
)
|
18
28
|
end
|
19
29
|
end
|
20
30
|
|
@@ -86,8 +96,9 @@ class ChainableResult
|
|
86
96
|
(results.size == 1 ? results.first : results).then(&block)
|
87
97
|
end
|
88
98
|
|
89
|
-
|
90
|
-
|
99
|
+
# Shorter names are deprecated
|
100
|
+
WITH_RESOLVED = WITH = method(:with)
|
101
|
+
SYNC_WITH_RESOLVED = SYNC_WITH = method(:sync_with)
|
91
102
|
|
92
103
|
def self.resolve_item(item)
|
93
104
|
case item
|
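The `WITH_RESOLVED = WITH = method(:with)` line above keeps the old constant name working alongside the new one. A small sketch of that aliasing pattern follows, using a hypothetical `ToyResult` module rather than the gem's `ChainableResult`.

```ruby
# Toy module, not the gem's ChainableResult: it only shows how chained
# constant assignment keeps an old name and a new name on one object.
module ToyResult
  def self.with(*results, &block)
    (results.size == 1 ? results.first : results).then(&block)
  end

  # New name first, old name kept as an alias for existing callers.
  WITH_RESOLVED = WITH = method(:with)
end

puts ToyResult::WITH.equal?(ToyResult::WITH_RESOLVED)     # true
puts ToyResult::WITH_RESOLVED.call(2, 3) { |a, b| a * b } # 6
puts ToyResult::WITH.call(2, 3) { |a, b| a * b }          # 6
```

Because both constants hold the same `Method` object, code written against either name behaves identically.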
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: activerecord-summarize
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.5.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Joshua Paine
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2023-05-20 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: activerecord
|
@@ -84,7 +84,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
84
84
|
- !ruby/object:Gem::Version
|
85
85
|
version: '0'
|
86
86
|
requirements: []
|
87
|
-
rubygems_version: 3.3.
|
87
|
+
rubygems_version: 3.3.7
|
88
88
|
signing_key:
|
89
89
|
specification_version: 4
|
90
90
|
summary: Run many .count and/or .sum queries in a single efficient query with minimal
|