activerecord-summarize 0.3.1 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: d491ee7730156f77105ec7df6bac79b6410fa97b3387816f103dee233b43df8d
4
- data.tar.gz: f33be41270ab955fcf2bf9b227cc7c212ae3aa385786da3a5b424a3e1e707e47
3
+ metadata.gz: d3349ef226e79ac7b8182798fbe3f0966a94bd29a5ec31202fe14c74e681bc68
4
+ data.tar.gz: f257baecb4562c791d0d7648f45ec2549e1ae6faff2cfcd386d95a5e68fb233a
5
5
  SHA512:
6
- metadata.gz: 0d1f5308da4fc8b781e8dd5a69a10c671106acdb9b565c056d23b33114fc96f8d4ff9a60f3887b861f9142b2564c3f33f17d9e8ea52c212c27abbdacc861e2d6
7
- data.tar.gz: 318bad930ac53001068e6e3396d921f71ff62c8a0899abe7a96a55d9dab59aa7e6e45c2cdc6e7a5f5ae60b67f18a47be79cb73ee29a85e1fac25a988c43dc47a
6
+ metadata.gz: fdf0dece89a7d1db578414a1682adb956e8a973c7d575b7d589d0645ef906bdba2f9d849a1833fd1cc10a7b6f1e105c2c36b54532624005d67d8501e1e03aa13
7
+ data.tar.gz: a21a8e232e86706283954d4690b99c824c80b3521337c2dba10dc3f415f295844670f6d0c4ac94d7a0a8fa58c5fe408a07a64b3d31b65f20086a1d6920409382
data/CHANGELOG.md CHANGED
@@ -1,3 +1,21 @@
1
+ ## [0.5.0] - 2023-05-14
2
+
3
+ - **FEATURE:** Your `summarize` blocks won't need to accept the proc second argument as often, because `ChainableResult` methods will also resolve their arguments. E.g., `query.summarize {|q| @mult = q.sum(:a) * q.sum(:b) }` now works, where previously you would have needed to write `query.summarize {|q,with| @mult = with[q.sum(:a),q.sum(:b)] {|a,b| a * b } }`.
4
+
5
+ - **IMPROVEMENT:** The conventional name of the proc provided as an optional second argument to `summarize` blocks is now `with_resolved` instead of `with`. Interactively teaching `activerecord-summarize` to some people showed that this was an improvement in clarity. The local name of the proc has always been under your control (it's your block!), so this doesn't affect anything besides documentation and tests, but if for some reason you accessed the proc at its internal name of `ChainableResult::WITH`, that will still work, too, even though we now refer to it as `ChainableResult::WITH_RESOLVED`.
6
+
7
+ ## [0.4.0] - 2023-02-27
8
+
9
+ - **FEATURE:** Support for top-level .group(:belongs_to_association), returning hash with models as keys.
10
+
11
+ I didn't realize this until a few months ago, but in ActiveRecord, if `Foo belongs_to :bar`, you can do `Foo.group(:bar).count` and get back a hash with `Bar` records as keys and counts as values. (ActiveRecord executes two total queries to implement this, one for the counts grouped by the `bar_id` foreign key, then another to retrieve the `Bar` models.)
12
+
13
+ Now the same behavior works with `summarize`: you can still retrieve any number of counts and/or sums about `Foo`—including some with additional filters and even sub-grouping—in a single query, and then we'll execute one additional query to retrieve the records for the `Bar` model keys.
14
+
15
+ - **IMPROVEMENT:** `bin/console` is now much more useful for developing `activerecord-summarize`
16
+
17
+ - **IMPROVEMENT:** Added some tests for queries joining HABTM associations and (of course, supporting the new feature) `belongs_to` associations. `summarize` preceded by joins is already stable and documented, but it didn't have tests before.
18
+
1
19
  ## [0.3.1] - 2022-06-23
2
20
 
3
21
  - **BUGFIX:** `with` didn't work correctly with a single argument. Embarrassingly, both the time-traveling version of `with` and the trivial/fake one provided when `noop: true` is set had single-argument bugs, and they were different bugs.
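The 0.5.0 changelog entry says that `ChainableResult` methods now resolve their arguments. As a rough illustration only, here is a plain-Ruby sketch of that idea. `ToyFuture` and everything inside it are hypothetical names invented for this example; this is not the gem's implementation, just a deferred-call object that resolves both its receiver and its arguments when asked for a value.

```ruby
# Hypothetical stand-in for a deferred result; NOT the gem's ChainableResult.
class ToyFuture
  def initialize(source, method_name = nil, args = [])
    @source = source           # a lambda, a plain value, or another ToyFuture
    @method_name = method_name # operation to apply later (nil for a leaf)
    @args = args               # arguments, which may themselves be futures
  end

  # Arithmetic operators record the call instead of running it.
  %i[+ - * /].each do |op|
    define_method(op) { |other| ToyFuture.new(self, op, [other]) }
  end

  # Resolve the receiver, then every argument, then perform the call.
  def value
    receiver = ToyFuture.resolve_item(@source)
    return receiver if @method_name.nil?

    resolved_args = @args.map { |arg| ToyFuture.resolve_item(arg) }
    receiver.public_send(@method_name, *resolved_args)
  end

  def self.resolve_item(item)
    case item
    when ToyFuture then item.value
    when Proc then item.call
    else item
    end
  end
end

sum_a = ToyFuture.new(-> { 6 }) # pretend this is q.sum(:a)
sum_b = ToyFuture.new(-> { 7 }) # pretend this is q.sum(:b)
product = sum_a * sum_b         # the argument is itself a future
puts product.value
```

Because the argument is resolved at the same moment as the receiver, a future can be combined with another future without a separate combining proc, which is the convenience the changelog entry describes.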
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- activerecord-summarize (0.3.1)
4
+ activerecord-summarize (0.5.0)
5
5
  activerecord (>= 5.0)
6
6
 
7
7
  GEM
@@ -65,4 +65,4 @@ DEPENDENCIES
65
65
  standard (~> 1.3)
66
66
 
67
67
  BUNDLED WITH
68
- 2.3.3
68
+ 2.4.13
data/README.md CHANGED
@@ -59,7 +59,7 @@ Purchase.complete.left_joins(:region).summarize do |purchases|
59
59
  end
60
60
  ```
61
61
 
62
- Until the `summarize` block ends, the return values of your calculations are `ChainableResult::Future` instances, a bit like a Promise with a more convenient API. You can call any method you like on a `ChainableResult`, and you'll get back another `ChainableResult`, and they'll all turn out alright in the end—provided you called methods that would have worked if you had run that calculation without `summarize`. OTOH, using a `ChainableResult` as an argument to another method generally will not work.
62
+ Until the `summarize` block ends, the return values of your calculations are `ChainableResult::Future` instances, a bit like a Promise with a more convenient API. You can call any method you like on a `ChainableResult`, and you'll get back another `ChainableResult`, and they'll all turn out alright in the end—provided you called methods that would have worked if you had run that calculation without `summarize`. OTOH, using a `ChainableResult` as an argument to a method of a non-`ChainableResult` generally will not work.
63
63
 
64
64
  ```ruby
65
65
  Purchase.last_quarter.complete.summarize do |purchases|
@@ -68,24 +68,28 @@ Purchase.last_quarter.complete.summarize do |purchases|
68
68
  @vc_projection = @sales * 3
69
69
  # And this won't:
70
70
  @vc_projection = 3 * @sales
71
+ # But this will work since v0.5.0...
72
+ @units_sold = purchases.sum(:units)
73
+ # ...because methods of `ChainableResult` now resolve their argument(s)
74
+ @avg_unit_price = @sales / @units_sold
71
75
  end
72
76
  ```
73
77
 
74
- If, within a `summarize` block, you want to combine data from more than one `ChainableResult`, you must use the otherwise-optional second argument yielded to the block, a `proc` I like to name `with`. Pass it all the results you want to combine and a block that combines them and returns the new result:
78
+ If, within a `summarize` block, you want to combine data from more than one `ChainableResult`, you may need to use the otherwise-optional second argument yielded to the block, a `proc` I like to name `with_resolved`. Pass it all the results you want to combine and a block that combines them and returns the new result:
75
79
 
76
80
  ```ruby
77
- Purchase.complete.left_joins(:promotion).summarize do |purchases, with|
81
+ Purchase.complete.left_joins(:promotion).summarize do |purchases, with_resolved|
78
82
  @all_revenue = purchases.sum(:amount)
79
83
  promotions = purchases.where.not(promotions: {id: nil})
80
84
  @promotion_sales = promotions.count
81
85
  @promotion_discounts = promotions.sum("promotions.discount_amount")
82
- @avg_discount = with[@promotion_sales, @promotion_discounts] do |sales, discounts|
86
+ @avg_discount = with_resolved[@promotion_sales, @promotion_discounts] do |sales, discounts|
83
87
  sales.zero? ? 0 : discounts / sales
84
88
  end
85
89
  end
86
90
  ```
87
91
 
88
- Treat a `with` block as a pure function: i.e., return the value you care about, and don't set or change any other state within the block. Behavior in any other case is undefined.
92
+ Treat a `with_resolved` block as a pure function: i.e., return the value you care about, and don't set or change any other state within the block. Behavior in any other case is undefined.
89
93
 
90
94
  ## Escape hatch
91
95
 
@@ -93,7 +97,7 @@ The query generated by `summarize` is often much faster than equivalent queries
93
97
 
94
98
  By design, every operation performed with `summarize` is correct and corresponds to normal `ActiveRecord` behavior, and any operations that can't be done correctly this way or aren't yet will raise exceptions. But only imperfect humans have worked on this gem, so you might also wonder if `summarize` is producing correct results.
95
99
 
96
- Fortunately, you can easily check both with `summarize(noop: true)`, which causes `summarize` to yield the original relation it was called on and a trivial `with` proc. The block will be executed as though `summarize` were not involved, with each calculation executing separately and immediately returning numbers or hashes.
100
+ Fortunately, you can easily check both with `summarize(noop: true)`, which causes `summarize` to yield the original relation it was called on and a trivial `with_resolved` proc. The block will be executed as though `summarize` were not involved, with each calculation executing separately and immediately returning numbers or hashes.
97
101
 
98
102
  If you do find any case where you get different results with `summarize(noop: true)`, I'd be grateful if you filed an issue.
99
103
 
@@ -117,10 +121,10 @@ When the parent relation already has `.group` applied, `pure: true` is implied a
117
121
  Build even more complex queries by using `summarize` on a relation that already has `.group` applied. Results are grouped just like a standard `.group(*expressions).count`, but instead of single numbers, the values are whatever set of calculations you return from the block, including further `.group(*more).calculate(:sum|:count,*args)` calculations, in whatever `Array` or `Hash` shape you arrange them. For example:
118
122
 
119
123
  ```ruby
120
- puts Purchase.last_year.complete.group(:region_id).summarize do |purchases,with|
124
+ puts Purchase.last_year.complete.group(:region_id).summarize do |purchases,with_resolved|
121
125
  total = purchases.count
122
126
  by_quarter = purchases.group(CREATED_TO_YEAR_SQL, CREATED_TO_QUARTER_SQL).count.sort.to_h
123
- target = with[total / 4, by_quarter.values.max] {|avg_q, best_q| [avg_q * 1.25, best_q].max.round }
127
+ target = with_resolved[total / 4, by_quarter.values.max] {|avg_q, best_q| [avg_q * 1.25, best_q].max.round }
124
128
  {last_year: total, quarters: by_quarter, unit_target: target}
125
129
  end
126
130
  # Output:
@@ -150,13 +154,13 @@ When the relation already has `group` applied, for correct results, `summarize`
150
154
 
151
155
  ```ruby
152
156
  # A trivial example:
153
- Purchase.complete.group(:region_id).summarize {|purchases| purchases.sum(:amount) }
157
+ Purchase.complete.group(:region).summarize {|purchases| purchases.sum(:amount) }
154
158
 
155
159
  # ...is exactly equivalent to:
156
- Purchase.complete.group(:region_id).sum(:amount)
160
+ Purchase.complete.group(:region).sum(:amount)
157
161
 
158
162
  # But if there were three regions, what should the value of @target be in this case?
159
- region_targets = Purchase.last_quarter.complete.group(:region_id).summarize do |purchases|
163
+ region_targets = Purchase.last_quarter.complete.group(:region).summarize do |purchases|
160
164
  @target = purchases.sum(:amount) * 1.25
161
165
  end
162
166
  ```
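The README hunks above mention a trivial `with_resolved` proc that is yielded when `summarize(noop: true)` is used. A minimal sketch of what such a synchronous proc could look like follows, modeled on the one-line body visible later in this diff, `(results.size == 1 ? results.first : results).then(&block)`. The lambda name `sync_with_resolved` is an assumption for this example, and the diff does not show which method that line belongs to.

```ruby
# Hypothetical synchronous combiner: the values are already plain numbers,
# so it only has to hand them to the block.
sync_with_resolved = lambda do |*results, &block|
  # A single value is passed as-is; several values are passed as an array,
  # which a block with several parameters destructures automatically.
  (results.size == 1 ? results.first : results).then(&block)
end

promotion_sales = 4
promotion_discounts = 10.0

avg_discount = sync_with_resolved.call(promotion_sales, promotion_discounts) do |sales, discounts|
  sales.zero? ? 0 : discounts / sales
end

doubled = sync_with_resolved.call(21) { |n| n * 2 }

puts avg_discount
puts doubled
```

Keeping the block a pure function, as the README asks, means the same block gives the same answer whether it runs immediately like this or later against deferred results.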
data/bin/console CHANGED
@@ -2,10 +2,11 @@
2
2
  # frozen_string_literal: true
3
3
 
4
4
  require "bundler/setup"
5
+ require "active_record"
5
6
  require "activerecord/summarize"
7
+ require_relative "../test/test_data" # Test fixtures so there's something to play with
6
8
 
7
- # You can add fixtures and/or initialization code here to make experimenting
8
- # with your gem easier. You can also use a different console, if you like.
9
+ # You can use a different console, if you like.
9
10
 
10
11
  # (If you use this, don't forget to add pry to your Gemfile!)
11
12
  # require "pry"
@@ -69,19 +69,19 @@ def dashboard
69
69
  # If you forget, `daily_posts.popular.count` will raise `Unsummarizable` with a helpful message.
70
70
  all_posts = Post.where(subreddit: @subreddits.select(:id)).where(created_at: 30.days.ago..)
71
71
  .left_joins(:popularity_threshold_setting).order(:created_at)
72
- @subreddit_stats = all_posts.group(:subreddit_id).summarize do |posts, with|
72
+ @subreddit_stats = all_posts.group(:subreddit_id).summarize do |posts, with_resolved|
73
73
  daily_posts = posts.group("posts.created_at::date")
74
74
  dow_not_buried = posts.where(karma: 0..).group("EXTRACT(DOW FROM posts.created_at)")
75
75
  {
76
76
  posts_created: posts.count,
77
77
  buried_posts: posts.where(karma: ...0).count,
78
- daily_popular_rate: with[
78
+ daily_popular_rate: with_resolved[
79
79
  daily_posts.popular.count,
80
80
  daily_posts.count
81
81
  ] do |popular, total|
82
82
  total.map { |date, count| [date, (popular[date]||0).to_f / count] }.to_h
83
83
  end,
84
- dow_avg_comments: with[
84
+ dow_avg_comments: with_resolved[
85
85
  dow_not_buried.sum(:comments_count),
86
86
  dow_not_buried.count
87
87
  ] do |comments, posts|
@@ -94,4 +94,4 @@ end
94
94
 
95
95
  Since `summarize` runs a single query that visits each relevant `posts` row just once, adding additional calculations is pretty close to free.
96
96
 
97
- Even with the mental overhead of needing to join outside the block and use `with` to combine calculations (see [README](../README.md) for details), I think this is still easy to read, write, and reason about, and it beats the heck out of walls of SQL. What do you think?
97
+ Even with the mental overhead of needing to join outside the block and use `with_resolved` to combine calculations (see [README](../README.md) for details), I think this is still easy to read, write, and reason about, and it beats the heck out of walls of SQL. What do you think?
@@ -2,6 +2,6 @@
2
2
 
3
3
  module ActiveRecord
4
4
  module Summarize
5
- VERSION = "0.3.1"
5
+ VERSION = "0.5.0"
6
6
  end
7
7
  end
@@ -7,7 +7,7 @@ module ActiveRecord::Summarize
7
7
  class Unsummarizable < StandardError; end
8
8
 
9
9
  class Summarize
10
- attr_reader :current_result_row, :pure, :noop, :from_where
10
+ attr_reader :current_result_row, :base_groups, :base_association, :pure, :noop, :from_where
11
11
  alias_method :pure?, :pure
12
12
  alias_method :noop?, :noop
13
13
 
@@ -29,7 +29,18 @@ module ActiveRecord::Summarize
29
29
  def initialize(relation, pure: nil, noop: false)
30
30
  @relation = relation
31
31
  @noop = noop
32
- has_base_groups = relation.group_values.any?
32
+ @base_groups, @base_association = relation.group_values.dup.then do |group_fields|
33
+ # Based upon a bit from ActiveRecord::Calculations.execute_grouped_calculation,
34
+ # if the base relation is grouped only by a belongs_to association, group by
35
+ # the association's foreign key.
36
+ if group_fields.size == 1 && group_fields.first.respond_to?(:to_sym)
37
+ association = relation.klass._reflect_on_association(group_fields.first)
38
+ # Like ActiveRecord's group(:association).count behavior, this only works with belongs_to associations
39
+ next [Array(association.foreign_key), association] if association&.belongs_to?
40
+ end
41
+ [group_fields, nil]
42
+ end
43
+ has_base_groups = base_groups.any?
33
44
  raise Unsummarizable, "`summarize` must be pure when called on a grouped relation" if pure == false && has_base_groups
34
45
  raise ArgumentError, "`summarize(noop: true)` is impossible on a grouped relation" if noop && has_base_groups
35
46
  @pure = has_base_groups || !!pure
@@ -37,8 +48,8 @@ module ActiveRecord::Summarize
37
48
  end
38
49
 
39
50
  def process(&block)
40
- # For noop, just yield the original relation and a transparent `with` proc.
41
- return yield(@relation, ChainableResult::SYNC_WITH) if noop?
51
+ # For noop, just yield the original relation and a transparent `with_resolved` proc.
52
+ return yield(@relation, ChainableResult::SYNC_WITH_RESOLVED) if noop?
42
53
  # Within the block, the relation and its future clones intercept calls to
43
54
  # `count` and `sum`, registering them and returning a ChainableResult via
44
55
  # summarize.add_calculation.
@@ -49,28 +60,42 @@ module ActiveRecord::Summarize
49
60
  include InstanceMethods
50
61
  end
51
62
  end,
52
- ChainableResult::WITH
63
+ ChainableResult::WITH_RESOLVED
53
64
  ))
54
65
  ChainableResult.with_cache(!pure?) do
55
66
  # `resolve` builds the single query that answers all collected calculations,
56
- # executes it, and aggregates the results by the values of
57
- # `@relation.group_values``. In the common case of no `@relation.group_values`,
58
- # the result is just `{[]=>[*final_value_for_each_calculation]}`
67
+ # executes it, and aggregates the results by the values of `base_groups`.
68
+ # In the common case of no `base_groups`, the resolve returns:
69
+ # `{[]=>[*final_value_for_each_calculation]}`
59
70
  result = resolve.transform_values! do |row|
60
71
  # Each row (in the common case, only one) is used to resolve any
61
72
  # ChainableResults returned by the block. These may be a one-to-one mapping,
62
- # or the block return may have combined some results via `with` or chained
73
+ # or the block return may have combined some results via `with`, chained
63
74
  # additional methods on results, etc.
64
75
  @current_result_row = row
65
76
  future_block_result.value
66
77
  end.then do |result|
67
- # Change ungrouped result from `{[]=>v}` to `v` and grouped-by-one-column
68
- # result from `{[k1]=>v1,[k2]=>v2,...}` to `{k1=>v1,k2=>v2,...}`.
69
- # (Those are both probably more common than multiple-column base grouping.)
70
- case @relation.group_values.size
71
- when 0 then result.values.first
72
- when 1 then result.transform_keys! { |k| k.first }
73
- else result
78
+ # Now unpack/fix-up the result keys to match shape of Relation.count or Relation.group(*cols).count return values
79
+ if base_groups.empty?
80
+ # Change ungrouped result from `{[]=>v}` to `v`, like Relation.count
81
+ result.values.first
82
+ elsif base_association
83
+ # Change grouped-by-one-belongs_to-association result from `{[id1]=>v1,[id2]=>v2,...}` to
84
+ # `{<AssociatedModel id:id1>=>v1,<AssociatedModel id:id2>=>v2,...}` like Relation.group(:association).count
85
+
86
+ # Loosely based on a bit from ActiveRecord::Calculations.execute_grouped_calculation,
87
+ # retrieve the records for the group association and replace the keys of our final result.
88
+ key_class = base_association.klass.base_class
89
+ key_records = key_class
90
+ .where(key_class.primary_key => result.keys.flatten)
91
+ .index_by(&:id)
92
+ result.transform_keys! { |k| key_records[k[0]] }
93
+ elsif base_groups.size == 1
94
+ # Change grouped-by-one-column result from `{[k1]=>v1,[k2]=>v2,...}` to `{k1=>v1,k2=>v2,...}`, like Relation.group(:column).count
95
+ result.transform_keys! { |k| k[0] }
96
+ else
97
+ # Multiple-column base grouping (though perhaps relatively rare) requires no change.
98
+ result
74
99
  end
75
100
  end
76
101
  if !pure?
@@ -166,7 +191,7 @@ module ActiveRecord::Summarize
166
191
  base_group_columns = (0...base_groups.size)
167
192
  data
168
193
  .group_by { |row| row[base_group_columns] }
169
- .tap { |h| h[[]] = [] if h.empty? && base_groups.size.zero? }
194
+ .tap { |h| h[[]] = [] if h.empty? && base_groups.empty? }
170
195
  .transform_values! do |rows|
171
196
  values = starting_values.map(&:dup) # map(&:dup) since some are hashes and we don't want to mutate starting_values
172
197
  rows.each do |row|
@@ -201,14 +226,10 @@ module ActiveRecord::Summarize
201
226
  end
202
227
  end
203
228
 
204
- def base_groups
205
- @relation.group_values.dup
206
- end
207
-
208
229
  def all_groups
209
230
  # keep all base groups, even if they did something silly like group by
210
231
  # the same key twice, but otherwise don't repeat any groups
211
- groups = base_groups
232
+ groups = base_groups.dup
212
233
  groups_set = Set.new(groups)
213
234
  @calculations.map { |f| f.relation.group_values }.flatten.each do |k|
214
235
  next if groups_set.include? k
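The key fix-up described in the comments of the hunk above (ungrouped, grouped by one `belongs_to` association, grouped by one column) can be tried out in plain Ruby. This is a sketch under stated assumptions: `Bar` is a hypothetical `Struct` standing in for an ActiveRecord model, the `bars` array stands in for the one extra query, and `to_h` replaces ActiveSupport's `index_by`.

```ruby
# Hypothetical stand-in for an associated model; only `id` matters here.
Bar = Struct.new(:id, :name)

# Shape assumed from the diff comments: keys are arrays of base group values.
grouped = {[1] => [5, 100], [2] => [3, 40]}

# Stand-in for the single extra query that loads the associated records.
bars = [Bar.new(1, "north"), Bar.new(2, "south"), Bar.new(3, "unused")]
wanted_ids = grouped.keys.flatten
key_records = bars
  .select { |bar| wanted_ids.include?(bar.id) }
  .to_h { |bar| [bar.id, bar] }

# Grouped by one belongs_to association: replace each [id] key with its record.
by_record = grouped.transform_keys { |k| key_records[k[0]] }

# Grouped by one plain column: unwrap each single-element key.
by_column = grouped.transform_keys { |k| k[0] }

# Ungrouped: the only key is [], so return its value directly.
ungrouped = {[] => [8, 140]}.values.first

p by_record.keys.map(&:name)
p by_column
p ungrouped
```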
@@ -12,9 +12,19 @@ class ChainableResult
12
12
  if use_cache?
13
13
  return @value if @cached
14
14
  @cached = true
15
- @value = resolve_source.send(@method, *@args, **@opts, &@block)
15
+ @value = resolve_source.send(
16
+ @method,
17
+ *@args.map(&RESOLVE_ITEM),
18
+ **@opts.transform_values(&RESOLVE_ITEM),
19
+ &@block
20
+ )
16
21
  else
17
- resolve_source.send(@method, *@args, **@opts, &@block)
22
+ resolve_source.send(
23
+ @method,
24
+ *@args.map(&RESOLVE_ITEM),
25
+ **@opts.transform_values(&RESOLVE_ITEM),
26
+ &@block
27
+ )
18
28
  end
19
29
  end
20
30
 
@@ -86,8 +96,9 @@ class ChainableResult
86
96
  (results.size == 1 ? results.first : results).then(&block)
87
97
  end
88
98
 
89
- WITH = method(:with)
90
- SYNC_WITH = method(:sync_with)
99
+ # Shorter names are deprecated
100
+ WITH_RESOLVED = WITH = method(:with)
101
+ SYNC_WITH_RESOLVED = SYNC_WITH = method(:sync_with)
91
102
 
92
103
  def self.resolve_item(item)
93
104
  case item
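The renamed constants in the hunk above rely on Ruby's chained assignment, so the old and new names point at the same `Method` object. A small sketch, assuming a hypothetical `ToyResult` module with a deliberately trivial `with` (not the gem's method), shows why code using the old name keeps working:

```ruby
# Hypothetical module for illustration; the real `with` does more than this.
module ToyResult
  def self.with(*results)
    yield(*results)
  end

  # Chained constant assignment: both constants reference one Method object.
  WITH_RESOLVED = WITH = method(:with)
end

old_name = ToyResult::WITH.call(2, 3) { |a, b| a + b }
new_name = ToyResult::WITH_RESOLVED.call(2, 3) { |a, b| a + b }

puts ToyResult::WITH.equal?(ToyResult::WITH_RESOLVED)
puts old_name == new_name
```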
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: activerecord-summarize
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.3.1
4
+ version: 0.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Joshua Paine
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2022-06-23 00:00:00.000000000 Z
11
+ date: 2023-05-20 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: activerecord
@@ -84,7 +84,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
84
84
  - !ruby/object:Gem::Version
85
85
  version: '0'
86
86
  requirements: []
87
- rubygems_version: 3.3.3
87
+ rubygems_version: 3.3.7
88
88
  signing_key:
89
89
  specification_version: 4
90
90
  summary: Run many .count and/or .sum queries in a single efficient query with minimal