created_id 1.0.0 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 6aea8a225fcdd235ca56719e20de06dbaa6d4c143dc6d2658e68d95fc59c6d5b
4
- data.tar.gz: 656d93bb31b73415cbeeba237e2c868bb936cb26942e4b03c1cc778e911cf825
3
+ metadata.gz: afdabbea09a01fe7c621831292070fdd0138f39ef7efce693745765c5f468fdd
4
+ data.tar.gz: 6f48cde24728a3f4b92b3c1a003e092f92d4f781e22a68350bb504a210e35f81
5
5
  SHA512:
6
- metadata.gz: 85cce04a40800a43b39754ad7686341ba940a1e6fae1e5de90a34dbc6f3c83f346208486d8fbd084dd00b4ff8541df54cd7ef073f894175522fb1bd56e7a0cf3
7
- data.tar.gz: 42c29b10692f1dbfbae7af08b075d1eca605c06ca450d78024794a4965c76cc20bf2d3c0117d7cee324f2f1eee2976fc27261d6dc89ac86ab20961d5ee0b6319
6
+ metadata.gz: 500a45fdf791907a43d856c4315988343f69ecd571b294cdab0b5f83c347949fafba13e706e0bcfb2951342a96529a2b2d2a62ffc02982269c25f4b88b9c7003
7
+ data.tar.gz: 5f6ab7875355ffb9f15495c0d0d7ecc83fd01f1a726c734274bc922511a9acdaadcf831b71cc3e0fa3612581a450e4f561d99c0e0b5a87b3b675b99f4087bc69
data/CHANGELOG.md CHANGED
@@ -4,6 +4,24 @@ All notable changes to this project will be documented in this file.
4
4
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
5
5
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
6
6
 
7
+ ## 1.1.0
8
+
9
+ ### Changed
10
+
11
+ - Omit id clause in queries if ids have not been indexed.
12
+ - Update queries to use ranges to better support prepared statements.
13
+
14
+ ### Removed
15
+
16
+ - Drop support for ActiveRecord 5.
17
+ - Drop support for Ruby 2.5 and 2.6.
18
+
19
+ ## 1.0.1
20
+
21
+ ### Changed
22
+
23
+ - Standardize lazy loading of models.
24
+
7
25
  ## 1.0.0
8
26
 
9
27
  ### Added
data/README.md CHANGED
@@ -1,9 +1,17 @@
1
1
  # Created ID
2
2
 
3
3
  [![Continuous Integration](https://github.com/bdurand/created_id/actions/workflows/continuous_integration.yml/badge.svg)](https://github.com/bdurand/created_id/actions/workflows/continuous_integration.yml)
4
+ [![Regression Test](https://github.com/bdurand/created_id/actions/workflows/regression_test.yml/badge.svg)](https://github.com/bdurand/created_id/actions/workflows/regression_test.yml)
4
5
  [![Ruby Style Guide](https://img.shields.io/badge/code_style-standard-brightgreen.svg)](https://github.com/testdouble/standard)
6
+ [![Gem Version](https://badge.fury.io/rb/created_id.svg)](https://badge.fury.io/rb/created_id)
5
7
 
6
- The gem is designed to optimize queries for ActiveRecord models that filter by the `created_at` timestamp. It can make queries more efficient by pre-calculating the ranges of id's for specific dates.
8
+ **CreatedId** optimizes queries on large ActiveRecord tables by precalculating ID ranges for specific time intervals. This lets you avoid full table scans and makes filtering by `created_at` more efficient, even in complex queries.
9
+
10
+ ### Key Benefits
11
+
12
+ - **Efficient Range Queries**: Filter by time-based ID ranges instead of relying on a less predictable `created_at` index.
13
+ - **Reduced Indexing Needs**: Avoid adding specific `created_at` indexes, letting primary key indexing handle range queries.
14
+ - **Simple Integration**: Just include the `CreatedId` module in your models and run a periodic task to index ID ranges.
7
15
 
8
16
  The use case this code is designed to solve is when you have a large table with an auto-populated `created_at` column where you want to run queries that filter on that column. In most cases, simply adding an index on the `created_at` column will work just fine. However, once you start constructing more complex queries or adding joins and your table grows very large, the index can become less effective and not even be used at all.
9
17
 
@@ -52,7 +60,7 @@ WHERE tasks.status = 'completed'
52
60
  AND tasks.created_at < ?
53
61
  ```
54
62
 
55
- The query optimizer will have it's choice of several indexes to use to figure out the best query plan. The most important choice will be the first step of the plan to reduce the number of rows that the query needs to look at. Depending on the shape of your data, the query optimizer may decide to simply filter by `status` or `user_id` and then perform a table scan on all the rows to filter by `created_at`, not using the index on that column at all.
63
+ The query optimizer will have its choice of several indexes to use to figure out the best query plan. The most important choice will be the first step of the plan to reduce the number of rows that the query needs to look at. Depending on the shape of your data, the query optimizer may decide to simply filter by `status` or `user_id` and then perform a table scan on all the rows to filter by `created_at`, not using the index on that column at all.
56
64
 
57
65
  This gem solves for this case by keeping track of the range ids created in each hour in a separate table. When you query on the `created_at` column, it will then look up the possible id range and add that to the query, so the SQL becomes:
58
66
 
@@ -68,16 +76,16 @@ WHERE tasks.status = 'completed'
68
76
  AND tasks.id < ?
69
77
  ```
70
78
 
71
- Because the `id` column is the primary key, it will always be indexed and the query optimizer will generally make better decisions about how to filter the query rows. You won't even need the index on `created_at` since the primay key would always be preferred.
79
+ Because the `id` column is the primary key, it will always be indexed and the query optimizer will generally make better decisions about how to filter the query rows. You won't even need the index on `created_at` since the primary key would always be preferred.
72
80
 
73
81
  Another good use case is if you have some periodic tasks to calculate daily stats for some large tables. You will be able to make these queries more efficient without having to add an index on the `created_at` column that's only used on one query per day.
74
82
 
75
83
  ## Usage
76
84
 
77
- Run the generator to create the database table
85
+ Run the generator to create the database migration. This will create a table to store time indexed id ranges for your models.
78
86
 
79
87
  ```
80
- rails created_id_engine:install:migrations
88
+ rails created_id_engine:install:migrations
81
89
  ```
82
90
 
83
91
  Next, include the `CreatedId` module into your models. Note that any model you wish to include this module in must have a numeric primary key. If the model is subclassed you will need to include the `CreatedId` module in the parent model.
@@ -93,11 +101,14 @@ end
93
101
  Now when you want to query by a range on the `created_at` column, you can use the `created_after`, `created_before`, or `created_between` scopes on the model.
94
102
 
95
103
  ```ruby
104
+ # Query for tasks completed after a specific time
96
105
  Task.where(status: "completed").created_after(24.hours.ago)
97
106
 
107
+ # Query for tasks by a specific user created before a specific time
98
108
  Task.where(user_id: 1000).created_before(7.days.ago)
99
109
 
100
- Task.created_between(25.hour.ago, 24.hours.ago)
110
+ # Query for tasks within a specific timeframe
111
+ Task.created_between(25.hours.ago, 24.hours.ago)
101
112
  ```
102
113
 
103
114
  You'll then need to set up a periodic task to store the id ranges for your models. For each model that includes `CreatedId`, you need to run the `index_ids_for` once per hour. This task should be run shortly after the top of the hour.
@@ -117,7 +128,7 @@ while time < Time.now
117
128
  end
118
129
  ```
119
130
 
120
- Don't worry if the id range for a specific hour does not get recorded, the queries will still work and they can be re-calcuated at any time. Queries will just be a bit less efficient if the ranges don't exist because queries will be given a large span of ids to filter on.
131
+ If an ID range is missing for a specific hour, your queries will still function, but with a broader range of IDs. You can recalculate missing ranges at any time to improve efficiency.
121
132
 
122
133
  There is an additional requirement for using this gem that you do not change the `created_at` value after a row is inserted since this can mess up the assumption about the correlation between ids and `created_at` timestamps. An error will be thrown if you try to change a record's timestamp after the id range has been created. The query logic can handle small variations between id order and timestamp order (i.e. if id 1000 has a timestamp a few seconds after id 1001).
123
134
 
data/VERSION CHANGED
@@ -1 +1 @@
1
- 1.0.0
1
+ 1.1.0
data/created_id.gemspec CHANGED
@@ -4,10 +4,16 @@ Gem::Specification.new do |spec|
4
4
  spec.authors = ["Brian Durand"]
5
5
  spec.email = ["bbdurand@gmail.com"]
6
6
 
7
- spec.summary = "Mechanism for optimizing ActiveRecord queries against the created_at column on tables."
7
+ spec.summary = "Optimize ActiveRecord queries for filtering large tables on the created_at column by pre-computing id ranges."
8
8
  spec.homepage = "https://github.com/bdurand/created_id"
9
9
  spec.license = "MIT"
10
10
 
11
+ spec.metadata = {
12
+ "homepage_uri" => spec.homepage,
13
+ "source_code_uri" => spec.homepage,
14
+ "changelog_uri" => "#{spec.homepage}/blob/main/CHANGELOG.md"
15
+ }
16
+
11
17
  # Specify which files should be added to the gem when it is released.
12
18
  # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
13
19
  ignore_files = %w[
@@ -26,9 +32,9 @@ Gem::Specification.new do |spec|
26
32
 
27
33
  spec.require_paths = ["lib"]
28
34
 
29
- spec.add_dependency "activerecord", ">= 5.0"
35
+ spec.add_dependency "activerecord", ">= 6.0"
30
36
 
31
37
  spec.add_development_dependency "bundler"
32
38
 
33
- spec.required_ruby_version = ">= 2.5"
39
+ spec.required_ruby_version = ">= 2.7"
34
40
  end
@@ -2,8 +2,5 @@
2
2
 
3
3
  module CreatedId
4
4
  class Engine < Rails::Engine
5
- config.before_eager_load do
6
- require_relative "id_range"
7
- end
8
5
  end
9
6
  end
@@ -7,8 +7,8 @@ module CreatedId
7
7
  self.table_name = "created_ids"
8
8
 
9
9
  scope :for_class, ->(klass) { where(class_name: klass.base_class.name) }
10
- scope :created_before, ->(time) { where(arel_table[:hour].lteq(time)) }
11
- scope :created_after, ->(time) { where(arel_table[:hour].gteq(time)) }
10
+ scope :created_before, ->(time) { where(hour: nil..time) }
11
+ scope :created_after, ->(time) { where(hour: time...nil) }
12
12
 
13
13
  before_validation :set_hour
14
14
 
@@ -23,20 +23,30 @@ module CreatedId
23
23
  #
24
24
  # @param klass [Class] The class to get the minimum id for.
25
25
  # @param time [Time] The hour to get the minimum id for.
26
- # @return [Integer] The minimum id for the class created in the given hour.
27
- def min_id(klass, time)
28
- for_class(klass).created_before(time).order(hour: :desc).first&.min_id || 0
26
+ # @param allow_nil [Boolean] Whether to allow a nil value to be returned. If this is false,
27
+ # then the method will return 0 if no value is found.
28
+ # @return [Integer, nil] The minimum id for the class created in the given hour.
29
+ def min_id(klass, time, allow_nil: false)
30
+ return nil if time.nil? && allow_nil
31
+
32
+ id = for_class(klass).created_before(time).order(hour: :desc).first&.min_id
33
+ id ||= 0 unless allow_nil
34
+ id
29
35
  end
30
36
 
31
37
  # Get the maximum id for a class created in a given hour.
32
38
  #
33
39
  # @param klass [Class] The class to get the maximum id for.
34
40
  # @param time [Time] The hour to get the maximum id for.
35
- # @return [Integer] The maximum id for the class created in the given hour.
36
- def max_id(klass, time)
41
+ # @param allow_nil [Boolean] Whether to allow a nil value to be returned. If this is false,
42
+ # then the method will return the maximum possible id for the id column.
43
+ # @return [Integer, nil] The maximum id for the class created in the given hour.
44
+ def max_id(klass, time, allow_nil: false)
45
+ return nil if time.nil? && allow_nil
46
+
37
47
  id = for_class(klass).created_after(CreatedId.coerce_hour(time)).order(hour: :asc).first&.max_id
38
48
 
39
- unless id
49
+ if id.nil? && !allow_nil
40
50
  col_limit = klass.columns.detect { |c| c.name == klass.primary_key }.limit
41
51
  id = if col_limit && col_limit > 0
42
52
  ((256**col_limit) / 2) - 1
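As a worked illustration of the fallback expression in the hunk above: `col_limit` is the column size in bytes, so `((256**col_limit) / 2) - 1` is the largest value a signed integer of that width can hold. The sketch below restates that arithmetic in plain Ruby. The `max_signed_id` helper is a hypothetical name used only for this example and is not part of the gem.

```ruby
# Hypothetical helper (not part of the gem) that mirrors the arithmetic
# used for the fallback maximum id: the largest signed integer that fits
# in a column of `col_limit` bytes.
def max_signed_id(col_limit)
  ((256**col_limit) / 2) - 1
end

# A 4-byte (32-bit) integer primary key.
puts max_signed_id(4) # 2147483647, i.e. 2**31 - 1

# An 8-byte (64-bit) bigint primary key.
puts max_signed_id(8) # 9223372036854775807, i.e. 2**63 - 1
```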
data/lib/created_id.rb CHANGED
@@ -1,10 +1,13 @@
1
1
  # frozen_string_literal: true
2
2
 
3
- require_relative "created_id/version"
4
3
  require_relative "created_id/engine" if defined?(Rails::Engine)
5
4
 
6
5
  module CreatedId
7
6
  extend ActiveSupport::Concern
7
+
8
+ autoload :IdRange, "created_id/id_range"
9
+ autoload :VERSION, "created_id/version"
10
+
8
11
  class CreatedAtChangedError < StandardError
9
12
  end
10
13
 
@@ -21,12 +24,8 @@ module CreatedId
21
24
  raise ArgumentError, "CreatedId can only be included in ActiveRecord models"
22
25
  end
23
26
 
24
- # Require here so we don't mess up loading the activerecord gem.
25
- require_relative "created_id/id_range"
26
-
27
- scope :created_after, ->(time) { where(arel_table[:created_at].gteq(time).and(arel_table[primary_key].gteq(CreatedId::IdRange.min_id(self, time)))) }
28
- scope :created_before, ->(time) { where(arel_table[:created_at].lt(time).and(arel_table[primary_key].lteq(CreatedId::IdRange.max_id(self, time)))) }
29
- scope :created_between, ->(time_1, time_2) { created_after(time_1).created_before(time_2) }
27
+ scope :created_after, ->(time) { created_between(time, nil) }
28
+ scope :created_before, ->(time) { created_between(nil, time) }
30
29
 
31
30
  before_save :verify_created_at_created_id!, if: :created_at_changed?
32
31
  end
@@ -42,6 +41,24 @@ module CreatedId
42
41
  CreatedId::IdRange.save_created_id(self, time, min_id, max_id)
43
42
  end
44
43
  end
44
+
45
+ # Get records created in the given time range. The time range is based on the
46
+ # created_at column and is inclusive of the start time and exclusive of the end time.
47
+ #
48
+ # @param start_time [Time, nil] The start of the time range. If nil, the range is open-ended.
49
+ # @param end_time [Time, nil] The end of the time range. If nil, the range is open-ended.
50
+ # @return [ActiveRecord::Relation] The records created in the given time range.
51
+ def created_between(start_time, end_time)
52
+ finder = where(created_at: start_time...end_time)
53
+
54
+ min_id = CreatedId::IdRange.min_id(self, start_time, allow_nil: true)
55
+ max_id = CreatedId::IdRange.max_id(self, end_time, allow_nil: true)
56
+ if min_id || max_id
57
+ finder = finder.where(primary_key => min_id..max_id)
58
+ end
59
+
60
+ finder
61
+ end
45
62
  end
46
63
 
47
64
  private
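To make the range-based conditions in this hunk concrete, here is a minimal sketch in plain Ruby of how a range with a `nil` bound can be read as a one-sided comparison, and how an exclusive range (`...`) differs from an inclusive one (`..`). The `range_to_sql` helper is hypothetical: it is not part of the gem and is not how ActiveRecord builds its SQL. It only illustrates the idea, and assumes Ruby 2.7 or later for beginless ranges.

```ruby
# Hypothetical helper (not part of the gem or ActiveRecord): turn a Ruby
# range into a readable SQL-like predicate. A nil bound means that side
# of the range is open, so no comparison is emitted for it.
def range_to_sql(column, range)
  clauses = []
  clauses << "#{column} >= #{range.begin}" unless range.begin.nil?
  unless range.end.nil?
    # An exclusive range (a...b) stops before the end value.
    operator = range.exclude_end? ? "<" : "<="
    clauses << "#{column} #{operator} #{range.end}"
  end
  clauses.join(" AND ")
end

puts range_to_sql("id", 10..20)   # both bounds known
puts range_to_sql("id", 10...nil) # only a lower bound (endless range)
puts range_to_sql("id", nil..20)  # only an upper bound (beginless range)
```

This is the shape that lets `created_after` and `created_before` be expressed through `created_between` with one `nil` argument: the missing bound simply drops out of the condition.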
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: created_id
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.0.0
4
+ version: 1.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Brian Durand
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2023-04-28 00:00:00.000000000 Z
11
+ date: 2024-11-13 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: activerecord
@@ -16,14 +16,14 @@ dependencies:
16
16
  requirements:
17
17
  - - ">="
18
18
  - !ruby/object:Gem::Version
19
- version: '5.0'
19
+ version: '6.0'
20
20
  type: :runtime
21
21
  prerelease: false
22
22
  version_requirements: !ruby/object:Gem::Requirement
23
23
  requirements:
24
24
  - - ">="
25
25
  - !ruby/object:Gem::Version
26
- version: '5.0'
26
+ version: '6.0'
27
27
  - !ruby/object:Gem::Dependency
28
28
  name: bundler
29
29
  requirement: !ruby/object:Gem::Requirement
@@ -58,7 +58,10 @@ files:
58
58
  homepage: https://github.com/bdurand/created_id
59
59
  licenses:
60
60
  - MIT
61
- metadata: {}
61
+ metadata:
62
+ homepage_uri: https://github.com/bdurand/created_id
63
+ source_code_uri: https://github.com/bdurand/created_id
64
+ changelog_uri: https://github.com/bdurand/created_id/blob/main/CHANGELOG.md
62
65
  post_install_message:
63
66
  rdoc_options: []
64
67
  require_paths:
@@ -67,16 +70,16 @@ required_ruby_version: !ruby/object:Gem::Requirement
67
70
  requirements:
68
71
  - - ">="
69
72
  - !ruby/object:Gem::Version
70
- version: '2.5'
73
+ version: '2.7'
71
74
  required_rubygems_version: !ruby/object:Gem::Requirement
72
75
  requirements:
73
76
  - - ">="
74
77
  - !ruby/object:Gem::Version
75
78
  version: '0'
76
79
  requirements: []
77
- rubygems_version: 3.4.12
80
+ rubygems_version: 3.4.10
78
81
  signing_key:
79
82
  specification_version: 4
80
- summary: Mechanism for optimizing ActiveRecord queries against the created_at column
81
- on tables.
83
+ summary: Optimize ActiveRecord queries for filtering large tables on the created_at
84
+ column by pre-computing id ranges.
82
85
  test_files: []