job-iteration 1.13.1 → 1.15.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +22 -0
- data/README.md +22 -6
- data/job-iteration.gemspec +1 -1
- data/lib/job-iteration/active_record_batch_enumerator.rb +0 -4
- data/lib/job-iteration/active_record_cursor.rb +13 -2
- data/lib/job-iteration/active_record_enumerator.rb +6 -3
- data/lib/job-iteration/enumerator_builder.rb +70 -1
- data/lib/job-iteration/iteration.rb +73 -8
- data/lib/job-iteration/log_subscriber.rb +6 -0
- data/lib/job-iteration/parallel_enumerator.rb +52 -0
- data/lib/job-iteration/railtie.rb +1 -2
- data/lib/job-iteration/version.rb +1 -1
- data/lib/tapioca/dsl/compilers/job_iteration.rb +5 -9
- metadata +5 -4
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: dbb861f47602c89b8163b0f36f662c8b1f10c34528c0d06b0ce74918973e89db
|
|
4
|
+
data.tar.gz: b00adea778b0a722dc727fdd4a9954c60442df995ee273ec82ff9d85b9fa222f
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 1d085184d274fcc7315277c0e11e45093674e96a152d6bf49eff6644be23974df17f6d778e0abf0d1289ac2203af8ea59a9c7a7d9aeedd46de26b6ed16649d86
|
|
7
|
+
data.tar.gz: 50fea852189c3f13f4b0c84f465f532e4b27804d5d71238f044fa0f6beec67c8372c83561b0690e1605eb09c47a3f07cfad24a417a3dd612a4b57bfda840696d
|
data/CHANGELOG.md
CHANGED
|
@@ -16,6 +16,28 @@ nil
|
|
|
16
16
|
|
|
17
17
|
nil
|
|
18
18
|
|
|
19
|
+
## v1.15.0 (Jun 4, 2026)
|
|
20
|
+
|
|
21
|
+
### Features
|
|
22
|
+
|
|
23
|
+
- [715](https://github.com/Shopify/job-iteration/pull/715) - Add support for keyword arguments in `build_enumerator` and `each_iteration` while preserving positional params Hash compatibility for existing jobs. This compatibility path is transitional; migrate jobs away from positional params Hash signatures paired with `perform_later(keyword: value)`.
|
|
24
|
+
|
|
25
|
+
## v1.14.0 (May 14, 2026)
|
|
26
|
+
|
|
27
|
+
### Breaking Changes
|
|
28
|
+
|
|
29
|
+
- [704](https://github.com/Shopify/job-iteration/pull/704) - Drop support for Rails 7.0. The minimum supported Rails version is now 7.1.
|
|
30
|
+
|
|
31
|
+
### Features
|
|
32
|
+
|
|
33
|
+
- [702](https://github.com/Shopify/job-iteration/pull/702) - Add support for parallel iteration with `enumerator_builder.parallel`. This enqueues multiple jobs, allowing you to split up the work across multiple instances.
|
|
34
|
+
- [705](https://github.com/Shopify/job-iteration/pull/705) - Add support for parallel array iteration with `enumerator_builder.parallel_array`.
|
|
35
|
+
- [706](https://github.com/Shopify/job-iteration/pull/706) - Add support for parallel Active Record relation iteration with `enumerator_builder.parallel_active_record_on_records` and `enumerator_builder.parallel_active_record_on_batches`.
|
|
36
|
+
|
|
37
|
+
### Bug fixes
|
|
38
|
+
|
|
39
|
+
- [709](https://github.com/Shopify/job-iteration/pull/709) - Fix an issue with parallel iteration when `instances` changed after child jobs were enqueued.
|
|
40
|
+
|
|
19
41
|
## v1.13.1 (Apr 28, 2026)
|
|
20
42
|
|
|
21
43
|
### Bug fixes
|
data/README.md
CHANGED
|
@@ -101,6 +101,21 @@ class BatchesAsRelationJob < ApplicationJob
|
|
|
101
101
|
end
|
|
102
102
|
```
|
|
103
103
|
|
|
104
|
+
```ruby
|
|
105
|
+
class ParallelIterationJob < ApplicationJob
|
|
106
|
+
include JobIteration::Iteration
|
|
107
|
+
|
|
108
|
+
def build_enumerator(cursor:)
|
|
109
|
+
enumerator_builder.parallel_active_record_on_records(User.all, instances: 5, cursor: cursor)
|
|
110
|
+
end
|
|
111
|
+
|
|
112
|
+
# Runs in 5 separate jobs, with each job processing a subset of the users
|
|
113
|
+
def each_iteration(user)
|
|
114
|
+
user.notify_about_something
|
|
115
|
+
end
|
|
116
|
+
end
|
|
117
|
+
```
|
|
118
|
+
|
|
104
119
|
```ruby
|
|
105
120
|
class ArrayJob < ApplicationJob
|
|
106
121
|
include JobIteration::Iteration
|
|
@@ -167,7 +182,7 @@ Job-iteration currently supports the following queue adapters (in order of imple
|
|
|
167
182
|
It supports the following platforms:
|
|
168
183
|
|
|
169
184
|
- Ruby 3.1 and later
|
|
170
|
-
- Rails 7.
|
|
185
|
+
- Rails 7.1 and later
|
|
171
186
|
|
|
172
187
|
Support for older platforms that have reached end of life may occasionally be dropped if maintaining backwards compatibility is cumbersome.
|
|
173
188
|
|
|
@@ -178,20 +193,21 @@ Support for older platforms that have reached end of life may occasionally be dr
|
|
|
178
193
|
* [Best practices](guides/best-practices.md)
|
|
179
194
|
* [Writing custom enumerator](guides/custom-enumerator.md)
|
|
180
195
|
* [Throttling](guides/throttling.md)
|
|
196
|
+
* [Parallel iteration](guides/parallel-iteration.md)
|
|
181
197
|
|
|
182
198
|
For more detailed documentation, see [rubydoc](https://www.rubydoc.info/gems/job-iteration).
|
|
183
199
|
|
|
184
200
|
## Requirements
|
|
185
201
|
|
|
186
|
-
|
|
202
|
+
Active Job is the primary requirement for Iteration. For iteration without Active Job in Sidekiq, see [Sidekiq Iteration](https://github.com/sidekiq/sidekiq/wiki/Iteration).
|
|
187
203
|
|
|
188
204
|
### API
|
|
189
205
|
|
|
190
|
-
Iteration job must respond to `build_enumerator` and `each_iteration` methods. `build_enumerator` must return [Enumerator](http://ruby-doc.org/core-2.5.1/Enumerator.html) object that respects the `cursor` value.
|
|
206
|
+
Iteration job must respond to `build_enumerator` and `each_iteration` methods. `build_enumerator` must return an [Enumerator](http://ruby-doc.org/core-2.5.1/Enumerator.html) object that respects the `cursor` value.
|
|
191
207
|
|
|
192
208
|
### Sidekiq adapter
|
|
193
209
|
|
|
194
|
-
|
|
210
|
+
Running iterating jobs on Sidekiq should work with the default configuration. The most important setting is Sidekiq's [timeout](https://github.com/mperham/sidekiq/wiki/Deployment#overview) option, which defaults to 25 seconds. That allows the last `each_iteration` to complete and gracefully shutdown.
|
|
195
211
|
|
|
196
212
|
### Resque adapter
|
|
197
213
|
|
|
@@ -203,11 +219,11 @@ There a few configuration assumptions that are required for Iteration to work wi
|
|
|
203
219
|
|
|
204
220
|
**What happens when my job is interrupted?** A checkpoint will be persisted to Redis after the current `each_iteration`, and the job will be re-enqueued. Once it's popped off the queue, the worker will work off from the next iteration.
|
|
205
221
|
|
|
206
|
-
**What happens with retries?** An interruption of a job does not count as a retry. If an exception occurs, the job will retry or be discarded as normal using Active Job configuration for the job. If the job retries, it processes the iteration that originally failed and progress will continue from there on if successful.
|
|
222
|
+
**What happens with retries?** An interruption of a job does not count as a retry. If an exception occurs, the job will retry or be discarded as normal using Active Job configuration for the job. If the job retries, it re-processes the iteration that originally failed and progress will continue from there on if successful.
|
|
207
223
|
|
|
208
224
|
**What happens if my iteration takes a long time?** We recommend that a single `each_iteration` should take no longer than 30 seconds. In the future, this may raise an exception.
|
|
209
225
|
|
|
210
|
-
**Why is it important that `each_iteration`
|
|
226
|
+
**Why is it important that `each_iteration` runs quickly?** When the job worker is scheduled for restart or shutdown, it gets a notice to finish remaining unit of work. To guarantee that no progress is lost we need to make sure that `each_iteration` completes within a reasonable amount of time. The exact timeout depends on your queue adapter configuration.
|
|
211
227
|
|
|
212
228
|
**Why do I have to use this ugly helper in `build_enumerator`? Why can't you automatically infer it?** This is how the first version of the API worked. We checked the type of object returned by `build_enumerable`, and whether it was ActiveRecord Relation or an Array, we used the matching adapter. This caused opaque type branching in Iteration internals and it didn’t allow developers to craft their own Enumerators and control the cursor value. We made a decision to _always_ return Enumerator instance from `build_enumerator`. Now we provide explicit helpers to convert ActiveRecord Relation or an Array to Enumerator, and for more complex iteration flows developers can build their own `Enumerator` objects.
|
|
213
229
|
|
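Note on the FAQ entries above: they describe checkpointing, where the cursor is persisted after the current `each_iteration` and the re-enqueued job continues from the next iteration. The following is a minimal plain-Ruby sketch of that idea, not the gem's implementation. `run_job`, `max_iterations`, and the in-memory `processed` list are hypothetical names standing in for the job, the shutdown notice, and the side effects of `each_iteration`.

```ruby
# Hypothetical stand-in for an iteration job over an array.
# `cursor` is the index of the last completed item, or nil on the first run.
def run_job(items, cursor:, max_iterations:, processed:)
  start = cursor.nil? ? 0 : cursor + 1
  items.each_with_index.drop(start).each_with_index do |(item, index), done|
    # Simulate a shutdown notice after `max_iterations` units of work.
    return cursor if done >= max_iterations

    processed << item # the body of each_iteration
    cursor = index    # checkpoint after each completed iteration
  end
  :completed
end

processed = []
cursor = nil
runs = 0
until cursor == :completed
  # Each loop pass plays the role of one (re-)enqueued execution of the job.
  cursor = run_job(%w[a b c d e], cursor: cursor, max_iterations: 2, processed: processed)
  runs += 1
end
```

Each item is processed exactly once across the three runs, because every run starts one position past the last checkpointed index.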
data/job-iteration.gemspec
CHANGED
|
@@ -26,5 +26,5 @@ Gem::Specification.new do |spec|
|
|
|
26
26
|
spec.metadata["changelog_uri"] = "https://github.com/Shopify/job-iteration/blob/main/CHANGELOG.md"
|
|
27
27
|
spec.metadata["allowed_push_host"] = "https://rubygems.org"
|
|
28
28
|
|
|
29
|
-
spec.add_dependency("activejob", ">= 7.
|
|
29
|
+
spec.add_dependency("activejob", ">= 7.1")
|
|
30
30
|
end
|
|
@@ -66,10 +66,6 @@ module JobIteration
|
|
|
66
66
|
pkey = @column_mgr.primary_key
|
|
67
67
|
pkey_values = primary_key_values
|
|
68
68
|
|
|
69
|
-
# If the primary key is only composed of a single column, simplify the
|
|
70
|
-
# query. This keeps us compatible with Rails prior to 7.1 where composite
|
|
71
|
-
# primary keys were introduced along with the syntax that allows you to
|
|
72
|
-
# query for multi-column values.
|
|
73
69
|
if pkey.size <= 1
|
|
74
70
|
pkey = pkey.first
|
|
75
71
|
pkey_values = pkey_values.map(&:first)
|
|
@@ -18,7 +18,7 @@ module JobIteration
|
|
|
18
18
|
end
|
|
19
19
|
end
|
|
20
20
|
|
|
21
|
-
def initialize(relation, columns
|
|
21
|
+
def initialize(relation, columns, position, instance, instances)
|
|
22
22
|
@columns = if columns
|
|
23
23
|
Array(columns)
|
|
24
24
|
else
|
|
@@ -35,6 +35,17 @@ module JobIteration
|
|
|
35
35
|
end
|
|
36
36
|
|
|
37
37
|
@base_relation = relation.reorder(@columns.join(","))
|
|
38
|
+
|
|
39
|
+
if instances.present?
|
|
40
|
+
pk = relation.primary_key
|
|
41
|
+
unless pk.is_a?(String) && relation.klass.column_for_attribute(pk).type == :integer
|
|
42
|
+
raise ArgumentError, "Parallel iteration requires a single integer primary key. " \
|
|
43
|
+
"For more complex cases, use the enumerator_builder.parallel primitive directly."
|
|
44
|
+
end
|
|
45
|
+
|
|
46
|
+
@base_relation = @base_relation.where("#{relation.table_name}.#{pk} % ? = ?", instances, instance)
|
|
47
|
+
end
|
|
48
|
+
|
|
38
49
|
@reached_end = false
|
|
39
50
|
end
|
|
40
51
|
|
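Note on the hunk above: when `instances` is present, the cursor narrows the relation to rows whose primary key satisfies `pk % instances = instance`, which is why a single integer primary key is required. The same predicate can be sketched in plain Ruby. The `ids` list is a hypothetical stand-in for a table, and this is an illustration rather than the gem's code.

```ruby
# Hypothetical in-memory stand-in for a table with integer primary keys.
ids = [1, 2, 3, 5, 8, 13, 21, 34]
instances = 3

# Same predicate as the SQL filter in the hunk: pk % instances = instance.
shards = (0...instances).map do |instance|
  ids.select { |id| id % instances == instance }
end

shards.each_with_index do |shard, instance|
  puts "instance #{instance}: #{shard.inspect}"
end
```

Every id lands in exactly one shard, but the shards are only as even as the distribution of the keys modulo `instances`.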
|
@@ -56,7 +67,7 @@ module JobIteration
|
|
|
56
67
|
self.position = @columns.map do |column|
|
|
57
68
|
method = column.to_s.split(".").last
|
|
58
69
|
|
|
59
|
-
if
|
|
70
|
+
if method == "id"
|
|
60
71
|
record.id_value
|
|
61
72
|
else
|
|
62
73
|
record.send(method.to_sym)
|
|
@@ -7,7 +7,7 @@ module JobIteration
|
|
|
7
7
|
class ActiveRecordEnumerator
|
|
8
8
|
SQL_DATETIME_WITH_NSEC = "%Y-%m-%d %H:%M:%S.%N"
|
|
9
9
|
|
|
10
|
-
def initialize(relation, columns: nil, batch_size: 100, timezone: nil, cursor: nil)
|
|
10
|
+
def initialize(relation, columns: nil, batch_size: 100, timezone: nil, cursor: nil, instance: nil, instances: nil)
|
|
11
11
|
@relation = relation
|
|
12
12
|
@batch_size = batch_size
|
|
13
13
|
@timezone = timezone
|
|
@@ -17,6 +17,8 @@ module JobIteration
|
|
|
17
17
|
Array(relation.primary_key).map { |pk| "#{relation.table_name}.#{pk}" }
|
|
18
18
|
end
|
|
19
19
|
@cursor = cursor
|
|
20
|
+
@instance = instance
|
|
21
|
+
@instances = instances
|
|
20
22
|
end
|
|
21
23
|
|
|
22
24
|
def records
|
|
@@ -39,7 +41,8 @@ module JobIteration
|
|
|
39
41
|
end
|
|
40
42
|
|
|
41
43
|
def size
|
|
42
|
-
@relation.count(:all)
|
|
44
|
+
full_size = @relation.count(:all)
|
|
45
|
+
@instances.present? ? full_size / @instances : full_size
|
|
43
46
|
end
|
|
44
47
|
|
|
45
48
|
private
|
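Note on the `size` change above: in parallel mode the full count is divided by `instances` using integer division, so the reported size reads as an estimate per child job rather than an exact count. A small plain-Ruby sketch with hypothetical ids, assuming the modulo filter used elsewhere in this diff:

```ruby
ids = (1..10).to_a # hypothetical integer primary keys
instances = 3

# Mirrors `full_size / @instances` from the hunk (Integer division).
estimated = ids.size / instances

# Actual number of rows each child job would see under `pk % instances = instance`.
actual = (0...instances).map do |instance|
  ids.count { |id| id % instances == instance }
end
```

Here `estimated` is 3 while the actual per-instance counts are 3, 4, and 3, so progress reporting based on `size` may be slightly off for some child jobs.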
|
@@ -61,7 +64,7 @@ module JobIteration
|
|
|
61
64
|
end
|
|
62
65
|
|
|
63
66
|
def finder_cursor
|
|
64
|
-
JobIteration::ActiveRecordCursor.new(@relation, @columns, @cursor)
|
|
67
|
+
JobIteration::ActiveRecordCursor.new(@relation, @columns, @cursor, @instance, @instances)
|
|
65
68
|
end
|
|
66
69
|
|
|
67
70
|
def column_value(record, attribute)
|
|
@@ -5,6 +5,7 @@ require_relative "active_record_enumerator"
|
|
|
5
5
|
require_relative "csv_enumerator"
|
|
6
6
|
require_relative "throttle_enumerator"
|
|
7
7
|
require_relative "nested_enumerator"
|
|
8
|
+
require_relative "parallel_enumerator"
|
|
8
9
|
require "forwardable"
|
|
9
10
|
|
|
10
11
|
module JobIteration
|
|
@@ -65,6 +66,23 @@ module JobIteration
|
|
|
65
66
|
wrap(self, enumerable.each_with_index.drop(drop).to_enum { enumerable.size - drop })
|
|
66
67
|
end
|
|
67
68
|
|
|
69
|
+
# Builds an Enumerator that iterates over a given array, across +instances+ parallel jobs.
|
|
70
|
+
#
|
|
71
|
+
# Child job i iterates over the slice of the array starting at
|
|
72
|
+
# index (array.size / instances * i).floor and ending at index (array.size / instances * (i + 1)).floor - 1.
|
|
73
|
+
def build_parallel_array_enumerator(array, instances:, cursor:)
|
|
74
|
+
unless array.is_a?(Array)
|
|
75
|
+
raise ArgumentError, "array must be an Array"
|
|
76
|
+
end
|
|
77
|
+
|
|
78
|
+
build_parallel_enumerator(instances: instances, cursor: cursor) do |instance, instances, inner_cursor|
|
|
79
|
+
slice_start = (array.size.to_f / instances * instance).floor
|
|
80
|
+
next_slice_start = (array.size.to_f / instances * (instance + 1)).floor
|
|
81
|
+
slice = array[slice_start...next_slice_start]
|
|
82
|
+
build_array_enumerator(slice, cursor: inner_cursor)
|
|
83
|
+
end
|
|
84
|
+
end
|
|
85
|
+
|
|
68
86
|
# Builds Enumerator from Active Record Relation. Each Enumerator tick moves the cursor one row forward.
|
|
69
87
|
#
|
|
70
88
|
# +columns:+ argument is used to build the actual query for iteration. +columns+: defaults to primary key:
|
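Note on `build_parallel_array_enumerator` in the hunk above: each child job receives a contiguous slice whose bounds come from float division followed by `floor`. The arithmetic can be checked in isolation with plain Ruby. The 10-element array and `instances = 3` are hypothetical values chosen for illustration.

```ruby
array = (0...10).to_a # hypothetical input
instances = 3

# Same bounds as in the hunk: floor of (size / instances * instance).
slices = (0...instances).map do |instance|
  slice_start = (array.size.to_f / instances * instance).floor
  next_slice_start = (array.size.to_f / instances * (instance + 1)).floor
  array[slice_start...next_slice_start]
end
```

The slices are adjacent and non-overlapping, and the remainder ends up in the later slices (sizes 3, 3, and 4 here).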
|
@@ -103,6 +121,21 @@ module JobIteration
|
|
|
103
121
|
wrap(self, enum)
|
|
104
122
|
end
|
|
105
123
|
|
|
124
|
+
# Builds an Enumerator that iterates over a given Active Record Relation, across +instances+ parallel jobs.
|
|
125
|
+
#
|
|
126
|
+
# Child job i iterates over the records where the id is equal to (instance % instances).
|
|
127
|
+
def build_parallel_active_record_enumerator_on_records(scope, instances:, cursor:, **args)
|
|
128
|
+
build_parallel_enumerator(instances: instances, cursor: cursor) do |instance, instances, inner_cursor|
|
|
129
|
+
build_active_record_enumerator(
|
|
130
|
+
scope,
|
|
131
|
+
cursor: inner_cursor,
|
|
132
|
+
instances: instances,
|
|
133
|
+
instance: instance,
|
|
134
|
+
**args,
|
|
135
|
+
).records
|
|
136
|
+
end
|
|
137
|
+
end
|
|
138
|
+
|
|
106
139
|
# Builds Enumerator from Active Record Relation and enumerates on batches of records.
|
|
107
140
|
# Each Enumerator tick moves the cursor +batch_size+ rows forward.
|
|
108
141
|
#
|
|
@@ -118,6 +151,21 @@ module JobIteration
|
|
|
118
151
|
wrap(self, enum)
|
|
119
152
|
end
|
|
120
153
|
|
|
154
|
+
# Builds an Enumerator that iterates over a given Active Record Relation, across +instances+ parallel jobs, and enumerates on batches.
|
|
155
|
+
#
|
|
156
|
+
# Child job i iterates over the batches of records where the id is equal to (instance % instances).
|
|
157
|
+
def build_parallel_active_record_enumerator_on_batches(scope, instances:, cursor:, **args)
|
|
158
|
+
build_parallel_enumerator(instances: instances, cursor: cursor) do |instance, instances, inner_cursor|
|
|
159
|
+
build_active_record_enumerator(
|
|
160
|
+
scope,
|
|
161
|
+
cursor: inner_cursor,
|
|
162
|
+
instances: instances,
|
|
163
|
+
instance: instance,
|
|
164
|
+
**args,
|
|
165
|
+
).batches
|
|
166
|
+
end
|
|
167
|
+
end
|
|
168
|
+
|
|
121
169
|
# Builds Enumerator from Active Record Relation and enumerates on batches, yielding Active Record Relations.
|
|
122
170
|
# See documentation for #build_active_record_enumerator_on_batches.
|
|
123
171
|
def build_active_record_enumerator_on_batch_relations(scope, wrap: true, cursor:, **args)
|
|
@@ -185,27 +233,48 @@ module JobIteration
|
|
|
185
233
|
wrap(self, enum)
|
|
186
234
|
end
|
|
187
235
|
|
|
236
|
+
def build_parallel_enumerator(instances:, cursor:, &block)
|
|
237
|
+
unless instances.is_a?(Integer) && instances.positive?
|
|
238
|
+
raise ArgumentError, "instances must be a positive Integer"
|
|
239
|
+
end
|
|
240
|
+
|
|
241
|
+
return ParallelEnumerator::EnqueueJobs.new(instances) if cursor.nil?
|
|
242
|
+
|
|
243
|
+
enum = ParallelEnumerator.new(block, cursor: cursor).to_enum
|
|
244
|
+
wrap(self, enum)
|
|
245
|
+
end
|
|
246
|
+
|
|
188
247
|
alias_method :once, :build_once_enumerator
|
|
189
248
|
alias_method :times, :build_times_enumerator
|
|
190
249
|
alias_method :array, :build_array_enumerator
|
|
250
|
+
alias_method :parallel_array, :build_parallel_array_enumerator
|
|
191
251
|
alias_method :active_record_on_records, :build_active_record_enumerator_on_records
|
|
252
|
+
alias_method :parallel_active_record_on_records, :build_parallel_active_record_enumerator_on_records
|
|
192
253
|
alias_method :active_record_on_batches, :build_active_record_enumerator_on_batches
|
|
254
|
+
alias_method :parallel_active_record_on_batches, :build_parallel_active_record_enumerator_on_batches
|
|
193
255
|
alias_method :active_record_on_batch_relations, :build_active_record_enumerator_on_batch_relations
|
|
194
256
|
alias_method :throttle, :build_throttle_enumerator
|
|
195
257
|
alias_method :csv, :build_csv_enumerator
|
|
196
258
|
alias_method :csv_on_batches, :build_csv_enumerator_on_batches
|
|
197
259
|
alias_method :nested, :build_nested_enumerator
|
|
260
|
+
alias_method :parallel, :build_parallel_enumerator
|
|
198
261
|
|
|
199
262
|
private
|
|
200
263
|
|
|
201
|
-
def build_active_record_enumerator(scope, cursor:, **args)
|
|
264
|
+
def build_active_record_enumerator(scope, cursor:, instance: nil, instances: nil, **args)
|
|
202
265
|
unless scope.is_a?(ActiveRecord::Relation)
|
|
203
266
|
raise ArgumentError, "scope must be an ActiveRecord::Relation"
|
|
204
267
|
end
|
|
205
268
|
|
|
269
|
+
if (instance.nil? && instances.present?) || (instance.present? && instances.nil?)
|
|
270
|
+
raise ArgumentError, "instance and instances must both be provided or both be nil"
|
|
271
|
+
end
|
|
272
|
+
|
|
206
273
|
JobIteration::ActiveRecordEnumerator.new(
|
|
207
274
|
scope,
|
|
208
275
|
cursor: cursor,
|
|
276
|
+
instance: instance,
|
|
277
|
+
instances: instances,
|
|
209
278
|
**args,
|
|
210
279
|
)
|
|
211
280
|
end
|
|
@@ -107,8 +107,8 @@ module JobIteration
|
|
|
107
107
|
self.total_time = Float(job_data["total_time"] || 0.0)
|
|
108
108
|
end
|
|
109
109
|
|
|
110
|
-
def perform(
|
|
111
|
-
interruptible_perform(
|
|
110
|
+
def perform(...) # @private
|
|
111
|
+
interruptible_perform(...)
|
|
112
112
|
|
|
113
113
|
nil
|
|
114
114
|
end
|
|
@@ -128,12 +128,12 @@ module JobIteration
|
|
|
128
128
|
JobIteration.enumerator_builder.new(self)
|
|
129
129
|
end
|
|
130
130
|
|
|
131
|
-
def interruptible_perform(*
|
|
131
|
+
def interruptible_perform(*args, **kwargs)
|
|
132
132
|
self.start_time = Time.now.utc
|
|
133
133
|
|
|
134
134
|
enumerator = nil
|
|
135
135
|
ActiveSupport::Notifications.instrument("build_enumerator.iteration", instrumentation_tags) do
|
|
136
|
-
enumerator =
|
|
136
|
+
enumerator = call_job_iteration_build_enumerator(args, kwargs)
|
|
137
137
|
end
|
|
138
138
|
|
|
139
139
|
unless enumerator
|
|
@@ -141,6 +141,14 @@ module JobIteration
|
|
|
141
141
|
return
|
|
142
142
|
end
|
|
143
143
|
|
|
144
|
+
if enumerator.is_a?(ParallelEnumerator::EnqueueJobs)
|
|
145
|
+
tags = instrumentation_tags.merge(instances: enumerator.instances)
|
|
146
|
+
ActiveSupport::Notifications.instrument("enqueue_parallel_jobs.iteration", tags) do
|
|
147
|
+
enumerator.enqueue_jobs(self)
|
|
148
|
+
end
|
|
149
|
+
return
|
|
150
|
+
end
|
|
151
|
+
|
|
144
152
|
assert_enumerator!(enumerator)
|
|
145
153
|
|
|
146
154
|
if executions == 1 && times_interrupted == 0
|
|
@@ -153,7 +161,7 @@ module JobIteration
|
|
|
153
161
|
end
|
|
154
162
|
|
|
155
163
|
completed = catch(:abort) do
|
|
156
|
-
iterate_with_enumerator(enumerator,
|
|
164
|
+
iterate_with_enumerator(enumerator, args, kwargs)
|
|
157
165
|
end
|
|
158
166
|
|
|
159
167
|
run_callbacks(:shutdown)
|
|
@@ -170,8 +178,7 @@ module JobIteration
|
|
|
170
178
|
end
|
|
171
179
|
end
|
|
172
180
|
|
|
173
|
-
def iterate_with_enumerator(enumerator,
|
|
174
|
-
arguments = arguments.dup.freeze
|
|
181
|
+
def iterate_with_enumerator(enumerator, args, kwargs)
|
|
175
182
|
found_record = false
|
|
176
183
|
@needs_reenqueue = false
|
|
177
184
|
|
|
@@ -183,7 +190,7 @@ module JobIteration
|
|
|
183
190
|
ActiveSupport::Notifications.instrument("each_iteration.iteration", tags) do
|
|
184
191
|
found_record = true
|
|
185
192
|
run_callbacks(:iterate) do
|
|
186
|
-
|
|
193
|
+
call_job_iteration_each_iteration(object_from_enumerator, args, kwargs)
|
|
187
194
|
end
|
|
188
195
|
self.cursor_position = cursor_from_enumerator
|
|
189
196
|
end
|
|
@@ -207,6 +214,64 @@ module JobIteration
|
|
|
207
214
|
adjust_total_time
|
|
208
215
|
end
|
|
209
216
|
|
|
217
|
+
def call_job_iteration_build_enumerator(args, kwargs)
|
|
218
|
+
positional_args, keyword_args = normalize_job_iteration_arguments(args, kwargs, :build_enumerator)
|
|
219
|
+
|
|
220
|
+
if keyword_args&.key?(:cursor)
|
|
221
|
+
raise ArgumentError, "The keyword argument `cursor` is reserved for the job iteration framework. " \
|
|
222
|
+
"Please remove `cursor` from the arguments passed to the job or rename it"
|
|
223
|
+
end
|
|
224
|
+
|
|
225
|
+
# `keyword_args || {}` because splatting `nil` raises on Ruby < 3.4; it only
|
|
226
|
+
# became a no-op in 3.4 (https://bugs.ruby-lang.org/issues/20064).
|
|
227
|
+
build_enumerator(*positional_args, **(keyword_args || {}), cursor: cursor_position)
|
|
228
|
+
end
|
|
229
|
+
|
|
230
|
+
def call_job_iteration_each_iteration(object_from_enumerator, args, kwargs)
|
|
231
|
+
positional_args, keyword_args = normalize_job_iteration_arguments(args, kwargs, :each_iteration)
|
|
232
|
+
# `keyword_args || {}` because splatting `nil` raises on Ruby < 3.4; it only
|
|
233
|
+
# became a no-op in 3.4 (https://bugs.ruby-lang.org/issues/20064).
|
|
234
|
+
each_iteration(object_from_enumerator, *positional_args, **(keyword_args || {}))
|
|
235
|
+
end
|
|
236
|
+
|
|
237
|
+
# Normalize Active Job kwargs for job-iteration dispatch. If the target method
|
|
238
|
+
# accepts job keyword parameters other than build_enumerator's reserved cursor:,
|
|
239
|
+
# keep kwargs separate so they can be splatted. Otherwise, append kwargs as a
|
|
240
|
+
# positional params Hash for transitional compatibility with existing jobs
|
|
241
|
+
# enqueued with keyword syntax.
|
|
242
|
+
#: (Array[top], Hash[Symbol, top], Symbol) -> [Array[top], Hash[Symbol, top]?]
|
|
243
|
+
def normalize_job_iteration_arguments(args, kwargs, method_name)
|
|
244
|
+
if kwargs.empty?
|
|
245
|
+
[args.dup.freeze, nil]
|
|
246
|
+
elsif job_has_keyword_parameters?(method_name)
|
|
247
|
+
[args.dup.freeze, kwargs.dup.freeze]
|
|
248
|
+
else
|
|
249
|
+
params_hash_args = args + [kwargs]
|
|
250
|
+
[params_hash_args.dup.freeze, nil]
|
|
251
|
+
end
|
|
252
|
+
end
|
|
253
|
+
|
|
254
|
+
#: (Symbol) -> bool
|
|
255
|
+
def job_has_keyword_parameters?(method_name)
|
|
256
|
+
@job_has_keyword_parameters ||= {}
|
|
257
|
+
return @job_has_keyword_parameters[method_name] if @job_has_keyword_parameters.key?(method_name)
|
|
258
|
+
|
|
259
|
+
@job_has_keyword_parameters[method_name] = method_parameters(method_name).any? do |type, name|
|
|
260
|
+
job_keyword_argument?(method_name, type, name)
|
|
261
|
+
end
|
|
262
|
+
end
|
|
263
|
+
|
|
264
|
+
def job_keyword_argument?(method_name, type, name)
|
|
265
|
+
# Match keyword parameters: double-splat (**kwargs), required (key:) or optional (key: default).
|
|
266
|
+
return false unless type == :keyrest || type == :keyreq || type == :key
|
|
267
|
+
|
|
268
|
+
# Ignore the `cursor:` argument, which is part of the `job-iteration`
|
|
269
|
+
# framework, and not a argument in the job's serialized argument list.
|
|
270
|
+
return false if method_name == :build_enumerator && name == :cursor
|
|
271
|
+
|
|
272
|
+
true
|
|
273
|
+
end
|
|
274
|
+
|
|
210
275
|
def reenqueue_iteration_job
|
|
211
276
|
ActiveSupport::Notifications.instrument(
|
|
212
277
|
"interrupted.iteration",
|
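Note on `normalize_job_iteration_arguments` in the hunk above: kwargs are splatted only when the target method declares keyword parameters other than the reserved `cursor:`. Otherwise they are appended as a trailing positional Hash. The sketch below reproduces that decision in plain Ruby using `Method#parameters`. `LegacyJob`, `KeywordJob`, `keyword_parameters?`, and `normalize` are hypothetical names, and unlike the diffed code it returns `{}` rather than `nil` for absent keyword arguments.

```ruby
# Hypothetical job classes; only the method signatures matter here.
class LegacyJob
  def build_enumerator(params, cursor:); end
end

class KeywordJob
  def build_enumerator(shop_id:, cursor:); end
end

# True when the method declares keyword parameters other than the reserved cursor:.
def keyword_parameters?(job, method_name)
  job.method(method_name).parameters.any? do |type, name|
    %i[key keyreq keyrest].include?(type) &&
      !(method_name == :build_enumerator && name == :cursor)
  end
end

# Returns [positional_args, keyword_args]; kwargs fall back to a trailing Hash.
def normalize(job, method_name, args, kwargs)
  return [args, {}] if kwargs.empty?
  return [args, kwargs] if keyword_parameters?(job, method_name)

  [args + [kwargs], {}]
end

legacy = normalize(LegacyJob.new, :build_enumerator, [], { shop_id: 1 })
keyword = normalize(KeywordJob.new, :build_enumerator, [], { shop_id: 1 })
```

For `LegacyJob` the kwargs become a single positional Hash, while for `KeywordJob` they stay separate so they can be splatted alongside `cursor:`.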
|
@@ -19,6 +19,12 @@ module JobIteration
|
|
|
19
19
|
end
|
|
20
20
|
end
|
|
21
21
|
|
|
22
|
+
def enqueue_parallel_jobs(event)
|
|
23
|
+
info do
|
|
24
|
+
"[JobIteration::Iteration] Enqueued #{event.payload[:instances]} parallel jobs."
|
|
25
|
+
end
|
|
26
|
+
end
|
|
27
|
+
|
|
22
28
|
def interrupted(event)
|
|
23
29
|
info do
|
|
24
30
|
"[JobIteration::Iteration] Interrupting and re-enqueueing the job " \
|
|
@@ -0,0 +1,52 @@
|
|
|
1
|
+
# typed: true
|
|
2
|
+
# frozen_string_literal: true
|
|
3
|
+
|
|
4
|
+
module JobIteration
|
|
5
|
+
# ParallelEnumerator allows you to parallelize iterations.
|
|
6
|
+
class ParallelEnumerator
|
|
7
|
+
class EnqueueError < StandardError; end
|
|
8
|
+
|
|
9
|
+
class EnqueueJobs
|
|
10
|
+
def initialize(instances)
|
|
11
|
+
@instances = instances
|
|
12
|
+
end
|
|
13
|
+
|
|
14
|
+
attr_reader :instances
|
|
15
|
+
|
|
16
|
+
def enqueue_jobs(job)
|
|
17
|
+
child_jobs = instances.times.map do |index|
|
|
18
|
+
job.class.new(*job.arguments).tap do |child_job|
|
|
19
|
+
child_job.cursor_position = { "instance" => index, "instances" => instances, "inner_cursor" => nil }
|
|
20
|
+
|
|
21
|
+
# Carry forward potential overrides from the parent job
|
|
22
|
+
child_job.queue_name = job.queue_name
|
|
23
|
+
child_job.priority = job.priority if job.priority
|
|
24
|
+
end
|
|
25
|
+
end
|
|
26
|
+
|
|
27
|
+
ActiveJob.perform_all_later(child_jobs)
|
|
28
|
+
|
|
29
|
+
unless child_jobs.all?(&:successfully_enqueued?)
|
|
30
|
+
failed_count = instances - child_jobs.count(&:successfully_enqueued?)
|
|
31
|
+
raise EnqueueError, "Failed to enqueue #{failed_count} out of #{instances} child jobs"
|
|
32
|
+
end
|
|
33
|
+
end
|
|
34
|
+
end
|
|
35
|
+
|
|
36
|
+
def initialize(block, cursor:)
|
|
37
|
+
@instance = cursor["instance"]
|
|
38
|
+
@instances = cursor["instances"]
|
|
39
|
+
inner_cursor = cursor["inner_cursor"]
|
|
40
|
+
@inner_enum = block.call(@instance, @instances, inner_cursor)
|
|
41
|
+
end
|
|
42
|
+
|
|
43
|
+
def to_enum
|
|
44
|
+
Enumerator.new(-> { @inner_enum.size }) do |yielder|
|
|
45
|
+
@inner_enum.each do |object_from_enumerator, cursor_from_enumerator|
|
|
46
|
+
parallel_cursor = { "instance" => @instance, "instances" => @instances, "inner_cursor" => cursor_from_enumerator }
|
|
47
|
+
yielder.yield(object_from_enumerator, parallel_cursor)
|
|
48
|
+
end
|
|
49
|
+
end
|
|
50
|
+
end
|
|
51
|
+
end
|
|
52
|
+
end
|
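Note on `ParallelEnumerator#to_enum` in the hunk above: each cursor produced by the inner enumerator is wrapped in a Hash that also records `instance` and `instances`, so a resumed child job can rebuild its own slice of the work. A minimal plain-Ruby sketch of that wrapping follows. The `inner` pairs and the values of `instance` and `instances` are hypothetical.

```ruby
# Hypothetical inner enumerator yielding [object, cursor] pairs.
inner = [["a", 0], ["b", 1]].each

instance = 1
instances = 4

wrapped = Enumerator.new do |yielder|
  inner.each do |object, inner_cursor|
    # Same cursor shape as in the hunk: string keys plus the inner cursor.
    cursor = { "instance" => instance, "instances" => instances, "inner_cursor" => inner_cursor }
    yielder.yield(object, cursor)
  end
end

pairs = wrapped.to_a
```

On resume, the constructor in the hunk reads `"instance"`, `"instances"`, and `"inner_cursor"` back out of this Hash and passes them to the block that builds the inner enumerator.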
|
@@ -5,8 +5,7 @@ return unless defined?(Rails::Railtie)
|
|
|
5
5
|
module JobIteration
|
|
6
6
|
class Railtie < Rails::Railtie
|
|
7
7
|
initializer "job_iteration.register_deprecator" do |app|
|
|
8
|
-
|
|
9
|
-
app.deprecators[:job_iteration] = JobIteration::Deprecation if app.respond_to?(:deprecators)
|
|
8
|
+
app.deprecators[:job_iteration] = JobIteration::Deprecation
|
|
10
9
|
end
|
|
11
10
|
end
|
|
12
11
|
end
|
|
@@ -105,15 +105,11 @@ module Tapioca
|
|
|
105
105
|
).returns(T::Array[RBI::TypedParam])
|
|
106
106
|
end
|
|
107
107
|
def perform_later_parameters(parameters, returned_job_class)
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
)]
|
|
114
|
-
else
|
|
115
|
-
parameters
|
|
116
|
-
end
|
|
108
|
+
parameters.reject! { |typed_param| RBI::BlockParam === typed_param.param }
|
|
109
|
+
parameters + [create_block_param(
|
|
110
|
+
"block",
|
|
111
|
+
type: "T.nilable(T.proc.params(job: #{returned_job_class}).void)",
|
|
112
|
+
)]
|
|
117
113
|
end
|
|
118
114
|
|
|
119
115
|
class << self
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: job-iteration
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 1.
|
|
4
|
+
version: 1.15.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Shopify
|
|
@@ -15,14 +15,14 @@ dependencies:
|
|
|
15
15
|
requirements:
|
|
16
16
|
- - ">="
|
|
17
17
|
- !ruby/object:Gem::Version
|
|
18
|
-
version: '7.
|
|
18
|
+
version: '7.1'
|
|
19
19
|
type: :runtime
|
|
20
20
|
prerelease: false
|
|
21
21
|
version_requirements: !ruby/object:Gem::Requirement
|
|
22
22
|
requirements:
|
|
23
23
|
- - ">="
|
|
24
24
|
- !ruby/object:Gem::Version
|
|
25
|
-
version: '7.
|
|
25
|
+
version: '7.1'
|
|
26
26
|
description: Makes your background jobs interruptible and resumable.
|
|
27
27
|
email:
|
|
28
28
|
- ops-accounts+shipit@shopify.com
|
|
@@ -52,6 +52,7 @@ files:
|
|
|
52
52
|
- lib/job-iteration/iteration.rb
|
|
53
53
|
- lib/job-iteration/log_subscriber.rb
|
|
54
54
|
- lib/job-iteration/nested_enumerator.rb
|
|
55
|
+
- lib/job-iteration/parallel_enumerator.rb
|
|
55
56
|
- lib/job-iteration/railtie.rb
|
|
56
57
|
- lib/job-iteration/test_helper.rb
|
|
57
58
|
- lib/job-iteration/throttle_enumerator.rb
|
|
@@ -77,7 +78,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
|
|
|
77
78
|
- !ruby/object:Gem::Version
|
|
78
79
|
version: '0'
|
|
79
80
|
requirements: []
|
|
80
|
-
rubygems_version: 4.0.
|
|
81
|
+
rubygems_version: 4.0.12
|
|
81
82
|
specification_version: 4
|
|
82
83
|
summary: Makes your background jobs interruptible and resumable.
|
|
83
84
|
test_files: []
|