job-iteration 1.1.3 → 1.1.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.


checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 3321e73fd83420a90163ec8963f381897cd1bd6e06717ce3c1e5db645e3aa823
-  data.tar.gz: bae3d1af0d3d803423c42c0085d3b5c9487c804e7362a9a70fe845e6d1abc406
+  metadata.gz: 99e26756140f1346b646a3931dc6cc8de1181a67a27916d9a0d85deb1d4dddeb
+  data.tar.gz: '058d65ec35e8cebbf74ba37c2227aa63cb1b5fbf183c81ee4c1b4f1cc3a9d033'
 SHA512:
-  metadata.gz: f78e4bbc18af7343762de3dbfbb4fc958512b7e62806a8adb372f5d94d3fa37b2e16fa0228f343e37679516ddb477099b8d9c3f4517493eb9fb9594ffba86e9e
-  data.tar.gz: d1593c8350cf85a78754375f808fd467421a9b04349fb776b3e7471a4ab12d3df14e104021c17ffc80f9595248b2e99659a6747954296dd0d15afefa244bc4ec
+  metadata.gz: 86a6a0f8ef55ab7502215fc4efcd6bc6e7b7a0b4737decf470db964d2302705d42ab1d16eff904e0d2c914f7ac30ca66cd2b99fffcb6aeaf1db904619fd89418
+  data.tar.gz: 7d9c9d5023f56365e2a1040fc7ccbb57f01c73f50c7addeaaedcf64edc2dab5cf2383d56cda46b832cdc6acce367abf87757dbd1113971335f51d675883e4141
@@ -5,7 +5,7 @@ AllCops:
   TargetRubyVersion: 2.4.4
   Exclude:
     - 'vendor/bundle/**/*'
-Lint/HandleExceptions:
+Lint/SuppressedException:
   Exclude:
     - lib/job-iteration.rb
 Style/GlobalVars:
@@ -3,8 +3,9 @@ services:
   - redis-server
 language: ruby
 rvm:
-  - 2.5.5
-  - 2.6.2
+  - 2.5
+  - 2.6
+  - 2.7
 before_install:
   - mysql -e 'CREATE DATABASE job_iteration_test;'
 script:
@@ -14,4 +15,5 @@ script:
 
 gemfile:
   - 'gemfiles/rails_5_2.gemfile'
+  - 'gemfiles/rails_6_0.gemfile'
   - 'gemfiles/rails_edge.gemfile'
@@ -4,6 +4,28 @@
 
 #### Bug fix
 
+## v1.1.8 (June 8, 2020)
+
+- Preserve ruby2_keywords tags in arguments on Ruby 2.7
+
+## v1.1.7 (June 4, 2020)
+
+- [54](https://github.com/Shopify/job-iteration/pull/54) - Fix warnings on Ruby 2.7
+
+## v1.1.6 (May 22, 2020)
+
+- [49](https://github.com/Shopify/job-iteration/pull/49) - Log when enumerator has nothing to iterate
+- [52](https://github.com/Shopify/job-iteration/pull/52) - Fix CSVEnumerator cursor to properly remove already processed rows
+
+## v1.1.5 (February 27, 2020)
+
+- [47](https://github.com/Shopify/job-iteration/pull/47) - Optional `sorbet-runtime` support for `JobIteration::Iteration` interface validation
+
+## v1.1.4 (December 13, 2019)
+
+- [45](https://github.com/Shopify/job-iteration/pull/45) - Add Throttle enumerator
+
+
 ### v1.1.3 (August 20, 2019)
 
 - [36](https://github.com/shopify/job-iteration/pull/39) - Check method validation at job initialization step
data/Gemfile CHANGED
@@ -11,7 +11,7 @@ gemspec
 gem 'sidekiq'
 gem 'resque'
 
-gem 'mysql2', '~> 0.4.4'
+gem 'mysql2', '~> 0.5'
 gem 'globalid'
 gem 'i18n'
 gem 'redis'
@@ -20,6 +20,9 @@ gem 'database_cleaner'
 gem 'pry'
 gem 'mocha'
 
-gem 'rubocop'
+gem 'rubocop', '~> 0.77.0'
 gem 'yard'
 gem 'rake'
+
+# for unit testing optional sorbet support
+gem 'sorbet-runtime'
data/README.md CHANGED
@@ -112,6 +112,7 @@ Iteration hooks into Sidekiq and Resque out of the box to support graceful inter
 * [Iteration: how it works](guides/iteration-how-it-works.md)
 * [Best practices](guides/best-practices.md)
 * [Writing custom enumerator](guides/custom-enumerator.md)
+* [Throttling](guides/throttling.md)
 
 For more detailed documentation, see [rubydoc](https://www.rubydoc.info/github/Shopify/job-iteration).
 
data/dev.yml CHANGED
@@ -7,7 +7,7 @@ up:
   - mysql-client:
       or: [mysql@5.7]
   - ruby:
-      version: 2.6.2
+      version: 2.6.5
   - railgun
   - bundler
   - custom:
@@ -0,0 +1,6 @@
+# frozen_string_literal: true
+
+eval_gemfile '../Gemfile'
+
+gem 'activejob', '~> 6.0.0'
+gem 'activerecord', '~> 6.0.0'
@@ -8,7 +8,7 @@ class ListJob < ActiveJob::Base
 
   def build_enumerator(*)
     @redis = Redis.new
-    Enumerator.new |yielder|
+    Enumerator.new do |yielder|
       yielder.yield @redis.lpop(key), nil
     end
   end
@@ -0,0 +1,46 @@
+Iteration comes with a special wrapper enumerator that allows you to throttle iterations based on external signal (e.g. database health).
+
+Consider this example:
+
+```ruby
+class InactiveAccountDeleteJob < ActiveJob::Base
+  include JobIteration::Iteration
+
+  def build_enumerator(_params, cursor:)
+    enumerator_builder.active_record_on_batches(
+      Account.inactive,
+      cursor: cursor
+    )
+  end
+
+  def each_iteration(batch, _params)
+    Account.where(id: batch.map(&:id)).delete_all
+  end
+end
+```
+
+For an app that keeps track of customer accounts, it's typical to purge old data that's no longer relevant for storage.
+
+At the same time, if you've got a lot of DB writes to perform, this can cause extra load on the database and slow down other parts of your service.
+
+You can change `build_enumerator` to wrap enumeration on DB rows into a throttle enumerator, which takes signal as a proc and enqueues the job for later in case the proc returned `true`.
+
+```ruby
+def build_enumerator(_params, cursor:)
+  enumerator_builder.build_throttle_enumerator(
+    enumerator_builder.active_record_on_batches(
+      Account.inactive,
+      cursor: cursor
+    ),
+    throttle_on: -> { DatabaseStatus.unhealthy? },
+    backoff: 30.seconds
+  )
+end
+```
+
+Note that it's up to you to implement `DatabaseStatus.unhealthy?` that works for your database choice. At Shopify, a helper like `DatabaseStatus` checks the following MySQL metrics:
+
+* Replication lag across all regions
+* DB threads
+* DB is available for writes (otherwise indicates a failover happening)
+* [Semian](https://github.com/shopify/semian) open circuits
@@ -23,6 +23,7 @@ Gem::Specification.new do |spec|
   spec.require_paths = %w(lib)
 
   spec.metadata["changelog_uri"] = "https://github.com/Shopify/job-iteration/blob/master/CHANGELOG.md"
+  spec.metadata["allowed_push_host"] = "https://rubygems.org"
 
   spec.add_development_dependency("activerecord")
   spec.add_dependency("activejob", ">= 5.2")
@@ -1,6 +1,8 @@
 # frozen_string_literal: true
 
 module JobIteration
+  # Curious about how this works from the SQL perspective?
+  # Check "Pagination Done the Right way": https://bit.ly/2Rq7iPF
   class ActiveRecordCursor # @private
     include Comparable
 
@@ -32,8 +32,8 @@ module JobIteration
     def rows(cursor:)
       @csv.lazy
         .each_with_index
-        .drop(cursor.to_i)
-        .to_enum { count_rows_in_file }
+        .drop(count_of_processed_rows(cursor))
+        .to_enum { count_of_rows_in_file }
     end
 
     # Constructs a enumerator on batches of CSV rows
@@ -42,13 +42,13 @@ module JobIteration
       @csv.lazy
         .each_slice(batch_size)
         .each_with_index
-        .drop(cursor.to_i)
-        .to_enum { (count_rows_in_file.to_f / batch_size).ceil }
+        .drop(count_of_processed_rows(cursor))
+        .to_enum { (count_of_rows_in_file.to_f / batch_size).ceil }
     end
 
     private
 
-    def count_rows_in_file
+    def count_of_rows_in_file
       # TODO: Remove rescue for NoMethodError when Ruby 2.6 is no longer supported.
       begin
         filepath = @csv.path
@@ -63,5 +63,9 @@ module JobIteration
       count -= 1 if @csv.headers
       count
     end
+
+    def count_of_processed_rows(cursor)
+      cursor.nil? ? 0 : cursor + 1
+    end
   end
 end
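
The cursor change in the hunks above can be illustrated with a small standalone sketch. It assumes the cursor stores the index of the last processed row, which is what `cursor + 1` suggests. A plain array stands in for the CSV object, and `rows_after` and `old_rows_after` are hypothetical names used only for this illustration.

```ruby
# Same arithmetic as the helper added in the diff: a nil cursor means nothing
# was processed yet; otherwise the cursor is the index of the last processed row.
def count_of_processed_rows(cursor)
  cursor.nil? ? 0 : cursor + 1
end

# Hypothetical stand-in for the new #rows: skip everything already processed.
def rows_after(rows, cursor)
  rows.lazy.each_with_index.drop(count_of_processed_rows(cursor)).to_a
end

# Hypothetical stand-in for the old #rows: drops only `cursor.to_i` rows.
def old_rows_after(rows, cursor)
  rows.lazy.each_with_index.drop(cursor.to_i).to_a
end

rows = [["a"], ["b"], ["c"]]

first_run   = rows_after(rows, nil)   # starts at index 0
resumed     = rows_after(rows, 0)     # row 0 is done, resumes at index 1
old_resumed = old_rows_after(rows, 0) # old behaviour: row 0 is yielded again
```

With the old `drop(cursor.to_i)`, a job interrupted after row 0 would process row 0 a second time on resume; the new helper skips it.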
@@ -1,6 +1,7 @@
 # frozen_string_literal: true
 require_relative "./active_record_enumerator"
 require_relative "./csv_enumerator"
+require_relative "./throttle_enumerator"
 require "forwardable"
 
 module JobIteration
@@ -61,16 +62,6 @@ module JobIteration
       wrap(self, enumerable.each_with_index.drop(drop).to_enum { enumerable.size })
     end
 
-    # Builds Enumerator from a lock queue instance that belongs to a job.
-    # The helper is only to be used from jobs that use LockQueue module.
-    def build_lock_queue_enumerator(lock_queue, at_most_once:)
-      unless lock_queue.is_a?(BackgroundQueue::LockQueue::RedisQueue) ||
-          lock_queue.is_a?(BackgroundQueue::LockQueue::RolloutRedisQueue)
-        raise ArgumentError, "an argument to #build_lock_queue_enumerator must be a LockQueue"
-      end
-      wrap(self, BackgroundQueue::LockQueueEnumerator.new(lock_queue, at_most_once: at_most_once).to_enum)
-    end
-
     # Builds Enumerator from Active Record Relation. Each Enumerator tick moves the cursor one row forward.
     #
     # +columns:+ argument is used to build the actual query for iteration. +columns+: defaults to primary key:
@@ -119,11 +110,21 @@ module JobIteration
       wrap(self, enum)
     end
 
+    def build_throttle_enumerator(enum, throttle_on:, backoff:)
+      JobIteration::ThrottleEnumerator.new(
+        enum,
+        @job,
+        throttle_on: throttle_on,
+        backoff: backoff
+      ).to_enum
+    end
+
     alias_method :once, :build_once_enumerator
     alias_method :times, :build_times_enumerator
     alias_method :array, :build_array_enumerator
     alias_method :active_record_on_records, :build_active_record_enumerator_on_records
     alias_method :active_record_on_batches, :build_active_record_enumerator_on_batches
+    alias_method :throttle, :build_throttle_enumerator
 
     private
 
@@ -49,6 +49,7 @@ module JobIteration
       self.total_time = 0.0
       assert_implements_methods!
     end
+    ruby2_keywords(:initialize) if respond_to?(:ruby2_keywords, true)
 
     def serialize # @private
       super.merge(
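
The `ruby2_keywords(:initialize)` line in the hunk above can be illustrated in isolation. The sketch below assumes a job-like class that captures its arguments with a splat and replays them later; `SketchJob` and `deliver` are hypothetical names, not part of the gem.

```ruby
# Hypothetical job-like class: it stores constructor arguments and replays
# them later, the way a job stores arguments before performing.
class SketchJob
  attr_reader :arguments

  def initialize(*arguments)
    @arguments = arguments
  end
  # Same guard as in the diff: only mark the method where Ruby supports it.
  ruby2_keywords(:initialize) if respond_to?(:ruby2_keywords, true)
end

# Hypothetical receiver that expects a real keyword argument.
def deliver(id, key:)
  [id, key]
end

job = SketchJob.new(1, key: 2)

# The trailing hash captured by the splat is flagged as keywords, so it is
# passed on as `key: 2` rather than as a positional hash.
replayed = deliver(*job.arguments)
```

Without the flag, Ruby 2.7 warns on the replay and Ruby 3 raises `ArgumentError`, because the captured hash would be passed positionally.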
@@ -69,7 +70,7 @@ module JobIteration
       interruptible_perform(*params)
     end
 
-    def retry_job(*)
+    def retry_job(*, **)
       super unless defined?(@retried) && @retried
       @retried = true
     end
@@ -116,8 +117,10 @@ module JobIteration
 
     def iterate_with_enumerator(enumerator, arguments)
       arguments = arguments.dup.freeze
+      found_record = false
       enumerator.each do |object_from_enumerator, index|
         record_unit_of_work do
+          found_record = true
           each_iteration(object_from_enumerator, *arguments)
           self.cursor_position = index
         end
@@ -128,6 +131,11 @@ module JobIteration
         return false
       end
 
+      logger.info(
+        "[JobIteration::Iteration] Enumerator found nothing to iterate! " \
+        "times_interrupted=#{times_interrupted} cursor_position=#{cursor_position}"
+      ) unless found_record
+
       true
     end
 
@@ -176,7 +184,7 @@ module JobIteration
       end
 
       if respond_to?(:build_enumerator, true)
-        parameters = method(:build_enumerator).parameters
+        parameters = method_parameters(:build_enumerator)
         unless valid_cursor_parameter?(parameters)
           raise ArgumentError, "Iteration job (#{self.class}) #build_enumerator " \
             "expects the keyword argument `cursor`"
@@ -187,6 +195,17 @@ module JobIteration
       end
     end
 
+    def method_parameters(method_name)
+      method = method(method_name)
+
+      if defined?(T::Private::Methods)
+        signature = T::Private::Methods.signature_for_method(method)
+        method = signature.method if signature
+      end
+
+      method.parameters
+    end
+
     def iteration_instrumentation_tags
       { job_class: self.class.name }
     end
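
The validation above relies on `Method#parameters`. The sketch below shows what that call returns and how a `cursor` keyword could be detected. The diff does not show the gem's own `valid_cursor_parameter?`, so `cursor_keyword?` and both job classes are hypothetical stand-ins. The sorbet branch is not exercised here; the diff suggests that with `sorbet-runtime` loaded the original method has to be looked up through the signature before its parameters are read.

```ruby
# Hypothetical check: does the method accept `cursor` as a keyword argument?
def cursor_keyword?(object, method_name)
  object.method(method_name).parameters.any? do |type, name|
    [:keyreq, :key].include?(type) && name == :cursor
  end
end

# Hypothetical job whose build_enumerator follows the expected interface.
class ValidSketchJob
  def build_enumerator(_params, cursor:)
    [].each
  end
end

# Hypothetical job that forgot the cursor keyword.
class InvalidSketchJob
  def build_enumerator(params)
    [].each
  end
end

valid_parameters = ValidSketchJob.new.method(:build_enumerator).parameters
valid   = cursor_keyword?(ValidSketchJob.new, :build_enumerator)
invalid = cursor_keyword?(InvalidSketchJob.new, :build_enumerator)
```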
@@ -0,0 +1,46 @@
+# typed: true
+# frozen_string_literal: true
+module JobIteration
+  # ThrottleEnumerator allows you to throttle iterations
+  # based on external signal (e.g. database health).
+  # @example
+  #   def build_enumerator(_params, cursor:)
+  #     enumerator_builder.build_throttle_enumerator(
+  #       enumerator_builder.active_record_on_batches(
+  #         Account.inactive,
+  #         cursor: cursor
+  #       ),
+  #       throttle_on: -> { DatabaseStatus.unhealthy? },
+  #       backoff: 30.seconds
+  #     )
+  #   end
+  # The enumerator from above will mimic +active_record_on_batches+,
+  # except when +DatabaseStatus.unhealthy?+ starts to return true.
+  # In that case, it will re-enqueue the job with a specified backoff.
+  class ThrottleEnumerator
+    def initialize(enum, job, throttle_on:, backoff:)
+      @enum = enum
+      @job = job
+      @throttle_on = throttle_on
+      @backoff = backoff
+    end
+
+    def to_enum
+      Enumerator.new(-> { @enum.size }) do |yielder|
+        @enum.each do |*val|
+          if should_throttle?
+            ActiveSupport::Notifications.instrument("throttled.iteration", job_class: @job.class.name)
+            @job.retry_job(wait: @backoff)
+            throw(:abort, :skip_complete_callbacks)
+          end
+
+          yielder.yield(*val)
+        end
+      end
+    end
+
+    def should_throttle?
+      @throttle_on.call
+    end
+  end
+end
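
The control flow of the new file can be tried without Active Job using the simplified sketch below. It is not the gem's class: `SketchThrottle` is a hypothetical name, an `on_throttle` callable stands in for the instrumentation and `retry_job(wait:)` calls, and the caller supplies the `catch(:abort)` that the job framework would otherwise provide.

```ruby
# Simplified, framework-free sketch of a throttling wrapper enumerator.
class SketchThrottle
  def initialize(enum, throttle_on:, on_throttle:)
    @enum = enum
    @throttle_on = throttle_on # callable returning true when work should pause
    @on_throttle = on_throttle # stand-in for re-enqueueing the job with a backoff
  end

  def to_enum
    # The lambda gives the wrapper the same lazily computed size as the inner enumerator.
    Enumerator.new(-> { @enum.size }) do |yielder|
      @enum.each do |*val|
        if @throttle_on.call
          @on_throttle.call
          throw(:abort, :skip_complete_callbacks)
        end

        yielder.yield(*val)
      end
    end
  end
end

processed = []
retries = []

# Pairs of [object, cursor index], shaped like what an iteration enumerator yields.
inner = [[:a, 0], [:b, 1], [:c, 2]].each

enum = SketchThrottle.new(
  inner,
  throttle_on: -> { processed.size >= 2 }, # pretend the signal trips after two items
  on_throttle: -> { retries << :retry_later }
).to_enum

result = catch(:abort) do
  enum.each { |item, _index| processed << item }
  :completed
end
```

The signal is checked before every element, so an unhealthy signal stops the run before the next unit of work starts rather than in the middle of one.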
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module JobIteration
-  VERSION = "1.1.3"
+  VERSION = "1.1.8"
 end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: job-iteration
 version: !ruby/object:Gem::Version
-  version: 1.1.3
+  version: 1.1.8
 platform: ruby
 authors:
 - Shopify
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2019-08-20 00:00:00.000000000 Z
+date: 2020-06-08 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activerecord
@@ -58,10 +58,12 @@ files:
 - bin/setup
 - dev.yml
 - gemfiles/rails_5_2.gemfile
+- gemfiles/rails_6_0.gemfile
 - gemfiles/rails_edge.gemfile
 - guides/best-practices.md
 - guides/custom-enumerator.md
 - guides/iteration-how-it-works.md
+- guides/throttling.md
 - job-iteration.gemspec
 - lib/job-iteration.rb
 - lib/job-iteration/active_record_cursor.rb
@@ -72,6 +74,7 @@ files:
 - lib/job-iteration/integrations/sidekiq.rb
 - lib/job-iteration/iteration.rb
 - lib/job-iteration/test_helper.rb
+- lib/job-iteration/throttle_enumerator.rb
 - lib/job-iteration/version.rb
 - railgun.yml
 homepage: https://github.com/shopify/job-iteration
@@ -79,6 +82,7 @@ licenses:
 - MIT
 metadata:
   changelog_uri: https://github.com/Shopify/job-iteration/blob/master/CHANGELOG.md
+  allowed_push_host: https://rubygems.org
 post_install_message:
 rdoc_options: []
 require_paths: