workhorse 1.2.12 → 1.2.13

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 5d242b1602e66d5de0ad72ef85ce8a15e885248ca14a075a2f25bb60b80c0364
- data.tar.gz: db8f8a6aecfe00aae955775b09c7d08568653258ef56ab874839cde702725854
+ metadata.gz: c528399ac973ce36bd6bdb993ac2d234eddf8836b8fc997031dd4009487ed252
+ data.tar.gz: 8482be9f53a1c8b5eb238784a6f350f79a0bf7b5c2c42a3921667ef2a82340d6
  SHA512:
- metadata.gz: aa64c4d9ff6c4085c42018ddd293b04fd8078e702306631d8b9918c917a53903b48832d561216c25f2dcae6c36cbc15f2e6587301302165b1a3fc639c4dfa59a
- data.tar.gz: f76393011d3be7b4002d27a5836e86ada615375130d04ae5f1ea64002fdad4cb5f2410c45e829ad726ce386c600d353208cba9d25de9e296db796f864373f1a0
+ metadata.gz: 83b44c60c1789755e2d09a2f32ea71728287d54ab1904bf5a6f9e88f225d02da0280d784e56d3352c336efb292aa864f841b85d1c085df7425878551bdec27ed
+ data.tar.gz: a554b93bc1c0c18bd1472981e2a84e80f79dfc0561f03c89839cc9e4f32d6480cb4fa327cd046cc8d9582a142af1092a4ffb1c5c492cb79fbb5af2fb3ebef725
data/CHANGELOG.md CHANGED
@@ -1,5 +1,22 @@
  # Workhorse Changelog
 
+ ## 1.2.13 - 2023-02-20
+
+ * Add the `config.max_global_lock_fails` setting (defaults to 10). If a
+ worker's poller cannot acquire the global lock, an error is logged, and if
+ `config.on_exception` is configured, the error is handled using this callback.
+
+ This change allows you to be aware of essentially defunct worker processes due
+ to a global lock that could not be obtained, for example, because of another
+ worker that was killed without properly releasing the lock. However, this is
+ an edge case because:
+
+ 1. The lock is released by Workhorse in an `ensure` block.
+ 2. At least MySQL is supposed to release global locks obtained in a connection
+ when that connection is closed.
+
+ Sitrox reference: #110339.
+
  ## 1.2.12 - 2023-01-18
 
  * Call `on_exception` callback on failed `Performer` initialization (e.g. when
data/README.md CHANGED
@@ -419,6 +419,24 @@ This means that you should always have an external *watcher* (usually a
  cronjob), that calls the `workhorse watch` command regularly. This would
  automatically restart crashed worker processes.
 
+ ### Unobtainable Locks
+
+ Each Workhorse worker uses a poller to check the database for new jobs. To
+ ensure that no job is obtained by more than one worker, a global database lock
+ is used. If a worker is killed, it may happen that the lock is not properly
+ released, which can cause all pollers to stop working because they cannot
+ acquire the lock. This is an edge case, as the locks should typically be
+ released properly, even if a worker process is killed using `SIGKILL`.
+
+ In the event that this still happens, Workhorse takes the following steps:
+
+ - Logs when a lock could not be obtained.
+ - Retries acquiring the lock on the next poll.
+ - Calls the `on_exception` callback (if configured) after a configurable number of consecutive failures to obtain the lock.
+
+ The maximum number of consecutive failures can be configured using
+ `config.max_global_lock_fails`, which defaults to 10.
+
  ### Stuck queues
 
  Jobs in named queues (non-null queues) are always run sequentially. This means
data/VERSION CHANGED
@@ -1 +1 @@
- 1.2.12
+ 1.2.13
@@ -15,6 +15,8 @@ module Workhorse
  @table = Workhorse::DbJob.arel_table
  @is_oracle = ActiveRecord::Base.connection.adapter_name == 'OracleEnhanced'
  @instant_repoll = Concurrent::AtomicBoolean.new(false)
+ @global_lock_fails = 0
+ @max_global_lock_fails_reached = false
  end
 
  def running?
@@ -86,6 +88,38 @@ module Workhorse
  success = result == 1
  end
 
+ if success
+ @global_lock_fails = 0
+ @max_global_lock_fails_reached = false
+ else
+ @global_lock_fails += 1
+
+ unless @max_global_lock_fails_reached
+ worker.log 'Could not obtain global lock, retrying with next poll.', :warn
+ end
+
+ if @global_lock_fails > Workhorse.max_global_lock_fails && !@max_global_lock_fails_reached
+ @max_global_lock_fails_reached = true
+
+ worker.log 'Could not obtain global lock, retrying with next poll. '\
+ 'This will be the last such message for this worker until '\
+ 'the issue is resolved.', :warn
+
+ message = "Worker reached maximum number of consecutive times (#{Workhorse.max_global_lock_fails}) " \
+ "where the global lock could not be acquired within the specified timeout (#{timeout}). " \
+ 'A worker that obtained this lock may have crashed without ending the database ' \
+ 'connection properly. On MySQL, use "show processlist;" to see which connection(s) ' \
+ 'is / are holding the lock for a long period of time and consider killing them using '\
+ "MySQL's \"kill <Id>\" command. This message will be issued only once per worker " \
+ "and may only be re-triggered if the error happens again *after* the lock has " \
+ "been solved in the meantime."
+
+ worker.log message
+ exception = StandardError.new(message)
+ Workhorse.on_exception.call(exception)
+ end
+ end
+
  return unless success
 
  yield
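
The counter logic added in this hunk can be modelled in isolation. The following is a minimal sketch, not the gem's code: the class name `LockFailureTracker` and its `record` method are hypothetical, and logging is left out. It assumes only what the hunk shows, namely that the counter resets on success and that the callback fires once the number of consecutive failures exceeds the configured maximum.

```ruby
# Hypothetical stand-alone model of the failure counter shown in the hunk
# above. It is not part of Workhorse; names are for illustration only.
class LockFailureTracker
  attr_reader :fails

  def initialize(max_fails:, on_exception:)
    @max_fails = max_fails
    @on_exception = on_exception
    @fails = 0
    @max_reached = false
  end

  # Records the outcome of one lock attempt.
  # Returns true if the on_exception callback was invoked for this attempt.
  def record(success)
    if success
      # A successful acquisition resets the counter and re-arms the callback.
      @fails = 0
      @max_reached = false
      return false
    end

    @fails += 1
    # Same comparison as in the hunk: strictly greater than the maximum,
    # and only once until a success resets the state.
    return false if @max_reached || @fails <= @max_fails

    @max_reached = true
    @on_exception.call(StandardError.new("global lock not acquired #{@fails} times in a row"))
    true
  end
end

reported = []
tracker = LockFailureTracker.new(max_fails: 2, on_exception: ->(e) { reported << e.message })

# Four failures, one success, then three more failures.
results = [false, false, false, false, true, false, false, false].map { |ok| tracker.record(ok) }
```

Because the comparison is `>` rather than `>=`, the callback in this model fires on the attempt after the maximum is reached. With the default of 10, that would correspond to the 11th consecutive failure.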
data/lib/workhorse.rb CHANGED
@@ -20,6 +20,12 @@ module Workhorse
  || fail('No performer is associated with the current thread. This method must always be called inside of a job.')
  end
 
+ # A worker will log an error and, if defined, call the on_exception callback,
+ # if it couldn't obtain the global lock for the specified number of times in a
+ # row.
+ mattr_accessor :max_global_lock_fails
+ self.max_global_lock_fails = 10
+
  mattr_accessor :tx_callback
  self.tx_callback = proc do |*args, &block|
  ActiveRecord::Base.transaction(*args, &block)
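
For orientation, here is a sketch of how the new accessor might be set from an application. It assumes a `Workhorse.setup` helper that yields the module as `config`, which this diff does not show. The stand-in module at the top exists only so the snippet runs without the gem, and the value 5 and the `warn` call are illustrative.

```ruby
# Stand-in so the sketch runs without the gem. In a real application the
# workhorse gem defines this module and its accessors itself.
unless defined?(Workhorse)
  module Workhorse
    class << self
      attr_accessor :max_global_lock_fails, :on_exception

      # Assumption: mirrors a setup helper that yields the module itself.
      def setup
        yield self
      end
    end

    self.max_global_lock_fails = 10 # default taken from the hunk above
  end
end

Workhorse.setup do |config|
  # Tolerate fewer consecutive lock failures before on_exception is called.
  config.max_global_lock_fails = 5

  # Receives the StandardError built by the poller. Forward it to whatever
  # error reporting the application uses; `warn` is only a placeholder.
  config.on_exception = proc do |exception|
    warn "[workhorse] #{exception.message}"
  end
end
```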
data/workhorse.gemspec CHANGED
@@ -1,14 +1,14 @@
  # -*- encoding: utf-8 -*-
- # stub: workhorse 1.2.12 ruby lib
+ # stub: workhorse 1.2.13 ruby lib
 
  Gem::Specification.new do |s|
  s.name = "workhorse".freeze
- s.version = "1.2.12"
+ s.version = "1.2.13"
 
  s.required_rubygems_version = Gem::Requirement.new(">= 0".freeze) if s.respond_to? :required_rubygems_version=
  s.require_paths = ["lib".freeze]
  s.authors = ["Sitrox".freeze]
- s.date = "2023-01-18"
+ s.date = "2023-02-20"
  s.files = [".github/workflows/ruby.yml".freeze, ".gitignore".freeze, ".releaser_config".freeze, ".rubocop.yml".freeze, "CHANGELOG.md".freeze, "FAQ.md".freeze, "Gemfile".freeze, "LICENSE".freeze, "README.md".freeze, "RUBY_VERSION".freeze, "Rakefile".freeze, "VERSION".freeze, "bin/rubocop".freeze, "lib/active_job/queue_adapters/workhorse_adapter.rb".freeze, "lib/generators/workhorse/install_generator.rb".freeze, "lib/generators/workhorse/templates/bin/workhorse.rb".freeze, "lib/generators/workhorse/templates/config/initializers/workhorse.rb".freeze, "lib/generators/workhorse/templates/create_table_jobs.rb".freeze, "lib/workhorse.rb".freeze, "lib/workhorse/daemon.rb".freeze, "lib/workhorse/daemon/shell_handler.rb".freeze, "lib/workhorse/db_job.rb".freeze, "lib/workhorse/enqueuer.rb".freeze, "lib/workhorse/jobs/cleanup_succeeded_jobs.rb".freeze, "lib/workhorse/jobs/detect_stale_jobs_job.rb".freeze, "lib/workhorse/jobs/run_active_job.rb".freeze, "lib/workhorse/jobs/run_rails_op.rb".freeze, "lib/workhorse/performer.rb".freeze, "lib/workhorse/poller.rb".freeze, "lib/workhorse/pool.rb".freeze, "lib/workhorse/scoped_env.rb".freeze, "lib/workhorse/worker.rb".freeze, "test/active_job/queue_adapters/workhorse_adapter_test.rb".freeze, "test/lib/db_schema.rb".freeze, "test/lib/jobs.rb".freeze, "test/lib/test_helper.rb".freeze, "test/workhorse/db_job_test.rb".freeze, "test/workhorse/enqueuer_test.rb".freeze, "test/workhorse/performer_test.rb".freeze, "test/workhorse/poller_test.rb".freeze, "test/workhorse/pool_test.rb".freeze, "test/workhorse/worker_test.rb".freeze, "workhorse.gemspec".freeze]
  s.rubygems_version = "3.0.3".freeze
  s.summary = "Multi-threaded job backend with database queuing for ruby.".freeze
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: workhorse
  version: !ruby/object:Gem::Version
- version: 1.2.12
+ version: 1.2.13
  platform: ruby
  authors:
  - Sitrox
- autorequire:
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2023-01-18 00:00:00.000000000 Z
+ date: 2023-02-20 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler
@@ -178,8 +178,8 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
- description:
- email:
+ description:
+ email:
  executables: []
  extensions: []
  extra_rdoc_files: []
@@ -227,10 +227,10 @@ files:
  - test/workhorse/pool_test.rb
  - test/workhorse/worker_test.rb
  - workhorse.gemspec
- homepage:
+ homepage:
  licenses: []
  metadata: {}
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -245,8 +245,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.3.11
- signing_key:
+ rubygems_version: 3.0.3.1
+ signing_key:
  specification_version: 4
  summary: Multi-threaded job backend with database queuing for ruby.
  test_files: