workhorse 0.6.8 → 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +57 -0
- data/FAQ.md +5 -1
- data/README.md +52 -2
- data/Rakefile +1 -0
- data/VERSION +1 -1
- data/lib/workhorse.rb +12 -0
- data/lib/workhorse/daemon.rb +6 -6
- data/lib/workhorse/db_job.rb +7 -0
- data/lib/workhorse/jobs/cleanup_succeeded_jobs.rb +1 -1
- data/lib/workhorse/jobs/detect_stale_jobs_job.rb +48 -0
- data/lib/workhorse/performer.rb +2 -2
- data/lib/workhorse/poller.rb +68 -46
- data/lib/workhorse/worker.rb +3 -3
- data/test/lib/test_helper.rb +10 -1
- data/test/workhorse/poller_test.rb +79 -0
- data/workhorse.gemspec +7 -4
- metadata +17 -2
checksums.yaml
CHANGED

```diff
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4e083c55d647214dd5c164b6cf60f41effbd3977d4e9cb4caf1b5f7f91fe5e3d
+  data.tar.gz: 1c5c25756520f509d69372aed0a89f2a6f90e988c28388c0144717b418cad2ba
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8f11391a2d5aa0953f6aace485b4e6c769a7c7395499b5050a02b4f54491cb0d1e46cd57322e4f1cffb84d3e6d3d38136439656b17d45f4486022cbf3576516c
+  data.tar.gz: 607d765c6d5d3960e7dfa67d0a2aebdd9cdb3bbf3f6538cb4512c80c4815aedc5def6c547214e4109f225224adb2161714c4ca7cbd602ab12b70ff1cd5baadbf
```
data/CHANGELOG.md
CHANGED

````diff
@@ -1,5 +1,62 @@
 # Workhorse Changelog
 
+## 1.0.0 - 2020-09-21
+
+* Stable release, identical to 1.0.0.beta2 but now extensively battle-tested
+
+## 1.0.0.beta2 - 2020-08-27
+
+* Add option `config.silence_poller_exceptions` (default `false`)
+
+* Add option `config.silence_watcher` (default `false`)
+
+## 1.0.0.beta1 - 2020-08-20
+
+This is a stability release that is still experimental and has to be tested in
+battle before it can be considered stable.
+
+* Stop passing ActiveRecord job objects between polling and worker threads to
+  avoid AR race conditions. Now only IDs are passed between threads.
+
+## 1.0.0.beta0 - 2020-08-19
+
+This is a stability release that is still experimental and has to be tested in
+battle before it can be considered stable.
+
+* Simplify locking during polling. Other than locking individual jobs, pollers
+  now acquire a global lock. While this can lead to many pollers waiting for
+  each other's locks, performing a poll is usually done very quickly and the
+  performance drawback is to be considered negligible. This change should work
+  around some deadlock issues as well as an issue where a job was obtained by
+  more than one poller.
+
+* Shut down worker if polling encountered any kind of error (running jobs will
+  be completed whenever possible). This allows a potential watcher job to
+  restore the failed process.
+
+* Make unit test database connection configurable using environment variables
+  `DB_NAME`, `DB_USERNAME`, `DB_PASSWORD` and `DB_HOST`. This is only relevant
+  if you are working on workhorse and need to run the unit tests.
+
+* Fix misbehaviour where queueless jobs were not picked up by workers as long as
+  a named queue was in a locked state.
+
+* Add built-in job `Workhorse::Jobs::DetectStaleJobsJob` which you can schedule.
+  It picks up jobs that remained `locked` or `started` (running) for more than a
+  certain amount of time. If any of these jobs are found, an exception is thrown
+  (which may cause a notification if you configured `on_exception` accordingly).
+  See the job's API documentation for more information.
+
+  **If using Oracle:** Make sure to grant execute permission to the package
+  `DBMS_LOCK` for your Oracle database schema:
+
+  ```GRANT execute ON DBMS_LOCK TO <schema-name>;```
+
+## 0.6.9 - 2020-04-22
+
+* Fix error where processes may have mistakenly been detected as running (add a
+  further improvement to the fix in 0.6.7).
+
 ## 0.6.8 - 2020-04-07
 
 * Fix bug introduced in 0.6.7 where all processes were detected as running
````
data/FAQ.md
CHANGED

```diff
@@ -74,4 +74,8 @@ production mode.
 
 ## Why does workhorse not support timeouts?
 
-Generic timeout implementations are [a dangerous
+Generic timeout implementations are [a dangerous
+thing](http://www.mikeperham.com/2015/05/08/timeout-rubys-most-dangerous-api/)
+in Ruby. This is why we decided against providing this feature in Workhorse and
+recommend implementing timeouts inside of your jobs - e.g. via network
+timeouts.
```
data/README.md
CHANGED

````diff
@@ -66,6 +66,15 @@ What it does not do:
 
 Please customize the initializer and worker script to your liking.
 
+### Oracle
+
+When using Oracle databases, make sure your schema has access to the package
+`DBMS_LOCK`:
+
+```
+GRANT execute ON DBMS_LOCK TO <schema-name>;
+```
+
 ## Queuing jobs
 
 ### Basic jobs
@@ -281,6 +290,10 @@ Workhorse.setup do |config|
 end
 ```
 
+Using the settings `config.silence_poller_exceptions` and
+`config.silence_watcher`, you can silence certain exceptions / error outputs
+(both are disabled by default).
+
 ## Handling database jobs
 
 Jobs stored in the database can be accessed via the ActiveRecord model
@@ -320,7 +333,6 @@ DbJob.started
 DbJob.succeeded
 DbJob.failed
 ```
-
 ### Resetting jobs
 
 Jobs in a state other than `waiting` are either being processed or else already
@@ -352,7 +364,6 @@ Performing a reset will reset the job state to `waiting` and it will be
 processed again. All meta fields will be reset as well. See inline documentation
 of `Workhorse::DbJob#reset!` for more details.
 
-
 ## Using workhorse with Rails / ActiveJob
 
 While workhorse can be used though its custom interface as documented above, it
@@ -365,6 +376,45 @@ To use workhorse as your ActiveJob backend, set the `queue_adapter` to
 configuration or else using `self.queue_adapter` in a job class inheriting from
 `ActiveJob`. See ActiveJob documentation for more details.
 
+## Cleaning up jobs
+
+Per default, jobs remain in the database, no matter in which state. This can
+eventually lead to a very large jobs database. You are advised to clean your
+jobs database on a regular interval. Workhorse provides the job
+`Workhorse::Jobs::CleanupSucceededJobs` for this purpose that cleans up all
+succeeded jobs. You can run this using your scheduler in a specific interval.
+
+## Caveats
+
+### Errors during polling / crashed workers
+
+Each worker process includes one thread that polls the database for jobs and
+dispatches them to individual worker threads. In case of an error in the poller
+(usually due to a database connection drop), the poller aborts and gracefully
+shuts down the entire worker. Jobs still being processed by this worker are
+attempted to be completed during this shutdown (which only works if the database
+connection is still active).
+
+This means that you should always have an external *watcher* (usually a
+cronjob) that calls the `workhorse watch` command regularly. This will
+automatically restart crashed worker processes.
+
+### Stuck queues
+
+Jobs in named queues (non-null queues) are always run sequentially. This means
+that if a job in such a queue is stuck in state `locked` or `started` (e.g. due
+to a database connection failure), no more jobs of this queue will be run, as
+the entire queue is considered locked to ensure that no jobs of the same queue
+run in parallel.
+
+For this purpose, Workhorse provides the built-in job
+`Workhorse::Jobs::DetectStaleJobsJob`, which you are advised to schedule on a
+regular basis. It picks up jobs that remained `locked` or `started` (running)
+for more than a certain amount of time. If any of these jobs are found, an
+exception is thrown (which may cause a notification if you configured
+`on_exception` accordingly). See the job's API documentation for more
+information.
+
 ## Frequently asked questions
 
 Please consult the [FAQ](FAQ.md).
````
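The "Stuck queues" rule described in the README diff above can be modeled in a few lines of plain Ruby. This is a simplified sketch of the selection rule only (the real poller implements it with Arel/SQL, as shown in the `poller.rb` section below is not assumed here): a named queue with any job in state `locked` or `started` is skipped entirely, while nil-queue jobs are always eligible.

```ruby
# Simplified model of Workhorse's queue-locking rule. Job records are plain
# hashes here; the real implementation queries the jobs table instead.
jobs = [
  { id: 1, queue: nil, state: :waiting },
  { id: 2, queue: 'a', state: :started },
  { id: 3, queue: 'a', state: :waiting }, # stuck behind job 2
  { id: 4, queue: 'b', state: :waiting }
]

# Named queues that currently have a job in a "busy" state are off-limits.
stuck_queues = jobs.select { |j| %i[locked started].include?(j[:state]) }
                   .map { |j| j[:queue] }.compact.uniq

# Waiting jobs are eligible if queueless or if their queue is not stuck.
eligible_ids = jobs.select do |j|
  j[:state] == :waiting && (j[:queue].nil? || !stuck_queues.include?(j[:queue]))
end.map { |j| j[:id] }

eligible_ids # => [1, 4]
```

Note that job 3 is skipped even though it is `waiting`: its queue `'a'` is considered locked by the started job 2, which is exactly the situation `DetectStaleJobsJob` is meant to flag.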
data/Rakefile
CHANGED

```diff
@@ -19,6 +19,7 @@ task :gemspec do
   spec.add_development_dependency 'colorize'
   spec.add_development_dependency 'benchmark-ips'
   spec.add_development_dependency 'activejob'
+  spec.add_development_dependency 'pry'
   spec.add_dependency 'activesupport'
   spec.add_dependency 'activerecord'
   spec.add_dependency 'schemacop', '~> 2.0'
```
data/VERSION
CHANGED

```diff
@@ -1 +1 @@
-0.
+1.0.0
```
data/lib/workhorse.rb
CHANGED

```diff
@@ -30,6 +30,17 @@ module Workhorse
     # ExceptionNotifier.notify_exception(exception)
   end
 
+  # If set to `true`, the defined `on_exception` will not be called when the
+  # poller encounters an exception and the worker has to be shut down. The
+  # exception will still be logged.
+  mattr_accessor :silence_poller_exceptions
+  self.silence_poller_exceptions = false
+
+  # If set to `true`, the `watch` command won't produce any output. This does
+  # not include warnings such as the "development mode" warning.
+  mattr_accessor :silence_watcher
+  self.silence_watcher = false
+
   mattr_accessor :perform_jobs_in_tx
   self.perform_jobs_in_tx = true
 
@@ -46,6 +57,7 @@ require 'workhorse/worker'
 require 'workhorse/jobs/run_rails_op'
 require 'workhorse/jobs/run_active_job'
 require 'workhorse/jobs/cleanup_succeeded_jobs'
+require 'workhorse/jobs/detect_stale_jobs_job'
 
 # Daemon functionality is not available on java platforms
 if RUBY_PLATFORM != 'java'
```
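The effect of the new `silence_poller_exceptions` flag can be sketched without the gem. The class and method names below (`PollerSketch`, `handle_poll_error`) are our own illustration, not Workhorse API; the real flag is an ActiveSupport `mattr_accessor` on the `Workhorse` module, consulted by the poller before it invokes `Workhorse.on_exception`.

```ruby
# Minimal pure-Ruby model of the silence_poller_exceptions flag: when set,
# the on_exception callback is skipped (the real poller still logs the error).
class PollerSketch
  attr_accessor :silence_poller_exceptions

  def initialize(on_exception)
    @on_exception = on_exception
    @silence_poller_exceptions = false
  end

  def handle_poll_error(error)
    @on_exception.call(error) unless silence_poller_exceptions
  end
end

notified = []
poller = PollerSketch.new(->(e) { notified << e.message })

poller.handle_poll_error(RuntimeError.new('connection lost'))   # notifies
poller.silence_poller_exceptions = true
poller.handle_poll_error(RuntimeError.new('connection lost'))   # silenced

notified.size # => 1
```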
data/lib/workhorse/daemon.rb
CHANGED

```diff
@@ -38,21 +38,21 @@ module Workhorse
       @workers << Worker.new(@workers.size + 1, name, &block)
     end
 
-    def start
+    def start(quiet: false)
       code = 0
 
       for_each_worker do |worker|
         pid_file, pid = read_pid(worker)
 
         if pid_file && pid
-          warn "Worker ##{worker.id} (#{worker.name}): Already started (PID #{pid})"
+          warn "Worker ##{worker.id} (#{worker.name}): Already started (PID #{pid})" unless quiet
           code = 1
         elsif pid_file
           File.delete pid_file
-          puts "Worker ##{worker.id} (#{worker.name}): Starting (stale pid file)"
+          puts "Worker ##{worker.id} (#{worker.name}): Starting (stale pid file)" unless quiet
           start_worker worker
         else
-          warn "Worker ##{worker.id} (#{worker.name}): Starting"
+          warn "Worker ##{worker.id} (#{worker.name}): Starting" unless quiet
           start_worker worker
         end
       end
@@ -109,7 +109,7 @@ module Workhorse
       end
 
       if should_be_running && status(quiet: true) != 0
-        return start
+        return start(quiet: Workhorse.silence_watcher)
       else
         return 0
       end
@@ -164,7 +164,7 @@ module Workhorse
       return begin
         Process.kill(0, pid)
         true
-      rescue Errno::EPERM
+      rescue Errno::EPERM, Errno::ESRCH
         false
       end
     end
```
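The last hunk above is the fix behind the 0.6.9 changelog entry ("processes mistakenly detected as running"): sending signal 0 performs only an existence/permission check, and a PID that no longer exists raises `Errno::ESRCH`, which previously escaped the rescue. A standalone sketch of the probe (the helper name `process_running?` is ours; Workhorse wraps this logic in a daemon method):

```ruby
# Liveness probe via signal 0: no signal is delivered, but the kernel reports
# whether the PID exists. ESRCH (no such process) must count as "not running",
# just like EPERM did before.
def process_running?(pid)
  Process.kill(0, pid)
  true
rescue Errno::EPERM, Errno::ESRCH
  false
end

puts process_running?(Process.pid) # the current process is always running
```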
data/lib/workhorse/db_job.rb
CHANGED

```diff
@@ -69,7 +69,14 @@ module Workhorse
         fail "Dirty jobs can't be locked."
       end
 
+      # TODO: Remove this debug output
+      # if Workhorse::DbJob.lock.find(id).locked_at
+      #   puts "Already locked (with FOR UPDATE)"
+      # end
+
       if locked_at
+        # TODO: Remove this debug output
+        # puts "Already locked. Job: #{self.id} Worker: #{worker_id}"
         fail "Job #{id} is already locked by #{locked_by.inspect}."
       end
 
```

data/lib/workhorse/jobs/detect_stale_jobs_job.rb
ADDED

```diff
@@ -0,0 +1,48 @@
+module Workhorse::Jobs
+  class DetectStaleJobsJob
+    # Instantiates a new stale detection job.
+    #
+    # @param locked_to_started_threshold [Integer] The maximum number of seconds
+    #   a job is allowed to stay 'locked' before this job throws an exception.
+    #   Set this to 0 to skip this check.
+    # @param run_time_threshold [Integer] The maximum number of seconds
+    #   a job is allowed to run before this job throws an exception. Set this to
+    #   0 to skip this check.
+    def initialize(locked_to_started_threshold: 3 * 60, run_time_threshold: 12 * 60)
+      @locked_to_started_threshold = locked_to_started_threshold
+      @run_time_threshold = run_time_threshold
+    end
+
+    def perform
+      messages = []
+
+      # Detect jobs that are locked for too long #
+      if @locked_to_started_threshold != 0
+        rel = Workhorse::DbJob.locked
+        rel = rel.where('locked_at < ?', @locked_to_started_threshold.seconds.ago)
+        ids = rel.pluck(:id)
+
+        if ids.size > 0
+          messages << "Detected #{ids.size} jobs that were locked more than "\
+                      "#{@locked_to_started_threshold}s ago and might be stale: #{ids.inspect}."
+        end
+      end
+
+      # Detect jobs that are running for too long #
+      if @run_time_threshold != 0
+        rel = Workhorse::DbJob.started
+        rel = rel.where('started_at < ?', @run_time_threshold.seconds.ago)
+        ids = rel.pluck(:id)
+
+        if ids.size > 0
+          messages << "Detected #{ids.size} jobs that are running for longer than "\
+                      "#{@run_time_threshold}s and might be stale: #{ids.inspect}."
+        end
+      end
+
+      if messages.any?
+        fail messages.join(' ')
+      end
+    end
+  end
+end
```
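The threshold check at the heart of the new job reduces to simple time arithmetic. Below is a self-contained sketch using plain hashes and `Time` (the real job queries `Workhorse::DbJob` and uses ActiveSupport's `seconds.ago`; the helper name `stale_ids` is ours):

```ruby
# Flag job ids whose locked_at timestamp is older than the threshold;
# a threshold of 0 disables the check, mirroring DetectStaleJobsJob.
def stale_ids(jobs, threshold_seconds, now: Time.now)
  return [] if threshold_seconds.zero?
  jobs.select { |j| j[:locked_at] < now - threshold_seconds }.map { |j| j[:id] }
end

now = Time.now
jobs = [
  { id: 1, locked_at: now - 400 }, # locked ~6.7 minutes ago
  { id: 2, locked_at: now - 10 }   # locked 10 seconds ago
]

stale_ids(jobs, 3 * 60, now: now) # => [1]  (default 3-minute threshold)
stale_ids(jobs, 0, now: now)      # => []   (check disabled)
```

In the real job, a non-empty result raises an exception, which in turn triggers the configured `on_exception` notification.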
data/lib/workhorse/performer.rb
CHANGED
data/lib/workhorse/poller.rb
CHANGED

```diff
@@ -1,5 +1,11 @@
 module Workhorse
   class Poller
+    MIN_LOCK_TIMEOUT = 0.1 # In seconds
+    MAX_LOCK_TIMEOUT = 1.0 # In seconds
+
+    ORACLE_LOCK_MODE = 6 # X_MODE (exclusive)
+    ORACLE_LOCK_HANDLE = 478564848 # Randomly chosen number
+
     attr_reader :worker
     attr_reader :table
 
@@ -20,15 +26,20 @@ module Workhorse
       @running = true
 
       @thread = Thread.new do
-
-
-
+        loop do
+          break unless running?
+
+          begin
             poll
             sleep
+          rescue Exception => e
+            worker.log %(Poll encountered exception:\n#{e.message}\n#{e.backtrace.join("\n")})
+            worker.log 'Worker shutting down...'
+            Workhorse.on_exception.call(e) unless Workhorse.silence_poller_exceptions
+            @running = false
+            worker.instance_variable_get(:@pool).shutdown
+            break
           end
-      rescue Exception => e
-        worker.log %(Poller stopped with exception:\n#{e.message}\n#{e.backtrace.join("\n")})
-        Workhorse.on_exception.call(e)
         end
       end
     end
@@ -61,41 +72,66 @@ module Workhorse
       end
     end
 
+    def with_global_lock(name: :workhorse, timeout: 2, &block)
+      if @is_oracle
+        result = Workhorse::DbJob.connection.select_all(
+          "SELECT DBMS_LOCK.REQUEST(#{ORACLE_LOCK_HANDLE}, #{ORACLE_LOCK_MODE}, #{timeout}) FROM DUAL"
+        ).first.values.last
+
+        success = result == 0
+      else
+        result = Workhorse::DbJob.connection.select_all(
+          "SELECT GET_LOCK(CONCAT(DATABASE(), '_#{name}'), #{timeout})"
+        ).first.values.last
+        success = result == 1
+      end
+
+      return unless success
+
+      yield
+    ensure
+      if success
+        if @is_oracle
+          Workhorse::DbJob.connection.execute("SELECT DBMS_LOCK.RELEASE(#{ORACLE_LOCK_HANDLE}) FROM DUAL")
+        else
+          Workhorse::DbJob.connection.execute("SELECT RELEASE_LOCK(CONCAT(DATABASE(), '_#{name}'))")
+        end
+      end
+    end
+
     def poll
       @instant_repoll.make_false
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+      timeout = [MIN_LOCK_TIMEOUT, [MAX_LOCK_TIMEOUT, worker.polling_interval].min].max
+
+      with_global_lock timeout: timeout do
+        job_ids = []
+
+        Workhorse.tx_callback.call do
+          # As we are the only thread posting into the worker pool, it is safe to
+          # get the number of idle threads without mutex synchronization. The
+          # actual number of idle workers at time of posting can only be larger
+          # than or equal to the number we get here.
+          idle = worker.idle
+
+          worker.log "Polling DB for jobs (#{idle} available threads)...", :debug
+
+          unless idle.zero?
+            jobs = queued_db_jobs(idle)
+            jobs.each do |job|
+              worker.log "Marking job #{job.id} as locked", :debug
+              job.mark_locked!(worker.id)
+              job_ids << job.id
+            end
+          end
         end
       end
+
+        job_ids.each { |job_id| worker.perform(job_id) }
       end
     end
 
     # Returns an Array of #{Workhorse::DbJob}s that can be started
     def queued_db_jobs(limit)
-      # ---------------------------------------------------------------
-      # Lock all queued jobs that are waiting
-      # ---------------------------------------------------------------
-      Workhorse::DbJob.connection.execute(
-        Workhorse::DbJob.select('null').where(
-          table[:queue].not_eq(nil)
-            .and(table[:state].eq(:waiting))
-        ).lock.to_sql
-      )
-
       # ---------------------------------------------------------------
       # Select jobs to execute
       # ---------------------------------------------------------------
@@ -147,20 +183,6 @@ module Workhorse
       # Limit number of records
       select = agnostic_limit(select, limit)
 
-      # Wrap the entire query in an other subselect to enable locking under
-      # Oracle SQL. As MySQL is able to lock the records without this additional
-      # complication, only do this when using the Oracle backend.
-      if @is_oracle
-        if AREL_GTE_7
-          select = Arel::SelectManager.new(Arel.sql('(' + select.to_sql + ')'))
-        else
-          select = Arel::SelectManager.new(ActiveRecord::Base, Arel.sql('(' + select.to_sql + ')'))
-        end
-        select = table.project(Arel.star).where(table[:id].in(select.project(:id)))
-      end
-
-      select = select.lock
-
       return Workhorse::DbJob.find_by_sql(select.to_sql).to_a
     end
@@ -214,7 +236,7 @@ module Workhorse
         .where(table[:state].in(bad_states))
       # .distinct is not chainable in older Arel versions
       bad_queues_select.distinct
-      select = select.where(table[:queue].not_in(bad_queues_select))
+      select = select.where(table[:queue].not_in(bad_queues_select).or(table[:queue].eq(nil)))
 
       # Restrict queues to valid ones as indicated by the options given to the
       # worker
```
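One small detail worth noting in the poll loop above: the `GET_LOCK` / `DBMS_LOCK.REQUEST` wait time is derived from the worker's polling interval, clamped between the two new constants. A standalone sketch of the clamp (the helper name `lock_timeout` is ours; the real code inlines the expression):

```ruby
MIN_LOCK_TIMEOUT = 0.1 # In seconds
MAX_LOCK_TIMEOUT = 1.0 # In seconds

# Clamp the polling interval into [MIN_LOCK_TIMEOUT, MAX_LOCK_TIMEOUT] so a
# poller never blocks on the global lock for longer than one second, nor
# spins with a near-zero wait.
def lock_timeout(polling_interval)
  [MIN_LOCK_TIMEOUT, [MAX_LOCK_TIMEOUT, polling_interval].min].max
end

lock_timeout(60)   # => 1.0 (long intervals are capped)
lock_timeout(0.5)  # => 0.5 (values within the bounds pass through)
lock_timeout(0.01) # => 0.1 (very short intervals are raised to the minimum)
```

This keeps global-lock contention cheap: as the 1.0.0.beta0 changelog entry notes, polls are quick, so bounding the wait makes the serialization overhead negligible.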
data/lib/workhorse/worker.rb
CHANGED

```diff
@@ -127,14 +127,14 @@ module Workhorse
       @pool.idle
     end
 
-    def perform(
+    def perform(db_job_id)
       mutex.synchronize do
         assert_state! :running
-        log "Posting job #{
+        log "Posting job #{db_job_id} to thread pool"
 
         @pool.post do
           begin
-            Workhorse::Performer.new(
+            Workhorse::Performer.new(db_job_id, self).perform
           rescue Exception => e
             log %(#{e.message}\n#{e.backtrace.join("\n")}), :error
           end
```
data/test/lib/test_helper.rb
CHANGED

```diff
@@ -1,6 +1,8 @@
 require 'minitest/autorun'
 require 'active_record'
 require 'active_job'
+require 'pry'
+require 'colorize'
 require 'mysql2'
 require 'benchmark'
 require 'jobs'
@@ -40,7 +42,14 @@ class WorkhorseTest < ActiveSupport::TestCase
   end
 end
 
-ActiveRecord::Base.establish_connection
+ActiveRecord::Base.establish_connection(
+  adapter:  'mysql2',
+  database: ENV['DB_NAME'] || 'workhorse',
+  username: ENV['DB_USERNAME'] || 'root',
+  password: ENV['DB_PASSWORD'] || '',
+  host:     ENV['DB_HOST'] || '127.0.0.1',
+  pool:     10
+)
 
 require 'db_schema'
 require 'workhorse'
```

data/test/workhorse/poller_test.rb
CHANGED

```diff
@@ -48,6 +48,24 @@ class Workhorse::PollerTest < WorkhorseTest
     assert_equal %w[q1 q2], w.poller.send(:valid_queues)
   end
 
+  def test_valid_queues
+    w = Workhorse::Worker.new(polling_interval: 60)
+
+    assert_equal [], w.poller.send(:valid_queues)
+
+    Workhorse.enqueue BasicJob.new(sleep_time: 2), queue: nil
+
+    assert_equal [nil], w.poller.send(:valid_queues)
+
+    a_job = Workhorse.enqueue BasicJob.new(sleep_time: 2), queue: :a
+
+    assert_equal [nil, 'a'], w.poller.send(:valid_queues)
+
+    a_job.update_attribute :state, :locked
+
+    assert_equal [nil], w.poller.send(:valid_queues)
+  end
+
   def test_no_queues
     w = Workhorse::Worker.new(polling_interval: 60)
     assert_equal [], w.poller.send(:valid_queues)
@@ -96,6 +114,67 @@ class Workhorse::PollerTest < WorkhorseTest
     assert_equal 1, Workhorse::DbJob.where(state: :succeeded).count
   end
 
+  def test_already_locked_issue
+    # Create 100 jobs
+    100.times do |i|
+      Workhorse.enqueue BasicJob.new(some_param: i, sleep_time: 0)
+    end
+
+    # Create 25 worker processes that work for 10s each
+    25.times do
+      Process.fork do
+        work 10, pool_size: 1, polling_interval: 0.1
+      end
+    end
+
+    # Create additional 100 jobs that are scheduled while the workers are
+    # already polling (to make sure those are picked up as well)
+    100.times do
+      sleep 0.05
+      Workhorse.enqueue BasicJob.new(sleep_time: 0)
+    end
+
+    # Wait for all forked processes to finish (should take ~10s)
+    Process.waitall
+
+    total = Workhorse::DbJob.count
+    succeeded = Workhorse::DbJob.succeeded.count
+    used_workers = Workhorse::DbJob.lock.pluck(:locked_by).uniq.size
+
+    # Make sure there are 200 jobs, all jobs have succeeded and that all of the
+    # workers have had their turn.
+    assert_equal 200, total
+    assert_equal 200, succeeded
+    assert_equal 25, used_workers
+  end
+
+  def test_connection_loss
+    $thread_conn = nil
+
+    Workhorse.enqueue BasicJob.new(sleep_time: 3)
+
+    t = Thread.new do
+      w = Workhorse::Worker.new(pool_size: 5, polling_interval: 0.1)
+      w.start
+
+      sleep 0.5
+
+      w.poller.define_singleton_method :poll do
+        fail ActiveRecord::StatementInvalid, 'Mysql2::Error: Connection was killed'
+      end
+
+      w.wait
+    end
+
+    assert_nothing_raised do
+      Timeout.timeout(6) do
+        t.join
+      end
+    end
+
+    assert_equal 1, Workhorse::DbJob.succeeded.count
+  end
+
   private
 
   def setup
```
data/workhorse.gemspec
CHANGED

```diff
@@ -1,15 +1,15 @@
 # -*- encoding: utf-8 -*-
-# stub: workhorse 0.
+# stub: workhorse 1.0.0 ruby lib
 
 Gem::Specification.new do |s|
   s.name = "workhorse".freeze
-  s.version = "0.
+  s.version = "1.0.0"
 
   s.required_rubygems_version = Gem::Requirement.new(">= 0".freeze) if s.respond_to? :required_rubygems_version=
   s.require_paths = ["lib".freeze]
   s.authors = ["Sitrox".freeze]
-  s.date = "2020-
-  s.files = [".gitignore".freeze, ".releaser_config".freeze, ".rubocop.yml".freeze, ".travis.yml".freeze, "CHANGELOG.md".freeze, "FAQ.md".freeze, "Gemfile".freeze, "LICENSE".freeze, "README.md".freeze, "RUBY_VERSION".freeze, "Rakefile".freeze, "VERSION".freeze, "bin/rubocop".freeze, "lib/active_job/queue_adapters/workhorse_adapter.rb".freeze, "lib/generators/workhorse/install_generator.rb".freeze, "lib/generators/workhorse/templates/bin/workhorse.rb".freeze, "lib/generators/workhorse/templates/config/initializers/workhorse.rb".freeze, "lib/generators/workhorse/templates/create_table_jobs.rb".freeze, "lib/workhorse.rb".freeze, "lib/workhorse/daemon.rb".freeze, "lib/workhorse/daemon/shell_handler.rb".freeze, "lib/workhorse/db_job.rb".freeze, "lib/workhorse/enqueuer.rb".freeze, "lib/workhorse/jobs/cleanup_succeeded_jobs.rb".freeze, "lib/workhorse/jobs/run_active_job.rb".freeze, "lib/workhorse/jobs/run_rails_op.rb".freeze, "lib/workhorse/performer.rb".freeze, "lib/workhorse/poller.rb".freeze, "lib/workhorse/pool.rb".freeze, "lib/workhorse/scoped_env.rb".freeze, "lib/workhorse/worker.rb".freeze, "test/active_job/queue_adapters/workhorse_adapter_test.rb".freeze, "test/lib/db_schema.rb".freeze, "test/lib/jobs.rb".freeze, "test/lib/test_helper.rb".freeze, "test/workhorse/db_job_test.rb".freeze, "test/workhorse/enqueuer_test.rb".freeze, "test/workhorse/performer_test.rb".freeze, "test/workhorse/poller_test.rb".freeze, "test/workhorse/pool_test.rb".freeze, "test/workhorse/worker_test.rb".freeze, "workhorse.gemspec".freeze]
+  s.date = "2020-09-21"
+  s.files = [".gitignore".freeze, ".releaser_config".freeze, ".rubocop.yml".freeze, ".travis.yml".freeze, "CHANGELOG.md".freeze, "FAQ.md".freeze, "Gemfile".freeze, "LICENSE".freeze, "README.md".freeze, "RUBY_VERSION".freeze, "Rakefile".freeze, "VERSION".freeze, "bin/rubocop".freeze, "lib/active_job/queue_adapters/workhorse_adapter.rb".freeze, "lib/generators/workhorse/install_generator.rb".freeze, "lib/generators/workhorse/templates/bin/workhorse.rb".freeze, "lib/generators/workhorse/templates/config/initializers/workhorse.rb".freeze, "lib/generators/workhorse/templates/create_table_jobs.rb".freeze, "lib/workhorse.rb".freeze, "lib/workhorse/daemon.rb".freeze, "lib/workhorse/daemon/shell_handler.rb".freeze, "lib/workhorse/db_job.rb".freeze, "lib/workhorse/enqueuer.rb".freeze, "lib/workhorse/jobs/cleanup_succeeded_jobs.rb".freeze, "lib/workhorse/jobs/detect_stale_jobs_job.rb".freeze, "lib/workhorse/jobs/run_active_job.rb".freeze, "lib/workhorse/jobs/run_rails_op.rb".freeze, "lib/workhorse/performer.rb".freeze, "lib/workhorse/poller.rb".freeze, "lib/workhorse/pool.rb".freeze, "lib/workhorse/scoped_env.rb".freeze, "lib/workhorse/worker.rb".freeze, "test/active_job/queue_adapters/workhorse_adapter_test.rb".freeze, "test/lib/db_schema.rb".freeze, "test/lib/jobs.rb".freeze, "test/lib/test_helper.rb".freeze, "test/workhorse/db_job_test.rb".freeze, "test/workhorse/enqueuer_test.rb".freeze, "test/workhorse/performer_test.rb".freeze, "test/workhorse/poller_test.rb".freeze, "test/workhorse/pool_test.rb".freeze, "test/workhorse/worker_test.rb".freeze, "workhorse.gemspec".freeze]
   s.rubygems_version = "3.0.3".freeze
   s.summary = "Multi-threaded job backend with database queuing for ruby.".freeze
   s.test_files = ["test/active_job/queue_adapters/workhorse_adapter_test.rb".freeze, "test/lib/db_schema.rb".freeze, "test/lib/jobs.rb".freeze, "test/lib/test_helper.rb".freeze, "test/workhorse/db_job_test.rb".freeze, "test/workhorse/enqueuer_test.rb".freeze, "test/workhorse/performer_test.rb".freeze, "test/workhorse/poller_test.rb".freeze, "test/workhorse/pool_test.rb".freeze, "test/workhorse/worker_test.rb".freeze]
@@ -26,6 +26,7 @@ Gem::Specification.new do |s|
     s.add_development_dependency(%q<colorize>.freeze, [">= 0"])
     s.add_development_dependency(%q<benchmark-ips>.freeze, [">= 0"])
     s.add_development_dependency(%q<activejob>.freeze, [">= 0"])
+    s.add_development_dependency(%q<pry>.freeze, [">= 0"])
     s.add_runtime_dependency(%q<activesupport>.freeze, [">= 0"])
     s.add_runtime_dependency(%q<activerecord>.freeze, [">= 0"])
     s.add_runtime_dependency(%q<schemacop>.freeze, ["~> 2.0"])
@@ -39,6 +40,7 @@ Gem::Specification.new do |s|
     s.add_dependency(%q<colorize>.freeze, [">= 0"])
     s.add_dependency(%q<benchmark-ips>.freeze, [">= 0"])
     s.add_dependency(%q<activejob>.freeze, [">= 0"])
+    s.add_dependency(%q<pry>.freeze, [">= 0"])
     s.add_dependency(%q<activesupport>.freeze, [">= 0"])
     s.add_dependency(%q<activerecord>.freeze, [">= 0"])
     s.add_dependency(%q<schemacop>.freeze, ["~> 2.0"])
@@ -53,6 +55,7 @@ Gem::Specification.new do |s|
     s.add_dependency(%q<colorize>.freeze, [">= 0"])
     s.add_dependency(%q<benchmark-ips>.freeze, [">= 0"])
     s.add_dependency(%q<activejob>.freeze, [">= 0"])
+    s.add_dependency(%q<pry>.freeze, [">= 0"])
     s.add_dependency(%q<activesupport>.freeze, [">= 0"])
     s.add_dependency(%q<activerecord>.freeze, [">= 0"])
     s.add_dependency(%q<schemacop>.freeze, ["~> 2.0"])
```
|
metadata
CHANGED
|
@@ -1,14 +1,14 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: workhorse
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.
|
|
4
|
+
version: 1.0.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Sitrox
|
|
8
8
|
autorequire:
|
|
9
9
|
bindir: bin
|
|
10
10
|
cert_chain: []
|
|
11
|
-
date: 2020-
|
|
11
|
+
date: 2020-09-21 00:00:00.000000000 Z
|
|
12
12
|
dependencies:
|
|
13
13
|
- !ruby/object:Gem::Dependency
|
|
14
14
|
name: bundler
|
|
@@ -122,6 +122,20 @@ dependencies:
|
|
|
122
122
|
- - ">="
|
|
123
123
|
- !ruby/object:Gem::Version
|
|
124
124
|
version: '0'
|
|
125
|
+
- !ruby/object:Gem::Dependency
|
|
126
|
+
name: pry
|
|
127
|
+
requirement: !ruby/object:Gem::Requirement
|
|
128
|
+
requirements:
|
|
129
|
+
- - ">="
|
|
130
|
+
- !ruby/object:Gem::Version
|
|
131
|
+
version: '0'
|
|
132
|
+
type: :development
|
|
133
|
+
prerelease: false
|
|
134
|
+
version_requirements: !ruby/object:Gem::Requirement
|
|
135
|
+
requirements:
|
|
136
|
+
- - ">="
|
|
137
|
+
- !ruby/object:Gem::Version
|
|
138
|
+
version: '0'
|
|
125
139
|
- !ruby/object:Gem::Dependency
|
|
126
140
|
name: activesupport
|
|
127
141
|
requirement: !ruby/object:Gem::Requirement
|
|
@@ -208,6 +222,7 @@ files:
|
|
|
208
222
|
- lib/workhorse/db_job.rb
|
|
209
223
|
- lib/workhorse/enqueuer.rb
|
|
210
224
|
- lib/workhorse/jobs/cleanup_succeeded_jobs.rb
|
|
225
|
+
- lib/workhorse/jobs/detect_stale_jobs_job.rb
|
|
211
226
|
- lib/workhorse/jobs/run_active_job.rb
|
|
212
227
|
- lib/workhorse/jobs/run_rails_op.rb
|
|
213
228
|
- lib/workhorse/performer.rb
|