distributed_job 2.0.0 → 3.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.gitignore +1 -0
- data/.rubocop.yml +3 -0
- data/CHANGELOG.md +5 -0
- data/README.md +34 -11
- data/lib/distributed_job/client.rb +55 -0
- data/lib/distributed_job/job.rb +281 -0
- data/lib/distributed_job/version.rb +2 -2
- data/lib/distributed_job.rb +3 -268
- metadata +4 -4
- data/.travis.yml +0 -6
- data/Gemfile.lock +0 -57
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 62f2efef7acceccdd2dc3339969cb9eed235f5e8df8367cdb88ddafdf00bc386
+  data.tar.gz: 90c9eb945bc7a3b0f4ec7526e2c87b953a866f7ea56972ecfe01dc296c325516
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7e919014940a31859cf709a00f1be04d8b4cc015f7cd068808645474687c47ca55f77f4c5b608f511137a7c3c5c1fefca1a8c99989ca51f911f4fd9d893b6c50
+  data.tar.gz: d02553e6c36caffdfdfa78940ccfdce1478a41e4750facee08b5829540b362582fb740e6cc612f853b545ed73e69ebfbbb1e7480d756c4a2fd4da4889962909b
data/.gitignore
CHANGED
data/.rubocop.yml
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,10 @@
 # CHANGELOG
 
+## v3.0.0
+
+* Split `DistributedJob` in `DistributedJob::Client` and `DistributedJob::Job`
+* Add native namespace support and drop support for `Redis::Namespace`
+
 ## v2.0.0
 
 * `#push_each` no longer returns an enum when no block is given
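The two changelog entries above map directly onto the new files added in this release (`lib/distributed_job/client.rb` and `lib/distributed_job/job.rb`, both shown below). A minimal sketch of the resulting v3.0.0 usage, assuming a reachable redis server; `SomeBackgroundJob` and the `my_app` namespace are placeholders:

```ruby
require 'distributed_job'
require 'redis'
require 'securerandom'

# One client per application; namespacing is now handled natively via the
# `namespace:` option instead of a Redis::Namespace wrapper.
DistributedJobClient = DistributedJob::Client.new(
  redis: Redis.new,
  namespace: 'my_app',  # optional prefix for all redis keys
  default_ttl: 86_400   # seconds until job metadata expires in redis
)

# Jobs are now built from the client instead of being instantiated directly.
distributed_job = DistributedJobClient.build(token: SecureRandom.hex)

distributed_job.push_each(1..10) do |number, part|
  SomeBackgroundJob.perform_async(number, distributed_job.token, part)
end
```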
data/README.md
CHANGED
@@ -1,5 +1,8 @@
 # DistributedJob
 
+[](https://github.com/mrkamel/distributed_job/actions?query=workflow%3Atest+branch%3Amaster)
+[](http://badge.fury.io/rb/distributed_job)
+
 Easily keep track of distributed jobs consisting of an arbitrary number of
 parts spanning multiple workers using redis. Can be used with any kind of
 backround job processing queue.
@@ -26,10 +29,22 @@ Getting started is very easy. A `DistributedJob` allows to keep track of a
 distributed job, i.e. a job which is split into multiple units running in
 parallel and in multiple workers.
 
-
+First, create a `DistributedJob::Client`:
 
 ```ruby
-
+DistributedJobClient = DistributedJob::Client.new(redis: Redis.new)
+```
+
+You can specify a `namespace` to be additionally used for redis keys and set a
+`default_ttl` for keys (Default is `86_400`, i.e. one day), which will be used
+every time when keys in redis are updated to guarantee that the distributed
+job metadata is cleaned up properly from redis at some point in time.
+
+Afterwards, to create a distributed job and add parts, i.e. units of work, to
+it, simply do:
+
+```ruby
+distributed_job = DistributedJobClient.build(token: SecureRandom.hex)
 
 distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
   SomeBackgroundJob.perform_async(date, distributed_job.token, part)
@@ -44,13 +59,13 @@ this part finished after it has been successfully processed. Therefore, when
 all those background jobs have successfully finished, all parts will be marked
 as finished, such that the distributed job will finally be finished as well.
 
-The `token` can be used to
+The `token` can also be used to query the status of the distributed job, e.g.
 on a job summary page or similar. You can show some progress bar in the browser
 or in the terminal, etc:
 
 ```ruby
 # token is given via URL or via some other means
-distributed_job =
+distributed_job = DistributedJobClient.build(token: params[:token])
 
 distributed_job.total # total number of parts
 distributed_job.count # number of unfinished parts
@@ -58,14 +73,14 @@ distributed_job.finished? # whether or not all parts are finished
 distributed_job.open_parts # returns all not yet finished part id's
 ```
 
-Within the background job, you use the passed token and part to query
-update the status of the distributed job and part accordingly. Please note
+Within the background job, you must use the passed `token` and `part` to query
+and update the status of the distributed job and part accordingly. Please note
 that you can use whatever background job processing tool you like most.
 
 ```ruby
 class SomeBackgroundJob
   def perform(whatever, token, part)
-    distributed_job =
+    distributed_job = DistributedJobClient.build(redis: Redis.new, token: token)
 
     return if distributed_job.stopped?
 
@@ -85,10 +100,18 @@ end
 ```
 
 The `#stop` and `#stopped?` methods can be used to globally stop a distributed
-job in case of errors. Contrary, the `#done` method tells the
-specified part has successfully finished. Finally, the `#finished?`
-returns true when all parts of the distributed job are finished, which
-to start cleanup jobs or to even start another subsequent distributed
+job in case of errors. Contrary, the `#done` method tells the distributed job
+that the specified part has successfully finished. Finally, the `#finished?`
+method returns true when all parts of the distributed job are finished, which
+is useful to start cleanup jobs or to even start another subsequent distributed
+job.
+
+That's it.
+
+## Reference docs
+
+Please find the reference docs at
+[http://www.rubydoc.info/github/mrkamel/distributed_job](http://www.rubydoc.info/github/mrkamel/distributed_job)
 
 ## Development
 
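The namespace and `default_ttl` paragraphs added to the README above correspond to the key layout implemented in the new `Job#redis_key` (see `lib/distributed_job/job.rb` further below). A small sketch of which redis keys a namespaced job touches; the `my_app` namespace and the token value are made-up examples:

```ruby
require 'distributed_job'
require 'redis'

client = DistributedJob::Client.new(redis: Redis.new, namespace: 'my_app')
job = client.build(token: 'abcd1234')

# Pretend the three parts were also enqueued into a background job queue.
job.push_each([1, 2, 3]) { |_number, _part| }

# Per Job#redis_key, the base key is [namespace, 'distributed_jobs', token].compact.join(':'),
# so this job stores its data under:
#   my_app:distributed_jobs:abcd1234:parts  (set of unfinished part ids)
#   my_app:distributed_jobs:abcd1234:state  (hash holding 'total', 'closed', 'stopped')
# Without a namespace, the prefix is simply distributed_jobs:abcd1234.
job.total                 # => 3
job.open_parts.to_a.sort  # => ["0", "1", "2"]
```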
data/lib/distributed_job/client.rb
ADDED
@@ -0,0 +1,55 @@
+# frozen_string_literal: true
+
+module DistributedJob
+  # A `DistributedJob::Client` allows to easily manage distributed jobs. The
+  # main purpose of the client object is to configure settings to all
+  # distributed jobs or a group of distributed jobs like e.g. the redis
+  # connection and an optional namespace to be used to prefix all redis keys.
+  #
+  # @example
+  #   DistributedJobClient = DistributedJob::Client.new(redis: Redis.new)
+  #
+  #   distributed_job = DistributedJobClient.build(token: SecureRandom.hex)
+  #
+  #   # Add job parts and queue background jobs
+  #   distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
+  #     SomeBackgroundJob.perform_async(date, distributed_job.token, part)
+  #   end
+  #
+  #   distributed_job.token # can be used to query the status of the distributed job
+
+  class Client
+    attr_reader :redis, :namespace, :default_ttl
+
+    # Creates a new `DistributedJob::Client`.
+    #
+    # @param redis [Redis] The redis connection instance
+    # @param namespace [String] An optional namespace used to prefix redis keys
+    # @param default_ttl [Integer] The default number of seconds the jobs will
+    #   stay available in redis. This value is used to automatically expire and
+    #   clean up the jobs in redis. Default is 86400, i.e. one day. The ttl is
+    #   used everytime the job is modified in redis.
+    #
+    # @example
+    #   DistributedJobClient = DistributedJob::Client.new(redis: Redis.new)
+
+    def initialize(redis:, namespace: nil, default_ttl: 86_400)
+      @redis = redis
+      @namespace = namespace
+      @default_ttl = default_ttl
+    end
+
+    # Builds a new `DistributedJob::Job` instance.
+    #
+    # @param token [String] Some token to be used to identify the job. You can
+    #   e.g. use SecureRandom.hex to generate one.
+    # @param ttl [Integer] The number of seconds the job will stay available
+    #   in redis. This value is used to automatically expire and clean up the
+    #   job in redis. Default is `default_ttl`, i.e. one day. The ttl is used
+    #   everytime the job is modified in redis.
+
+    def build(token:, ttl: default_ttl)
+      Job.new(client: self, token: token, ttl: ttl)
+    end
+  end
+end
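One usage note on `Client#build` above: the per-job `ttl` falls back to the client's `default_ttl`, but can be overridden for individual jobs. A short sketch; the one-hour value is just an example:

```ruby
require 'distributed_job'
require 'redis'
require 'securerandom'

client = DistributedJob::Client.new(redis: Redis.new, default_ttl: 86_400)

# Uses the client's default_ttl (one day) whenever its redis keys are touched.
daily_job = client.build(token: SecureRandom.hex)

# Overrides the ttl for this job only; its redis keys expire after one hour.
short_lived_job = client.build(token: SecureRandom.hex, ttl: 3_600)
```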
data/lib/distributed_job/job.rb
ADDED
@@ -0,0 +1,281 @@
+# frozen_string_literal: true
+
+module DistributedJob
+  # A `DistributedJob::Job` instance allows to keep track of a distributed job, i.e.
+  # a job which is split into multiple units running in parallel and in multiple
+  # workers using redis.
+  #
+  # @example Creating a distributed job
+  #   distributed_job = DistributedJobClient.build(token: SecureRandom.hex)
+  #
+  #   # Add job parts and queue background jobs
+  #   distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
+  #     SomeBackgroundJob.perform_async(date, distributed_job.token, part)
+  #   end
+  #
+  #   distributed_job.token # can be used to query the status of the distributed job
+  #
+  # @example Processing a distributed job part
+  #   class SomeBackgroundJob
+  #     def perform(whatever, token, part)
+  #       distributed_job = DistributedJobClient.build(token: token)
+  #
+  #       return if distributed_job.stopped?
+  #
+  #       # ...
+  #
+  #       distributed_job.done(part)
+  #
+  #       if distributed_job.finished?
+  #         # perform e.g. cleanup or the some other job
+  #       end
+  #     rescue
+  #       distributed_job.stop
+  #
+  #       raise
+  #     end
+  #   end
+
+  class Job
+    attr_reader :client, :token, :ttl
+
+    # Initializes a new distributed job.
+    #
+    # @param client [DistributedJob::Client] The client instance
+    # @param token [String] Some token to be used to identify the job. You can
+    #   e.g. use SecureRandom.hex to generate one.
+    # @param ttl [Integer] The number of seconds this job will stay available
+    #   in redis. This value is used to automatically expire and clean up the
+    #   job in redis. Default is 86400, i.e. one day. The ttl is used everytime
+    #   the job is modified in redis.
+    #
+    # @example
+    #   DistributedJobClient = DistributedJob::Client.new(redis: Redis.new)
+    #
+    #   distributed_job = DistributedJob::Job.new(client: DistributedJobClient, token: SecureRandom.hex)
+    #
+    #   # However, the preferred way to build a distributed job is:
+    #
+    #   distributed_job = DistributedJobClient.build(token: SecureRandom.hex)
+
+    def initialize(client:, token:, ttl: 86_400)
+      @client = client
+      @token = token
+      @ttl = ttl
+    end
+
+    # Pass an enum to be used to iterate all the units of work of the distributed
+    # job. The distributed job needs to know all of them to keep track of the
+    # overall number and status of the parts. Passing an enum is much better
+    # compared to pushing the parts manually, because the distributed job needs
+    # to be closed before the last part of the distributed job is enqueued into
+    # some job queue. Otherwise it could potentially happen that the last part is
+    # already processed in the job queue before it is pushed to redis, such that
+    # the last job doesn't know that the distributed job is finished.
+    #
+    # @param enum [#each_with_index] The enum which can be iterated to get all
+    #   job parts
+    #
+    # @example
+    #   distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
+    #     # e.g. SomeBackgroundJob.perform_async(date, distributed_job.token, part)
+    #   end
+    #
+    # @example ActiveRecord
+    #   distributed_job.push_each(User.select(:id).find_in_batches) do |batch, part|
+    #     # e.g. SomeBackgroundJob.perform_async(batch.first.id, batch.last.id, distributed_job.token, part)
+    #   end
+
+    def push_each(enum)
+      previous_object = nil
+      previous_index = nil
+
+      enum.each_with_index do |current_object, current_index|
+        push(current_index)
+
+        yield(previous_object, previous_index.to_s) if previous_index
+
+        previous_object = current_object
+        previous_index = current_index
+      end
+
+      close
+
+      yield(previous_object, previous_index.to_s) if previous_index
+    end
+
+    # Returns all parts of the distributed job which are not yet finished.
+    #
+    # @return [Enumerator] The enum which allows to iterate all parts
+
+    def open_parts
+      redis.sscan_each("#{redis_key}:parts")
+    end
+
+    # Removes the specified part from the distributed job, i.e. from the set of
+    # unfinished parts. Use this method when the respective job part has been
+    # successfully processed, i.e. finished.
+    #
+    # @param part [String] The job part
+    # @returns [Boolean] Returns true when there are no more unfinished parts
+    #   left or false otherwise
+    #
+    # @example
+    #   class SomeBackgroundJob
+    #     def perform(whatever, token, part)
+    #       distributed_job = DistributedJobClient.build(token: token)
+    #
+    #       # ...
+    #
+    #       distributed_job.done(part)
+    #     end
+    #   end
+
+    def done(part)
+      @done_script ||= <<~SCRIPT
+        local key, part, ttl = ARGV[1], ARGV[2], tonumber(ARGV[3])
+
+        if redis.call('srem', key .. ':parts', part) == 0 then return end
+
+        redis.call('expire', key .. ':parts', ttl)
+        redis.call('expire', key .. ':state', ttl)
+
+        return redis.call('scard', key .. ':parts')
+      SCRIPT
+
+      redis.eval(@done_script, argv: [redis_key, part.to_s, ttl]) == 0 && closed?
+    end
+
+    # Returns the total number of pushed parts, no matter if finished or not.
+    #
+    # @example
+    #   distributed_job.total # => e.g. 13
+
+    def total
+      redis.hget("#{redis_key}:state", 'total').to_i
+    end
+
+    # Returns the number of pushed parts which are not finished.
+    #
+    # @example
+    #   distributed_job.count # => e.g. 8
+
+    def count
+      redis.scard("#{redis_key}:parts")
+    end
+
+    # Returns true if there are no more unfinished parts.
+    #
+    # @example
+    #   distributed_job.finished? #=> true/false
+
+    def finished?
+      closed? && count.zero?
+    end
+
+    # Allows to stop a distributed job. This is useful if some error occurred in
+    # some part, i.e. background job, of the distributed job and you then want to
+    # stop all other not yet finished parts. Please note that only jobs can be
+    # stopped which ask the distributed job actively whether or not it was
+    # stopped.
+    #
+    # @returns [Boolean] Always returns true
+    #
+    # @example
+    #   class SomeBackgroundJob
+    #     def perform(whatever, token, part)
+    #       distributed_job = DistributedJobClient.build(token: token)
+    #
+    #       return if distributed_job.stopped?
+    #
+    #       # ...
+    #
+    #       distributed_job.done(part)
+    #     rescue
+    #       distributed_job.stop
+    #
+    #       raise
+    #     end
+    #   end
+
+    def stop
+      redis.multi do
+        redis.hset("#{redis_key}:state", 'stopped', 1)
+
+        redis.expire("#{redis_key}:state", ttl)
+        redis.expire("#{redis_key}:parts", ttl)
+      end
+
+      true
+    end
+
+    # Returns true when the distributed job was stopped or false otherwise.
+    #
+    # @returns [Boolean] Returns true or false
+    #
+    # @example
+    #   class SomeBackgroundJob
+    #     def perform(whatever, token, part)
+    #       distributed_job = DistributedJobClient.build(token: token)
+    #
+    #       return if distributed_job.stopped?
+    #
+    #       # ...
+    #
+    #       distributed_job.done(part)
+    #     rescue
+    #       distributed_job.stop
+    #
+    #       raise
+    #     end
+    #   end
+
+    def stopped?
+      redis.hget("#{redis_key}:state", 'stopped') == '1'
+    end
+
+    private
+
+    def redis
+      client.redis
+    end
+
+    def namespace
+      client.namespace
+    end
+
+    def close
+      redis.multi do
+        redis.hset("#{redis_key}:state", 'closed', 1)
+
+        redis.expire("#{redis_key}:state", ttl)
+        redis.expire("#{redis_key}:parts", ttl)
+      end
+
+      true
+    end
+
+    def closed?
+      redis.hget("#{redis_key}:state", 'closed') == '1'
+    end
+
+    def push(part)
+      @push_script ||= <<~SCRIPT
+        local key, part, ttl = ARGV[1], ARGV[2], tonumber(ARGV[3])
+
+        if redis.call('sadd', key .. ':parts', part) == 1 then
+          redis.call('hincrby', key .. ':state', 'total', 1)
+        end
+
+        redis.call('expire', key .. ':parts', ttl)
+        redis.call('expire', key .. ':state', ttl)
+      SCRIPT
+
+      redis.eval(@push_script, argv: [redis_key, part.to_s, ttl])
+    end
+
+    def redis_key
+      @redis_key ||= [namespace, 'distributed_jobs', token].compact.join(':')
+    end
+  end
+end
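A detail of `push_each` above that follows from its doc comment: each element is yielded one iteration late, and the final element is yielded only after `close` has run, so the worker processing the last part can never observe an unclosed job. A small sketch of the observable behaviour, with a plain array standing in for real units of work and the token made up for illustration:

```ruby
require 'distributed_job'
require 'redis'

client = DistributedJob::Client.new(redis: Redis.new)
job = client.build(token: 'demo-token')

parts_seen = []

job.push_each(%w[a b c]) do |object, part|
  # 'a' and 'b' are yielded while later parts are still being pushed; 'c' is
  # yielded only after all three parts were pushed and the job was closed.
  parts_seen << [object, part]
end

parts_seen    # => [["a", "0"], ["b", "1"], ["c", "2"]]
job.total     # => 3
job.finished? # => false, since no part has been marked done yet
```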
data/lib/distributed_job.rb
CHANGED
@@ -1,273 +1,8 @@
 # frozen_string_literal: true
 
 require 'distributed_job/version'
+require 'distributed_job/client'
+require 'distributed_job/job'
 require 'redis'
 
-
-# jobs which are split into multiple units running in parallel and in multiple
-# workers using redis.
-#
-# @example Creating a distributed job
-#   distributed_job = DistributedJob.new(redis: Redis.new, token: SecureRandom.hex)
-#
-#   distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
-#     SomeBackgroundJob.perform_async(date, distributed_job.token, part)
-#   end
-#
-#   distributed_job.token # can be used to query the status of the distributed job
-#
-# @example Processing a distributed job part
-#   class SomeBackgroundJob
-#     def perform(whatever, token, part)
-#       distributed_job = DistributedJob.new(redis: Redis.new, token: token)
-#
-#       return if distributed_job.stopped?
-#
-#       # ...
-#
-#       distributed_job.done(part)
-#
-#       if distributed_job.finished?
-#         # perform e.g. cleanup or the some other job
-#       end
-#     rescue
-#       distributed_job.stop
-#
-#       raise
-#     end
-#   end
-
-class DistributedJob
-  attr_reader :redis, :token, :ttl
-
-  # Initializes a new distributed job.
-  #
-  # @param redis [Redis] The redis connection instance
-  # @param token [String] Some token to be used to identify the job. You can
-  #   e.g. use SecureRandom.hex to generate one.
-  # @param ttl [Integer] The number of seconds this job will stay available
-  #   in redis. This value is used to automatically expire and clean up the
-  #   job in redis. Default is 86400, i.e. one day. The ttl is used everytime
-  #   the job is modified in redis.
-  #
-  # @example
-  #   distributed_job = DistributedJob.new(redis: Redis.new, token: SecureRandom.hex)
-
-  def initialize(redis:, token:, ttl: 86_400)
-    @redis = redis
-    @token = token
-    @ttl = ttl
-  end
-
-  # Pass an enum to be used to iterate all the units of work of the distributed
-  # job. The distributed job needs to know all of them to keep track of the
-  # overall number and status of the parts. Passing an enum is much better
-  # compared to pushing the parts manually, because the distributed job needs
-  # to be closed before the last part of the distributed job is enqueued into
-  # some job queue. Otherwise it could potentially happen that the last part is
-  # already processed in the job queue before it is pushed to redis, such that
-  # the last job doesn't know that the distributed job is finished.
-  #
-  # @param enum [#each_with_index] The enum which can be iterated to get all
-  #   job parts
-  #
-  # @example
-  #   distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
-  #     # e.g. SomeBackgroundJob.perform_async(date, distributed_job.token, part)
-  #   end
-  #
-  # @example ActiveRecord
-  #   distributed_job.push_each(User.select(:id).find_in_batches) do |batch, part|
-  #     # e.g. SomeBackgroundJob.perform_async(batch.first.id, batch.last.id, distributed_job.token, part)
-  #   end
-
-  def push_each(enum)
-    previous_object = nil
-    previous_index = nil
-
-    enum.each_with_index do |current_object, current_index|
-      push(current_index)
-
-      yield(previous_object, previous_index.to_s) if previous_index
-
-      previous_object = current_object
-      previous_index = current_index
-    end
-
-    close
-
-    yield(previous_object, previous_index.to_s) if previous_index
-  end
-
-  # Returns all parts of the distributed job which are not yet finished.
-  #
-  # @return [Enumerator] The enum which allows to iterate all parts
-
-  def open_parts
-    redis.sscan_each("#{redis_key}:parts")
-  end
-
-  # Removes the specified part from the distributed job, i.e. from the set of
-  # unfinished parts. Use this method when the respective job part has been
-  # successfully processed, i.e. finished.
-  #
-  # @param part [String] The job part
-  # @returns [Boolean] Returns true when there are no more unfinished parts
-  #   left or false otherwise
-  #
-  # @example
-  #   class SomeBackgroundJob
-  #     def perform(whatever, token, part)
-  #       distributed_job = DistributedJob.new(redis: Redis.new, token: token)
-  #
-  #       # ...
-  #
-  #       distributed_job.done(part)
-  #     end
-  #   end
-
-  def done(part)
-    @done_script ||= <<~SCRIPT
-      local key, part, ttl = ARGV[1], ARGV[2], tonumber(ARGV[3])
-
-      if redis.call('srem', key .. ':parts', part) == 0 then return end
-
-      redis.call('expire', key .. ':parts', ttl)
-      redis.call('expire', key .. ':state', ttl)
-
-      return redis.call('scard', key .. ':parts')
-    SCRIPT
-
-    redis.eval(@done_script, argv: [redis_script_key, part.to_s, ttl]) == 0 && closed?
-  end
-
-  # Returns the total number of pushed parts, no matter if finished or not.
-  #
-  # @example
-  #   distributed_job.total # => e.g. 13
-
-  def total
-    redis.hget("#{redis_key}:state", 'total').to_i
-  end
-
-  # Returns the number of pushed parts which are not finished.
-  #
-  # @example
-  #   distributed_job.count # => e.g. 8
-
-  def count
-    redis.scard("#{redis_key}:parts")
-  end
-
-  # Returns true if there are no more unfinished parts.
-  #
-  # @example
-  #   distributed_job.finished? #=> true/false
-
-  def finished?
-    closed? && count.zero?
-  end
-
-  # Allows to stop a distributed job. This is useful if some error occurred in
-  # some part, i.e. background job, of the distributed job and you then want to
-  # stop all other not yet finished parts. Please note that only jobs can be
-  # stopped which ask the distributed job actively whether or not it was
-  # stopped.
-  #
-  # @returns [Boolean] Always returns true
-  #
-  # @example
-  #   class SomeBackgroundJob
-  #     def perform(whatever, token, part)
-  #       distributed_job = DistributedJob.new(redis: Redis.new, token: token)
-  #
-  #       return if distributed_job.stopped?
-  #
-  #       # ...
-  #
-  #       distributed_job.done(part)
-  #     rescue
-  #       distributed_job.stop
-  #
-  #       raise
-  #     end
-  #   end
-
-  def stop
-    redis.multi do
-      redis.hset("#{redis_key}:state", 'stopped', 1)
-
-      redis.expire("#{redis_key}:state", ttl)
-      redis.expire("#{redis_key}:parts", ttl)
-    end
-
-    true
-  end
-
-  # Returns true when the distributed job was stopped or false otherwise.
-  #
-  # @returns [Boolean] Returns true or false
-  #
-  # @example
-  #   class SomeBackgroundJob
-  #     def perform(whatever, token, part)
-  #       distributed_job = DistributedJob.new(redis: Redis.new, token: token)
-  #
-  #       return if distributed_job.stopped?
-  #
-  #       # ...
-  #
-  #       distributed_job.done(part)
-  #     rescue
-  #       distributed_job.stop
-  #
-  #       raise
-  #     end
-  #   end
-
-  def stopped?
-    redis.hget("#{redis_key}:state", 'stopped') == '1'
-  end
-
-  private
-
-  def close
-    redis.multi do
-      redis.hset("#{redis_key}:state", 'closed', 1)
-
-      redis.expire("#{redis_key}:state", ttl)
-      redis.expire("#{redis_key}:parts", ttl)
-    end
-
-    true
-  end
-
-  def closed?
-    redis.hget("#{redis_key}:state", 'closed') == '1'
-  end
-
-  def push(part)
-    @push_script ||= <<~SCRIPT
-      local key, part, ttl = ARGV[1], ARGV[2], tonumber(ARGV[3])
-
-      if redis.call('sadd', key .. ':parts', part) == 1 then
-        redis.call('hincrby', key .. ':state', 'total', 1)
-      end
-
-      redis.call('expire', key .. ':parts', ttl)
-      redis.call('expire', key .. ':state', ttl)
-    SCRIPT
-
-    redis.eval(@push_script, argv: [redis_script_key, part.to_s, ttl])
-  end
-
-  def redis_key
-    @redis_key ||= "distributed_jobs:#{token}"
-  end
-
-  def redis_script_key
-    return "#{redis.namespace}:#{redis_key}" if redis.respond_to?(:namespace)
-
-    redis_key
-  end
-end
+module DistributedJob; end
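For callers migrating from 2.0.0, the top-level `DistributedJob` class removed above is replaced by the `Client`/`Job` pair, and the `redis_script_key`/`Redis::Namespace` detection is gone in favour of the client-level `namespace:` option. A hedged before/after sketch; `my_app` is an example namespace:

```ruby
require 'distributed_job'
require 'redis'
require 'securerandom'

token = SecureRandom.hex

# 2.0.0 (removed above): the job was built directly from a redis connection,
# and namespacing relied on passing a Redis::Namespace-wrapped connection.
#   distributed_job = DistributedJob.new(redis: Redis.new, token: token)

# 3.0.0: a client carries the connection plus an optional native namespace,
# and jobs are built from it.
client = DistributedJob::Client.new(redis: Redis.new, namespace: 'my_app')
distributed_job = client.build(token: token)

# The per-job API (push_each, done, open_parts, total, count, finished?,
# stop, stopped?) is unchanged between the two versions.
```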
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: distributed_job
 version: !ruby/object:Gem::Version
-  version:
+  version: 3.0.0
 platform: ruby
 authors:
 - Benjamin Vetter
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2021-
+date: 2021-11-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rspec
@@ -63,10 +63,8 @@ files:
 - ".gitignore"
 - ".rspec"
 - ".rubocop.yml"
-- ".travis.yml"
 - CHANGELOG.md
 - Gemfile
-- Gemfile.lock
 - LICENSE.txt
 - README.md
 - Rakefile
@@ -75,6 +73,8 @@ files:
 - distributed_job.gemspec
 - docker-compose.yml
 - lib/distributed_job.rb
+- lib/distributed_job/client.rb
+- lib/distributed_job/job.rb
 - lib/distributed_job/version.rb
 homepage: https://github.com/mrkamel/distributed_job
 licenses:
data/.travis.yml
DELETED
data/Gemfile.lock
DELETED
@@ -1,57 +0,0 @@
-PATH
-  remote: .
-  specs:
-    distributed_job (2.0.0)
-      redis
-
-GEM
-  remote: https://rubygems.org/
-  specs:
-    ast (2.4.2)
-    diff-lcs (1.4.4)
-    parallel (1.21.0)
-    parser (3.0.2.0)
-      ast (~> 2.4.1)
-    rainbow (3.0.0)
-    rake (12.3.3)
-    redis (4.5.1)
-    regexp_parser (2.1.1)
-    rexml (3.2.5)
-    rspec (3.10.0)
-      rspec-core (~> 3.10.0)
-      rspec-expectations (~> 3.10.0)
-      rspec-mocks (~> 3.10.0)
-    rspec-core (3.10.1)
-      rspec-support (~> 3.10.0)
-    rspec-expectations (3.10.1)
-      diff-lcs (>= 1.2.0, < 2.0)
-      rspec-support (~> 3.10.0)
-    rspec-mocks (3.10.2)
-      diff-lcs (>= 1.2.0, < 2.0)
-      rspec-support (~> 3.10.0)
-    rspec-support (3.10.2)
-    rubocop (1.22.1)
-      parallel (~> 1.10)
-      parser (>= 3.0.0.0)
-      rainbow (>= 2.2.2, < 4.0)
-      regexp_parser (>= 1.8, < 3.0)
-      rexml
-      rubocop-ast (>= 1.12.0, < 2.0)
-      ruby-progressbar (~> 1.7)
-      unicode-display_width (>= 1.4.0, < 3.0)
-    rubocop-ast (1.12.0)
-      parser (>= 3.0.1.1)
-    ruby-progressbar (1.11.0)
-    unicode-display_width (2.1.0)
-
-PLATFORMS
-  ruby
-
-DEPENDENCIES
-  distributed_job!
-  rake (~> 12.0)
-  rspec (~> 3.0)
-  rubocop
-
-BUNDLED WITH
-   2.1.4