distributed_job 3.0.0 → 3.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 62f2efef7acceccdd2dc3339969cb9eed235f5e8df8367cdb88ddafdf00bc386
-  data.tar.gz: 90c9eb945bc7a3b0f4ec7526e2c87b953a866f7ea56972ecfe01dc296c325516
+  metadata.gz: 16178ecd755b5a3b3666711c23d6a29c87ec93ac04027df87b8e580c9f974599
+  data.tar.gz: c469b7af9c967395e6cf1fea21aac21d0cad14ac92a59eb8320c1cf601460c65
 SHA512:
-  metadata.gz: 7e919014940a31859cf709a00f1be04d8b4cc015f7cd068808645474687c47ca55f77f4c5b608f511137a7c3c5c1fefca1a8c99989ca51f911f4fd9d893b6c50
-  data.tar.gz: d02553e6c36caffdfdfa78940ccfdce1478a41e4750facee08b5829540b362582fb740e6cc612f853b545ed73e69ebfbbb1e7480d756c4a2fd4da4889962909b
+  metadata.gz: 534bb12b74aa562af37d913164ffe134084b417cc5f1e51663682181bffcabb883501dcfc0bdeb8a2d60c9a95bfe872a3a6efad0c7ac015c27dcf3109d941643
+  data.tar.gz: 74a0c64a0f31e00e17a7e977a201b7a6c8f51c013960b83068023c4f3c7bddcb161bab9bef9abe005cf38407c2c3c35482d59753c20578d6cdbed4c62137a8a2
data/.rubocop.yml CHANGED
@@ -3,6 +3,9 @@ AllCops:
   TargetRubyVersion: 2.5
   SuggestExtensions: false
 
+Gemspec/RequireMFA:
+  Enabled: false
+
 Metrics/MethodLength:
   Enabled: false
 
data/CHANGELOG.md CHANGED
@@ -1,5 +1,14 @@
 # CHANGELOG
 
+## v3.1.0
+
+* Added `DistributedJob::Job#push_all`
+* Added `DistributedJob::Job#open_part?`
+
+## v3.0.1
+
+* Fix pipelining with regards to redis-rb 4.6.0
+
 ## v3.0.0
 
 * Split `DistributedJob` in `DistributedJob::Client` and `DistributedJob::Job`
data/README.md CHANGED
@@ -40,28 +40,28 @@ You can specify a `namespace` to be additionally used for redis keys and set a
 every time when keys in redis are updated to guarantee that the distributed
 job metadata is cleaned up properly from redis at some point in time.
 
-Afterwards, to create a distributed job and add parts, i.e. units of work, to
-it, simply do:
+Afterwards, you have two options to add parts, i.e. units of work, to the
+distributed job. The first option is to use `#push_all` and pass an enum:
 
 ```ruby
 distributed_job = DistributedJobClient.build(token: SecureRandom.hex)
+distributed_job.push_all(["job1", "job2", "job3"])
 
-distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
-  SomeBackgroundJob.perform_async(date, distributed_job.token, part)
-end
+Job1.perform_async(distributed_job.token)
+Job2.perform_async(distributed_job.token)
+Job3.perform_async(distributed_job.token)
 
 distributed_job.token # can be used to query the status of the distributed job
 ```
 
-The `part` which is passed to the block is some id for one particular part of
-the distributed job. It must be used in a respective background job to mark
-this part finished after it has been successfully processed. Therefore, when
-all those background jobs have successfully finished, all parts will be marked
-as finished, such that the distributed job will finally be finished as well.
-
-The `token` can also be used to query the status of the distributed job, e.g.
-on a job summary page or similar. You can show some progress bar in the browser
-or in the terminal, etc:
+Here, 3 parts named `job1`, `job2` and `job3` are added to the distributed job
+and then 3 corresponding background jobs are enqueued. It is important to push
+the parts before the background jobs are enqueued; otherwise, the background
+jobs may not be able to find them. The `token` must be passed to the background
+jobs, such that each background job can update the status of the distributed
+job by marking its respective part as done. The token can also be used to query
+the status of the distributed job, e.g. on a job summary page or similar. You
+can also show some progress bar in the browser or in the terminal, etc.
 
 ```ruby
 # token is given via URL or via some other means
@@ -71,8 +71,29 @@ distributed_job.total # total number of parts
 distributed_job.count # number of unfinished parts
 distributed_job.finished? # whether or not all parts are finished
 distributed_job.open_parts # returns all not yet finished part id's
+
+distributed_job.done('job1') # marks the respective part as done
 ```
 
+The second option is to use `#push_each`:
+
+```ruby
+distributed_job = DistributedJobClient.build(token: SecureRandom.hex)
+
+distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
+  SomeBackgroundJob.perform_async(date, distributed_job.token, part)
+end
+
+distributed_job.token # again, can be used to query the status of the distributed job
+```
+
+Here, the part name is automatically generated to be some id and passed as
+`part` to the block. The part must also be passed to the respective background
+job for it to be able to mark the part as finished after it has been
+successfully processed. Therefore, when all those background jobs have
+successfully finished, all parts will be marked as finished, such that the
+distributed job will finally be finished as well.
+
 Within the background job, you must use the passed `token` and `part` to query
 and update the status of the distributed job and part accordingly. Please note
 that you can use whatever background job processing tool you like most.
@@ -86,9 +107,7 @@ class SomeBackgroundJob
 
 # ...
 
-distributed_job.done(part)
-
-if distributed_job.finished?
+if distributed_job.done(part)
   # perform e.g. cleanup or some other job
 end
 rescue
@@ -101,10 +120,9 @@ end
 
 The `#stop` and `#stopped?` methods can be used to globally stop a distributed
 job in case of errors. In contrast, the `#done` method tells the distributed job
-that the specified part has successfully finished. Finally, the `#finished?`
-method returns true when all parts of the distributed job are finished, which
-is useful to start cleanup jobs or to even start another subsequent distributed
-job.
+that the specified part has successfully finished. The `#done` method returns
+true when all parts of the distributed job have finished, which is useful to
+start cleanup jobs or even to start another subsequent distributed job.
 
 That's it.
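The part-tracking life cycle described above can be modeled without redis. Below is a minimal in-memory sketch of the documented semantics only, not the gem's implementation; `FakeDistributedJob` is a made-up name. It illustrates how `#push_all` closes the job after adding all parts and how the new return value of `#done` signals completion:

```ruby
require 'set'

# Illustration only: an in-memory stand-in (no redis) for the part-tracking
# semantics described in the README. Not the gem's implementation.
class FakeDistributedJob
  def initialize
    @parts = Set.new
    @closed = false
  end

  # Adds every value of the enum as a part (as a string, duplicates collapse),
  # then closes the job, mirroring the documented behaviour of #push_all.
  def push_all(enum)
    raise 'already closed' if @closed

    enum.each { |part| @parts.add(part.to_s) }
    @closed = true
  end

  def open_part?(part)
    @parts.include?(part.to_s)
  end

  def count
    @parts.size
  end

  # Removes the part and returns true only when it was the last open part,
  # mirroring the new return value of #done.
  def done(part)
    @parts.delete(part.to_s)
    @parts.empty?
  end
end

job = FakeDistributedJob.new
job.push_all(%w[job1 job2 job3])
job.done('job1') # => false
job.done('job2') # => false
job.done('job3') # => true, the distributed job is finished
```

The point of the model: a worker no longer needs a separate `#finished?` check, because the truthiness of `#done` already answers "was that the last part?".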
@@ -29,5 +29,5 @@ Gem::Specification.new do |spec|
 
   spec.add_development_dependency 'rspec'
   spec.add_development_dependency 'rubocop'
-  spec.add_dependency 'redis'
+  spec.add_dependency 'redis', '>= 4.1.0'
 end
@@ -1,6 +1,8 @@
 # frozen_string_literal: true
 
 module DistributedJob
+  class AlreadyClosed < StandardError; end
+
   # A `DistributedJob::Job` instance allows you to keep track of a distributed job, i.e.
   # a job which is split into multiple units running in parallel and in multiple
   # workers using redis.
@@ -24,9 +26,7 @@ module DistributedJob
   #
   #   # ...
   #
-  #   distributed_job.done(part)
-  #
-  #   if distributed_job.finished?
+  #   if distributed_job.done(part)
   #     # perform e.g. cleanup or some other job
   #   end
   # rescue
@@ -87,6 +87,8 @@ module DistributedJob
   # end
 
   def push_each(enum)
+    raise(AlreadyClosed, 'The distributed job is already closed') if closed?
+
     previous_object = nil
     previous_index = nil
 
@@ -104,6 +106,35 @@
     yield(previous_object, previous_index.to_s) if previous_index
   end
 
+  # Pass an enum to be used to iterate all the units of work of the
+  # distributed job. The values of the enum are used for the names of the
+  # parts, such that values listed multiple times (duplicates) will only be
+  # added once to the distributed job. The distributed job needs to know all
+  # of them to keep track of the overall number and status of the parts.
+  # Passing an enum is much better than pushing the parts manually, because
+  # the distributed job needs to be closed before the last part of the
+  # distributed job is enqueued into some job queue. Otherwise it could
+  # potentially happen that the last part is already processed in the job
+  # queue before it is pushed to redis, such that the last job doesn't know
+  # that the distributed job is finished.
+  #
+  # @param enum [#each] The enum which can be iterated to get all
+  #   job parts
+  #
+  # @example
+  #   distributed_job.push_all(0..128)
+  #   distributed_job.push_all(['part1', 'part2', 'part3'])
+
+  def push_all(enum)
+    raise(AlreadyClosed, 'The distributed job is already closed') if closed?
+
+    enum.each do |part|
+      push(part)
+    end
+
+    close
+  end
+
   # Returns all parts of the distributed job which are not yet finished.
   #
   # @return [Enumerator] The enum which allows to iterate all parts
@@ -112,6 +143,15 @@
     redis.sscan_each("#{redis_key}:parts")
   end
 
+  # Returns whether or not the part is in the list of open parts of the
+  # distributed job.
+  #
+  # @return [Boolean] Returns true or false
+
+  def open_part?(part)
+    redis.sismember("#{redis_key}:parts", part.to_s)
+  end
+
   # Removes the specified part from the distributed job, i.e. from the set of
   # unfinished parts. Use this method when the respective job part has been
   # successfully processed, i.e. finished.
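The new `#open_part?` lends itself to an idempotency guard in a worker: skip the expensive work if the part was already marked done, e.g. by a retried run. A hedged sketch follows; `SomeBackgroundJob` mirrors the README's illustrative name, and the job object is deliberately duck-typed (anything responding to `#open_part?` and `#done`) so the sketch does not need a live redis connection — in production it would be a `DistributedJob::Job`:

```ruby
# Sketch only: SomeBackgroundJob mirrors the README's illustrative worker.
# The distributed_job argument is duck-typed so the guard works against any
# object responding to #open_part? and #done (e.g. a DistributedJob::Job).
class SomeBackgroundJob
  def perform(distributed_job, part)
    # Skip if the part was already marked done (e.g. by a retried run).
    return :skipped unless distributed_job.open_part?(part)

    # ... the actual unit of work would go here ...

    if distributed_job.done(part)
      :all_finished # last part: trigger cleanup or a follow-up job
    else
      :done
    end
  end
end
```

The symbols returned here are purely for illustration; a real worker would enqueue a cleanup job in the `:all_finished` branch instead.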
@@ -199,11 +239,11 @@
   #   end
 
   def stop
-    redis.multi do
-      redis.hset("#{redis_key}:state", 'stopped', 1)
+    redis.multi do |transaction|
+      transaction.hset("#{redis_key}:state", 'stopped', 1)
 
-      redis.expire("#{redis_key}:state", ttl)
-      redis.expire("#{redis_key}:parts", ttl)
+      transaction.expire("#{redis_key}:state", ttl)
+      transaction.expire("#{redis_key}:parts", ttl)
     end
 
     true
@@ -245,11 +285,11 @@
   end
 
   def close
-    redis.multi do
-      redis.hset("#{redis_key}:state", 'closed', 1)
+    redis.multi do |transaction|
+      transaction.hset("#{redis_key}:state", 'closed', 1)
 
-      redis.expire("#{redis_key}:state", ttl)
-      redis.expire("#{redis_key}:parts", ttl)
+      transaction.expire("#{redis_key}:state", ttl)
+      transaction.expire("#{redis_key}:parts", ttl)
    end
 
     true
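The two hunks above reflect the redis-rb change (deprecated in 4.6, removed in 5.x) where commands inside a `multi` block must be sent on the object yielded to the block rather than on the outer client. A tiny recorder stub, a made-up `FakeRedis`, not the real redis-rb API, sketches the call shape without needing a redis server:

```ruby
# Illustration only: FakeRedis is a made-up recorder stub, not redis-rb.
# It mimics the call shape used above: commands go through the yielded
# transaction object and are queued until the block returns.
class FakeRedis
  attr_reader :queued

  def initialize
    @queued = []
  end

  # Yields self (standing in for redis-rb's MultiConnection) and returns
  # the queued commands, as a stand-in for the replies of an atomic EXEC.
  def multi
    yield self
    @queued
  end

  def hset(key, field, value)
    @queued << [:hset, key, field, value]
  end

  def expire(key, ttl)
    @queued << [:expire, key, ttl]
  end
end

redis = FakeRedis.new

redis.multi do |transaction|
  transaction.hset('job:state', 'stopped', 1)
  transaction.expire('job:state', 86_400)
  transaction.expire('job:parts', 86_400)
end

redis.queued.size # => 3, all three commands sent together
```

With the real client, calling `redis.hset(...)` on the outer client inside the block would issue the command immediately instead of queuing it in the transaction, which is exactly what the "Fix pipelining with regards to redis-rb 4.6.0" changelog entry addresses.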
@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 
 module DistributedJob
-  VERSION = '3.0.0'
+  VERSION = '3.1.0'
 end
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: distributed_job
 version: !ruby/object:Gem::Version
-  version: 3.0.0
+  version: 3.1.0
 platform: ruby
 authors:
 - Benjamin Vetter
-autorequire:
+autorequire:
 bindir: exe
 cert_chain: []
-date: 2021-11-02 00:00:00.000000000 Z
+date: 2022-09-22 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rspec
@@ -44,14 +44,14 @@ dependencies:
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: '0'
+      version: 4.1.0
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: '0'
+      version: 4.1.0
 description: Keep track of distributed jobs spanning multiple workers using redis
 email:
 - benjamin.vetter@wlw.de
@@ -83,7 +83,7 @@ metadata:
   homepage_uri: https://github.com/mrkamel/distributed_job
   source_code_uri: https://github.com/mrkamel/distributed_job
   changelog_uri: https://github.com/mrkamel/distributed_job/blob/master/CHANGELOG.md
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -98,8 +98,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
   - !ruby/object:Gem::Version
     version: '0'
 requirements: []
-rubygems_version: 3.0.3
-signing_key:
+rubygems_version: 3.3.3
+signing_key:
 specification_version: 4
 summary: Keep track of distributed jobs using redis
 test_files: []