distributed_job 3.0.1 → 3.1.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +5 -0
- data/README.md +35 -14
- data/lib/distributed_job/job.rb +42 -0
- data/lib/distributed_job/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 16178ecd755b5a3b3666711c23d6a29c87ec93ac04027df87b8e580c9f974599
+  data.tar.gz: c469b7af9c967395e6cf1fea21aac21d0cad14ac92a59eb8320c1cf601460c65
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 534bb12b74aa562af37d913164ffe134084b417cc5f1e51663682181bffcabb883501dcfc0bdeb8a2d60c9a95bfe872a3a6efad0c7ac015c27dcf3109d941643
+  data.tar.gz: 74a0c64a0f31e00e17a7e977a201b7a6c8f51c013960b83068023c4f3c7bddcb161bab9bef9abe005cf38407c2c3c35482d59753c20578d6cdbed4c62137a8a2
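These digests cover the `metadata.gz` and `data.tar.gz` archives packed inside the `.gem` file, which is itself a plain tar archive. If you want to check a downloaded copy against the published values, a minimal sketch using only the Ruby standard library (the expected digest is the new `data.tar.gz` SHA256 from the diff above; the local file name is an assumption):

```ruby
require 'digest'
require 'rubygems/package'

# Assumes the gem was fetched locally, e.g. via: gem fetch distributed_job -v 3.1.0
EXPECTED_SHA256 = 'c469b7af9c967395e6cf1fea21aac21d0cad14ac92a59eb8320c1cf601460c65'

File.open('distributed_job-3.1.0.gem', 'rb') do |file|
  Gem::Package::TarReader.new(file) do |tar|
    tar.each do |entry|
      next unless entry.full_name == 'data.tar.gz'

      actual = Digest::SHA256.hexdigest(entry.read)
      puts(actual == EXPECTED_SHA256 ? 'data.tar.gz: OK' : 'data.tar.gz: MISMATCH')
    end
  end
end
```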
data/CHANGELOG.md
CHANGED
data/README.md
CHANGED
@@ -40,28 +40,28 @@ You can specify a `namespace` to be additionally used for redis keys and set a
 every time when keys in redis are updated to guarantee that the distributed
 job metadata is cleaned up properly from redis at some point in time.
 
-Afterwards,
-
+Afterwards, you have two options to add parts, i.e. units of work, to the
+distributed job. The first option is to use `#push_all` and pass an enum:
 
 ```ruby
 distributed_job = DistributedJobClient.build(token: SecureRandom.hex)
+distributed_job.push_all(["job1", "job2", "job3"])
 
-
-
-
+Job1.perform_async(distributed_job.token)
+Job2.perform_async(distributed_job.token)
+Job3.perform_async(distributed_job.token)
 
 distributed_job.token # can be used to query the status of the distributed job
 ```
 
-
-
-
-
-
-
-
-
-or in the terminal, etc:
+Here, 3 parts named `job1`, `job2` and `job3` are added to the distributed job
+and then 3 corresponding background jobs are enqueued. It is important to push
+the parts before the background jobs are enqueued, since otherwise the
+background jobs may not be able to find them. The `token` must be passed to
+the background jobs, such that each background job can update the status of
+the distributed job by marking its respective part as done. The token can
+also be used to query the status of the distributed job, e.g. on a job summary
+page or similar, or to show a progress bar in the browser, the terminal, etc.
 
 ```ruby
 # token is given via URL or via some other means
@@ -71,8 +71,29 @@ distributed_job.total # total number of parts
 distributed_job.count # number of unfinished parts
 distributed_job.finished? # whether or not all parts are finished
 distributed_job.open_parts # returns all not yet finished part id's
+
+distributed_job.done('job1') # marks the respective part as done
+```
+
+The second option is to use `#push_each`:
+
+```ruby
+distributed_job = DistributedJobClient.build(token: SecureRandom.hex)
+
+distributed_job.push_each(Date.parse('2021-01-01')..Date.today) do |date, part|
+  SomeBackgroundJob.perform_async(date, distributed_job.token, part)
+end
+
+distributed_job.token # again, can be used to query the status of the distributed job
 ```
 
+Here, the part name is generated automatically and passed as `part` to the
+block. The part must also be passed to the respective background job for it
+to be able to mark the part as finished after it has been successfully
+processed. Therefore, when all those background jobs have successfully
+finished, all parts will be marked as finished, such that the distributed
+job will finally be finished as well.
+
 Within the background job, you must use the passed `token` and `part` to query
 and update the status of the distributed job and part accordingly. Please note
 that you can use whatever background job processing tool you like most.
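The README snippets above leave the worker side implicit. As a rough sketch of the contract they describe, here is what a matching Sidekiq worker could look like, reusing `SomeBackgroundJob` and `DistributedJobClient` from the examples above; the `process` call stands in for the actual unit of work and is purely hypothetical:

```ruby
class SomeBackgroundJob
  include Sidekiq::Worker

  def perform(date, token, part)
    # Re-attach to the distributed job via the token passed by the enqueuer.
    distributed_job = DistributedJobClient.build(token: token)

    process(date) # hypothetical placeholder for the real work

    # Mark this part as finished. Once the last part is done, the
    # distributed job as a whole reports finished? => true.
    distributed_job.done(part)
  end
end
```

Sidekiq is only one option here; as the README notes, any background job processor works the same way.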
data/lib/distributed_job/job.rb
CHANGED
@@ -1,6 +1,8 @@
 # frozen_string_literal: true
 
 module DistributedJob
+  class AlreadyClosed < StandardError; end
+
   # A `DistributedJob::Job` instance allows to keep track of a distributed job, i.e.
   # a job which is split into multiple units running in parallel and in multiple
   # workers using redis.
@@ -85,6 +87,8 @@ module DistributedJob
     #   end
 
     def push_each(enum)
+      raise(AlreadyClosed, 'The distributed job is already closed') if closed?
+
       previous_object = nil
       previous_index = nil
 
@@ -102,6 +106,35 @@ module DistributedJob
       yield(previous_object, previous_index.to_s) if previous_index
     end
 
+    # Pass an enum to be used to iterate all the units of work of the
+    # distributed job. The values of the enum are used for the names of the
+    # parts, such that values listed multiple times (duplicates) will only be
+    # added once to the distributed job. The distributed job needs to know all
+    # of them to keep track of the overall number and status of the parts.
+    # Passing an enum is much better compared to pushing the parts manually,
+    # because the distributed job needs to be closed before the last part of
+    # the distributed job is enqueued into some job queue. Otherwise it could
+    # potentially happen that the last part is already processed in the job
+    # queue before it is pushed to redis, such that the last job doesn't know
+    # that the distributed job is finished.
+    #
+    # @param enum [#each] The enum which can be iterated to get all
+    #   job parts
+    #
+    # @example
+    #   distributed_job.push_all(0..128)
+    #   distributed_job.push_all(['part1', 'part2', 'part3'])
+
+    def push_all(enum)
+      raise(AlreadyClosed, 'The distributed job is already closed') if closed?
+
+      enum.each do |part|
+        push(part)
+      end
+
+      close
+    end
+
     # Returns all parts of the distributed job which are not yet finished.
     #
     # @return [Enumerator] The enum which allows to iterate all parts
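Taken together with the README examples, the new guard means a distributed job can only be populated once. A brief illustration of the behavior implied by the diff above (client setup as in the README):

```ruby
distributed_job = DistributedJobClient.build(token: SecureRandom.hex)

# Pushes both parts and then closes the distributed job.
distributed_job.push_all(['part1', 'part2'])

# Any further push now fails, because the job is already closed.
distributed_job.push_all(['part3'])
# => raises DistributedJob::AlreadyClosed ('The distributed job is already closed')
```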
@@ -110,6 +143,15 @@ module DistributedJob
       redis.sscan_each("#{redis_key}:parts")
     end
 
+    # Returns whether or not the part is in the list of open parts of the
+    # distributed job.
+    #
+    # @return [Boolean] Returns true or false
+
+    def open_part?(part)
+      redis.sismember("#{redis_key}:parts", part.to_s)
+    end
+
     # Removes the specified part from the distributed job, i.e. from the set of
     # unfinished parts. Use this method when the respective job part has been
     # successfully processed, i.e. finished.
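One plausible use for the new `#open_part?` is making workers idempotent under at-least-once delivery: a redelivered job can skip a part that has already been marked done. A hedged sketch of such a worker method (the worker shape and the `process` helper are assumptions, not part of the gem):

```ruby
def perform(date, token, part)
  distributed_job = DistributedJobClient.build(token: token)

  # Skip parts that are no longer open, e.g. when the queue redelivers
  # a job whose part was already marked as done.
  return unless distributed_job.open_part?(part)

  process(date) # hypothetical unit of work

  distributed_job.done(part)
end
```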
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: distributed_job
 version: !ruby/object:Gem::Version
-  version: 3.0.1
+  version: 3.1.0
 platform: ruby
 authors:
 - Benjamin Vetter
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2022-
+date: 2022-09-22 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rspec