cloudtasker 0.12.rc2 → 0.12.rc7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 11c4d8d88b554792e888929324e79521426f5484973253a44c50a97849a5842c
- data.tar.gz: 1f10788d0ced509c82eea06ddfbd0b1d47a5731507c8bac5c597f41cc13b2d98
+ metadata.gz: 96524dc4a6825a3760a462277f7b0a710d779521e759bce44374ea044dbd649d
+ data.tar.gz: 908f1497b7ad316549f4f347c363b183181f0e0a17a92bda3f50ac9c34e2247c
  SHA512:
- metadata.gz: abe6437f179edd589afb88ed970c7870377c836446ba6b4169ffd60565be15fcf92557dd71965795579f58ba19af63a9225ad55cf8c97bb8ab5612e31d5c8bdf
- data.tar.gz: 23d17b1fade5f62594250b239b07e603bc8d8f0862465dd28ec3efc6b260cb86774e88d3b9aead4704eff1d071a3dda2f35d9d68302c6dcef56a1d8b707caffe
+ metadata.gz: e18579d27321e09f1e997ad3fdd3e3e0d1d0c344cd5fc553b64f665937fbe510cd11957381eee1fee5c100e2f14d8390434f26c60307fec4ecf4a681b15885d8
+ data.tar.gz: 3b688838f1a518f2f5c7b914a590f035d8a828c2c2b67838173a1712b8765e2ce34cddb8ddf1214aaefc26ce8ec43b7cc1dbe4eb450ccb24feb9e06bcd19bac7
data/.rubocop.yml CHANGED
@@ -6,7 +6,7 @@ AllCops:
  - 'vendor/**/*'

  Metrics/ClassLength:
- Max: 150
+ Max: 200

  Metrics/ModuleLength:
  Max: 150
data/CHANGELOG.md CHANGED
@@ -1,15 +1,19 @@
  # Changelog

- ## Latest RC [v0.12.rc1](https://github.com/keypup-io/cloudtasker/tree/v0.12.rc1) (2021-03-11)
+ ## Latest RC [v0.12.rc7](https://github.com/keypup-io/cloudtasker/tree/v0.12.rc7) (2021-03-31)

- [Full Changelog](https://github.com/keypup-io/cloudtasker/compare/v0.11.0...v0.12.rc1)
+ [Full Changelog](https://github.com/keypup-io/cloudtasker/compare/v0.11.0...v0.12.rc7)

  **Improvements:**
  - ActiveJob: do not double log errors (ActiveJob has its own error logging)
+ - Cron jobs: Use Redis Sets instead of key pattern matching for resource listing
  - Error logging: Use worker logger so as to include context (job args etc.)
  - Error logging: Do not log exception and stack trace separately, combine them instead.
  - Batch callbacks: Retry jobs when completion callback fails
- - Redis: Use Redis Sets instead of key pattern matching for listing methods (Cron jobs and Local Server)
+ - Batch state: use native Redis hashes to store batch state instead of a serialized hash in a string key
+ - Batch progress: restrict calculation to direct children by default. Allow depth to be specified. Calculating progress using all tree jobs created significant delays on large batches.
+ - Local server: Use Redis Sets instead of key pattern matching for resource listing
+ - Worker: raise DeadWorkerError instead of MissingWorkerArgumentsError when arguments are missing. This is more consistent with what middlewares expect.

  **Fixed bugs:**
  - Retries: Enforce job retry limit on job processing. There was an edge case where jobs could be retried indefinitely on batch callback errors.
data/README.md CHANGED
@@ -136,7 +136,7 @@ That's it! Your job was picked up by the Cloudtasker local server and sent for p
  Now jump to the next section to configure your app to use Google Cloud Tasks as a backend.

  ## Get started with Rails & ActiveJob
- **Note**: ActiveJob is supported since `0.11.0`
+ **Note**: ActiveJob is supported since `0.11.0`
  **Note**: Cloudtasker extensions (cron, batch and unique jobs) are not available when using cloudtasker via ActiveJob.

  Cloudtasker is pre-integrated with ActiveJob. Follow the steps below to get started.
@@ -19,7 +19,7 @@ module Cloudtasker
  # Process payload
  WorkerHandler.execute_from_payload!(payload)
  head :no_content
- rescue DeadWorkerError, MissingWorkerArgumentsError
+ rescue DeadWorkerError
  # 205: job will NOT be retried
  head :reset_content
  rescue InvalidWorkerError
data/docs/BATCH_JOBS.md CHANGED
@@ -84,8 +84,29 @@ You can access progression statistics in callback using `batch.progress`. See th
  E.g.
  ```ruby
  def on_batch_node_complete(_child_job)
- logger.info("Total: #{batch.progress.total}")
- logger.info("Completed: #{batch.progress.completed}")
- logger.info("Progress: #{batch.progress.percent.to_i}%")
+ progress = batch.progress
+ logger.info("Total: #{progress.total}")
+ logger.info("Completed: #{progress.completed}")
+ logger.info("Progress: #{progress.percent.to_i}%")
+ end
+ ```
+
+ **Since:** `v0.12.rc5`
+ By default the `progress` method only considers the direct child jobs to evaluate the batch progress. You can pass a `depth` option (e.g. `depth: 1`) to the `progress` method to calculate the batch progress in a more granular way. Be aware, however, that this recursively evaluates progress on the sub-batches and is therefore more expensive.
+
+ E.g.
+ ```ruby
+ def on_batch_node_complete(_child_job)
+ # Considers the children for batch progress calculation
+ progress_0 = batch.progress # same as batch.progress(depth: 0)
+
+ # Considers the children and grand-children for batch progress calculation
+ progress_1 = batch.progress(depth: 1)
+
+ # Considers the children, grand-children and grand-grand-children for batch progress calculation
+ progress_2 = batch.progress(depth: 2)
+
+ logger.info("Progress: #{progress_1.percent.to_i}%")
+ logger.info("Progress: #{progress_2.percent.to_i}%")
  end
  ```
@@ -178,6 +178,7 @@ module Cloudtasker
  schedule_time: (Time.now + interval).to_i,
  queue: queue
  )
+ redis.sadd(self.class.key, id)
  end

  #
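
Note: per the changelog entry above, cron schedule ids are now tracked in a Redis set, so listing them is a single set read rather than a scan of the keyspace for matching keys. A minimal standalone sketch of the pattern using redis-rb (the set key name and schedule id below are made up for illustration, not cloudtasker's actual values):

```ruby
require 'redis'

redis = Redis.new

# Register a schedule id in the set (mirrors the `sadd` call added above)
redis.sadd('cloudtasker/cron_schedules', 'my_schedule_id')

# Listing ids is now a single set read...
ids = redis.smembers('cloudtasker/cron_schedules')

# ...instead of scanning the keyspace for a key pattern, e.g.:
# redis.scan_each(match: 'cloudtasker/cron_schedules/*').to_a

puts ids.inspect
```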
@@ -17,6 +17,10 @@ module Cloudtasker
  # because the jobs will be either retried or dropped
  IGNORED_ERRORED_CALLBACKS = %i[on_child_error on_child_dead].freeze

+ # The maximum number of seconds to wait for a batch state lock
+ # to be acquired.
+ BATCH_MAX_LOCK_WAIT = 60
+
  #
  # Return the cloudtasker redis client
  #
@@ -176,7 +180,9 @@ module Cloudtasker
  # @return [Hash] The state of each child worker.
  #
  def batch_state
- redis.fetch(batch_state_gid)
+ migrate_batch_state_to_redis_hash
+
+ redis.hgetall(batch_state_gid)
  end

  #
@@ -208,6 +214,24 @@ module Cloudtasker
  )
  end

+ #
+ # This method migrates the batch state to be a Redis hash instead
+ # of a hash stored in a string key.
+ #
+ def migrate_batch_state_to_redis_hash
+ return unless redis.type(batch_state_gid) == 'string'
+
+ # Migrate batch state to Redis hash if it is still using a legacy string key
+ # We acquire a lock then check again
+ redis.with_lock(batch_state_gid, max_wait: BATCH_MAX_LOCK_WAIT) do
+ if redis.type(batch_state_gid) == 'string'
+ state = redis.fetch(batch_state_gid)
+ redis.del(batch_state_gid)
+ redis.hset(batch_state_gid, state) if state.any?
+ end
+ end
+ end
+
  #
  # Save the batch.
  #
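
For context on the migration above, here is a rough standalone illustration of the two storage layouts it bridges: the legacy layout (a serialized hash in a plain string key) versus the new native Redis hash, which supports per-field updates without rewriting the whole blob. Key and field names are made up for illustration:

```ruby
require 'redis'
require 'json'

redis = Redis.new
key = 'example/batch-state' # hypothetical key, not cloudtasker's naming

# Legacy layout: the whole state serialized into one string key
redis.set(key, { 'job-1' => 'scheduled', 'job-2' => 'completed' }.to_json)
redis.type(key) # => "string"

# New layout: a native Redis hash, updatable one field at a time
redis.del(key)
redis.hset(key, 'job-1', 'scheduled')
redis.hset(key, 'job-2', 'completed')
redis.type(key)                       # => "hash"
redis.hset(key, 'job-1', 'completed') # per-child update, no full rewrite
redis.hgetall(key)                    # => {"job-1"=>"completed", "job-2"=>"completed"}
```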
@@ -218,8 +242,11 @@ module Cloudtasker
  # complete (success or failure).
  redis.write(batch_gid, worker.to_h)

+ # Stop there if no jobs to save
+ return if jobs.empty?
+
  # Save list of child workers
- redis.write(batch_state_gid, jobs.map { |e| [e.job_id, 'scheduled'] }.to_h)
+ redis.hset(batch_state_gid, jobs.map { |e| [e.job_id, 'scheduled'] }.to_h)
  end

  #
@@ -228,28 +255,27 @@ module Cloudtasker
  # @param [String] job_id The batch id.
  # @param [String] status The status of the sub-batch.
  #
- # @return [<Type>] <description>
- #
  def update_state(batch_id, status)
- redis.with_lock(batch_state_gid) do
- state = batch_state
- state[batch_id.to_sym] = status.to_s if state.key?(batch_id.to_sym)
- redis.write(batch_state_gid, state)
+ migrate_batch_state_to_redis_hash
+
+ # Update the batch state batch_id entry with the new status
+ redis.with_lock("#{batch_state_gid}/#{batch_id}", max_wait: BATCH_MAX_LOCK_WAIT) do
+ redis.hset(batch_state_gid, batch_id, status) if redis.hexists(batch_state_gid, batch_id)
  end
  end

  #
  # Return true if all the child workers have completed.
  #
- # @return [<Type>] <description>
+ # @return [Boolean] True if the batch is complete.
  #
  def complete?
- redis.with_lock(batch_state_gid) do
- state = redis.fetch(batch_state_gid)
- return true unless state
+ migrate_batch_state_to_redis_hash

+ # Check that all child jobs have completed
+ redis.with_lock(batch_state_gid, max_wait: BATCH_MAX_LOCK_WAIT) do
  # Check that all children are complete
- state.values.all? { |e| COMPLETION_STATUSES.include?(e) }
+ redis.hvals(batch_state_gid).all? { |e| COMPLETION_STATUSES.include?(e) }
  end
  end

@@ -331,11 +357,10 @@ module Cloudtasker
  # Remove all batch and sub-batch keys from Redis.
  #
  def cleanup
- # Capture batch state
- state = batch_state
+ migrate_batch_state_to_redis_hash

  # Delete child batches recursively
- state.to_h.keys.each { |id| self.class.find(id)&.cleanup }
+ redis.hkeys(batch_state_gid).each { |id| self.class.find(id)&.cleanup }

  # Delete batch redis entries
  redis.del(batch_gid)
@@ -347,13 +372,20 @@ module Cloudtasker
  #
  # @return [Cloudtasker::Batch::BatchProgress] The batch progress.
  #
- def progress
+ def progress(depth: 0)
+ depth = depth.to_i
+
  # Capture batch state
  state = batch_state

- # Sum batch progress of current batch and all sub-batches
+ # Return immediately if we do not need to go down the tree
+ return BatchProgress.new(state) if depth <= 0
+
+ # Sum batch progress of current batch and sub-batches up to the specified
+ # depth
  state.to_h.reduce(BatchProgress.new(state)) do |memo, (child_id, child_status)|
- memo + (self.class.find(child_id)&.progress || BatchProgress.new(child_id => child_status))
+ memo + (self.class.find(child_id)&.progress(depth: depth - 1) ||
+ BatchProgress.new(child_id => child_status))
  end
  end

@@ -395,7 +427,7 @@ module Cloudtasker
  # Perform job
  yield

- # Save batch (if child worker has been enqueued)
+ # Save batch (if child workers have been enqueued)
  setup

  # Complete batch
@@ -75,14 +75,18 @@ module Cloudtasker
  # end
  #
  # @param [String] cache_key The cache key to access.
+ # @param [Integer] max_wait The number of seconds after which the lock will be cleared anyway.
  #
- def with_lock(cache_key)
+ def with_lock(cache_key, max_wait: nil)
  return nil unless cache_key

+ # Set max wait
+ max_wait = (max_wait || LOCK_DURATION).to_i
+
  # Wait to acquire lock
  lock_key = [LOCK_KEY_PREFIX, cache_key].join('/')
  client.with do |conn|
- sleep(LOCK_WAIT_DURATION) until conn.set(lock_key, true, nx: true, ex: LOCK_DURATION)
+ sleep(LOCK_WAIT_DURATION) until conn.set(lock_key, true, nx: true, ex: max_wait)
  end

  # yield content
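
The lock above is the usual Redis `SET NX EX` spin lock; the new `max_wait` argument is used as the lock expiry (`EX`), so a lock held by a crashed process is cleared after at most that many seconds. A simplified standalone sketch of the pattern (the constant value, key prefix and method shape are assumptions for illustration, not cloudtasker's actual implementation):

```ruby
require 'redis'

LOCK_WAIT_DURATION = 0.03 # seconds between acquisition attempts (value assumed)

def with_lock(redis, cache_key, max_wait: 60)
  lock_key = "example/lock/#{cache_key}"

  # Spin until SET NX succeeds; EX self-expires the lock after max_wait
  # seconds, so a holder that dies cannot block other processes forever.
  sleep(LOCK_WAIT_DURATION) until redis.set(lock_key, true, nx: true, ex: max_wait)

  yield
ensure
  redis.del(lock_key)
end

redis = Redis.new
with_lock(redis, 'some-batch-state', max_wait: 60) do
  # critical section: read and update the shared state here
end
```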
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Cloudtasker
- VERSION = '0.12.rc2'
+ VERSION = '0.12.rc7'
  end
@@ -332,6 +332,22 @@ module Cloudtasker
  job_retries > job_max_retries
  end

+ #
+ # Return true if the job arguments are missing.
+ #
+ # This may happen if a job
+ # was successfully run but retried due to Cloud Task dispatch deadline
+ # exceeded. If the arguments were stored in Redis then they may have
+ # been flushed already after the successful completion.
+ #
+ # If job arguments are missing then the job will simply be declared dead.
+ #
+ # @return [Boolean] True if the arguments are missing.
+ #
+ def arguments_missing?
+ job_args.empty? && [0, -1].exclude?(method(:perform).arity)
+ end
+
  #
  # Return the time taken (in seconds) to perform the job. This duration
  # includes the middlewares and the actual perform method.
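
The `arguments_missing?` check above relies on `Method#arity`: a `perform` with no parameters has arity `0` and a splat-only `perform(*args)` has arity `-1`, so only workers whose `perform` actually requires arguments are flagged when `job_args` is empty (`exclude?` comes from ActiveSupport, which the gem already depends on). A plain-Ruby illustration with made-up method names:

```ruby
def perform_none; end            # no arguments
def perform_splat(*args); end    # splat only
def perform_args(id, name); end  # required arguments

method(:perform_none).arity  # => 0   -> empty job_args are acceptable
method(:perform_splat).arity # => -1  -> empty job_args are acceptable
method(:perform_args).arity  # => 2   -> empty job_args mean the payload is gone
```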
@@ -384,14 +400,9 @@ module Cloudtasker
  Cloudtasker.config.server_middleware.invoke(self) do
  # Immediately abort the job if it is already dead
  flag_as_dead if job_dead?
+ flag_as_dead(MissingWorkerArgumentsError.new('worker arguments are missing')) if arguments_missing?

  begin
- # Abort if arguments are missing. This may happen with redis arguments storage
- # if Cloud Tasks times out on a job but the job still succeeds
- if job_args.empty? && [0, -1].exclude?(method(:perform).arity)
- raise(MissingWorkerArgumentsError, 'worker arguments are missing')
- end
-
  # Perform the job
  perform(*job_args)
  rescue StandardError => e
@@ -107,7 +107,7 @@ module Cloudtasker
  redis.expire(args_payload_key, ARGS_PAYLOAD_CLEANUP_TTL) if args_payload_key && !worker.job_reenqueued

  resp
- rescue DeadWorkerError, MissingWorkerArgumentsError => e
+ rescue DeadWorkerError => e
  # Delete stored args payload if job is dead
  redis.expire(args_payload_key, ARGS_PAYLOAD_CLEANUP_TTL) if args_payload_key
  log_execution_error(worker, e)
@@ -51,6 +51,26 @@ module Cloudtasker
  Cloudtasker.logger
  end

+ #
+ # Format the log message as string.
+ #
+ # @param [Object] msg The log message or object.
+ #
+ # @return [String] The formatted message
+ #
+ def formatted_message_as_string(msg)
+ # Format message
+ msg_content = if msg.is_a?(Exception)
+ [msg.inspect, msg.backtrace].flatten(1).join("\n")
+ elsif msg.is_a?(String)
+ msg
+ else
+ msg.inspect
+ end
+
+ "[Cloudtasker][#{worker.class}][#{worker.job_id}] #{msg_content}"
+ end
+
  #
  # Format main log message.
  #
@@ -59,7 +79,12 @@ module Cloudtasker
  # @return [String] The formatted log message
  #
  def formatted_message(msg)
- "[Cloudtasker][#{worker.class}][#{worker.job_id}] #{msg}"
+ if msg.is_a?(String)
+ formatted_message_as_string(msg)
+ else
+ # Delegate object formatting to logger
+ msg
+ end
  end

  #
@@ -147,7 +172,9 @@ module Cloudtasker
  # ActiveSupport::Logger does not support passing a payload through a block on top
  # of a message.
  if defined?(ActiveSupport::Logger) && logger.is_a?(ActiveSupport::Logger)
- logger.send(level) { "#{formatted_message(msg)} -- #{payload_block.call}" }
+ # The logger is fairly basic in terms of formatting. All inputs get converted
+ # as regular strings.
+ logger.send(level) { "#{formatted_message_as_string(msg)} -- #{payload_block.call}" }
  else
  logger.send(level, formatted_message(msg), &payload_block)
  end
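
The `formatted_message_as_string` helper added above flattens an exception into its `inspect` output followed by its backtrace, which is what the changelog means by combining the exception and stack trace into a single log entry. A rough illustration of the resulting string (worker class and job id are made up):

```ruby
error = StandardError.new('boom')
error.set_backtrace(['app/workers/my_worker.rb:10', 'lib/runner.rb:5'])

# Same formatting logic as the diff: inspect + backtrace joined on newlines
msg_content = [error.inspect, error.backtrace].flatten(1).join("\n")
puts "[Cloudtasker][MyWorker][some-job-id] #{msg_content}"
# [Cloudtasker][MyWorker][some-job-id] #<StandardError: boom>
# app/workers/my_worker.rb:10
# lib/runner.rb:5
```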
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: cloudtasker
  version: !ruby/object:Gem::Version
- version: 0.12.rc2
+ version: 0.12.rc7
  platform: ruby
  authors:
  - Arnaud Lachaume
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2021-03-12 00:00:00.000000000 Z
+ date: 2021-03-31 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activesupport