cloudtasker 0.10.rc3 → 0.10.rc8

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 4b2692b6e4fe2fbd98cf70fb2fe7a2027992d83be63188108113d523cf410160
- data.tar.gz: 892910f2b59d84e2f516478926fefb63bd913bb4db4507133cde614eca933482
+ metadata.gz: 9d8d250dbd0e47a0e1f102a131dd831d44d4db4272c5aa429caa0c27d00cf4f7
+ data.tar.gz: 4e6ea0c77b8be277e74e1a54d227389fd61c9568cedddcdbb94d3fc9c1a6788e
  SHA512:
- metadata.gz: 3f7a4bc94036af569901cde01fe7596acc9eea3019d9216db033de2833eb471668f231c6dbf00ac231247f599291fdd0bec0ed4eb700cf06521b01ea2a753739
- data.tar.gz: f37e39c7ce6fff9a755a457694f5b46aca1c64b9a46b8b1fa74e10208e19c6c6beffc7e5c4e50a74040d1cca8032088478be2168f80f17e2be4d1845d4d69306
+ metadata.gz: 8fdeafa8bfcb2f50695faaca28618f572d1610bfb15acd63b9766cd46c0449171384dc4eaef459b1b68cfc50700acb321fd4bbfde20664311f899bcdc1db9d58
+ data.tar.gz: 59ba684ad2a9358904cc6a7d6fcce46f5ae3e48d1cded624f77145b552a7693914552230fb7e210832b681c7cbb47b2a413ddd40eb8fba560a734e385b4b8a2b
@@ -37,4 +37,10 @@ Metrics/BlockLength:
  Style/Documentation:
  Exclude:
  - 'examples/**/*'
- - 'spec/**/*'
+ - 'spec/**/*'
+
+ Metrics/ParameterLists:
+ CountKeywordArgs: false
+
+ RSpec/MessageSpies:
+ Enabled: false
data/README.md CHANGED
@@ -6,11 +6,11 @@ Background jobs for Ruby using Google Cloud Tasks.
 
  Cloudtasker provides an easy to manage interface to Google Cloud Tasks for background job processing. Workers can be defined programmatically using the Cloudtasker DSL and enqueued for processing using a simple to use API.
 
- Cloudtasker is particularly suited for serverless applications only responding to HTTP requests and where running a dedicated job processing is not an option (e.g. deploy via [Cloud Run](https://cloud.google.com/run)). All jobs enqueued in Cloud Tasks via Cloudtasker eventually get processed by your application via HTTP requests.
+ Cloudtasker is particularly suited for serverless applications only responding to HTTP requests and where running a dedicated job processing server is not an option (e.g. deploy via [Cloud Run](https://cloud.google.com/run)). All jobs enqueued in Cloud Tasks via Cloudtasker eventually get processed by your application via HTTP requests.
 
  Cloudtasker also provides optional modules for running [cron jobs](docs/CRON_JOBS.md), [batch jobs](docs/BATCH_JOBS.md) and [unique jobs](docs/UNIQUE_JOBS.md).
 
- A local processing server is also available in development. This local server processes jobs in lieu of Cloud Tasks and allows you to work offline.
+ A local processing server is also available for development. This local server processes jobs in lieu of Cloud Tasks and allows you to work offline.
 
  ## Summary
 
@@ -34,7 +34,11 @@ A local processing server is also available in development. This local server pr
  1. [HTTP Error codes](#http-error-codes)
  2. [Error callbacks](#error-callbacks)
  3. [Max retries](#max-retries)
- 10. [Best practices building workers](#best-practices-building-workers)
+ 10. [Testing](#testing)
+ 1. [Test helper setup](#test-helper-setup)
+ 2. [In-memory queues](#in-memory-queues)
+ 3. [Unit tests](#unit-tests)
+ 11. [Best practices building workers](#best-practices-building-workers)
 
  ## Installation
 
@@ -48,7 +52,7 @@ And then execute:
 
      $ bundle
 
- Or install it yourself as:
+ Or install it yourself with:
 
      $ gem install cloudtasker
 
@@ -218,7 +222,7 @@ Cloudtasker.configure do |config|
 
  #
  # Specify how many retries are allowed on jobs. This number of retries excludes any
- # connectivity error that would be due to the application being down or unreachable.
+ # connectivity error due to the application being down or unreachable.
  #
  # Default: 25
  #
@@ -289,7 +293,7 @@ MyWorker.schedule(args: [arg1, arg2], time_at: Time.parse('2025-01-01 00:50:00Z'
  MyWorker.schedule(args: [arg1, arg2], time_in: 5 * 60, queue: 'critical')
  ```
 
- Cloudtasker also provides a helper for re-enqueuing jobs. Re-enqueued jobs keep the same worker id. Some middlewares may rely on this to track the fact that that a job didn't actually complete (e.g. Cloustasker batch). This is optional and you can always fallback to using exception management (raise an error) to retry/re-enqueue jobs.
+ Cloudtasker also provides a helper for re-enqueuing jobs. Re-enqueued jobs keep the same job ID. Some middlewares may rely on this to track the fact that a job didn't actually complete (e.g. Cloudtasker batch). This is optional and you can always fall back to using exception management (raise an error) to retry/re-enqueue jobs.
 
 E.g.
 ```ruby
@@ -467,14 +471,14 @@ end
 
  Will generate the following log with context `{:worker=> ..., :job_id=> ..., :job_meta=> ...}`
  ```log
- [Cloudtasker][d76040a1-367e-4e3b-854e-e05a74d5f773] Job run with foo. This is working!: {:worker=>"DummyWorker", :job_id=>"d76040a1-367e-4e3b-854e-e05a74d5f773", :job_meta=>{}}
+ [Cloudtasker][d76040a1-367e-4e3b-854e-e05a74d5f773] Job run with foo. This is working!: {:worker=>"DummyWorker", :job_id=>"d76040a1-367e-4e3b-854e-e05a74d5f773", :job_meta=>{}, :task_id=>"4e755d3f-6de0-426c-b4ac-51edd445c045"}
  ```
 
  The way contextual information is displayed depends on the logger itself. For example with [semantic_logger](http://rocketjob.github.io/semantic_logger) contextual information might not appear in the log message but show up as payload data on the log entry itself (e.g. using the fluentd adapter).
 
  Contextual information can be customised globally and locally using a log context_processor. By default the `Cloudtasker::WorkerLogger` is configured the following way:
  ```ruby
- Cloudtasker::WorkerLogger.log_context_processor = ->(worker) { worker.to_h.slice(:worker, :job_id, :job_meta) }
+ Cloudtasker::WorkerLogger.log_context_processor = ->(worker) { worker.to_h.slice(:worker, :job_id, :job_meta, :job_queue, :task_id) }
  ```
 
  You can decide to add a global identifier for your worker logs using the following:
@@ -482,7 +486,7 @@ You can decide to add a global identifier for your worker logs using the followi
  # config/initializers/cloudtasker.rb
 
  Cloudtasker::WorkerLogger.log_context_processor = lambda { |worker|
-   worker.to_h.slice(:worker, :job_id, :job_meta).merge(app: 'my-app')
+   worker.to_h.slice(:worker, :job_id, :job_meta, :job_queue, :task_id).merge(app: 'my-app')
  }
  ```
 
@@ -503,9 +507,24 @@ end
 
  See the [Cloudtasker::Worker class](lib/cloudtasker/worker.rb) for more information on attributes available to be logged in your `log_context_processor` proc.
 
+ ### Searching logs: Job ID vs Task ID
+ **Note**: The `task_id` field is available in logs starting with `0.10.rc6`
+
+ Job instances are assigned two different IDs for tracking and logging purposes: `job_id` and `task_id`. These IDs are found in each log entry to facilitate search.
+
+ | Field | Definition |
+ |------|-------------|
+ | `job_id` | This ID is generated by Cloudtasker. It identifies the job along its entire lifecycle. It is persistent across retries and reschedules. |
+ | `task_id` | This ID is generated by Google Cloud Tasks. It identifies a job instance on the Google Cloud Tasks side. It is persistent across retries but NOT across reschedules. |
+
+ The Google Cloud Tasks UI (GCP console) lists all the tasks pending/retrying and their associated task ID (also called "Task name"). From there you can:
+ 1. Use a task ID to look up the logs of a specific job instance in Stackdriver Logging (or any other logging solution).
+ 2. From (1) you can retrieve the `job_id` attribute of the job.
+ 3. From (2) you can use the `job_id` to look up the job logs along its entire lifecycle.
+
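When logs are plain text, step (2) above - recovering the `job_id` from a log entry found via its `task_id` - amounts to a simple pattern match. A minimal self-contained sketch using the sample log line from the logging section (the `extract_ids` helper is illustrative, not part of Cloudtasker):

```ruby
# Sample log line in the format shown in the logging section above
LINE = '[Cloudtasker][d76040a1-367e-4e3b-854e-e05a74d5f773] Job run with foo. ' \
       'This is working!: {:worker=>"DummyWorker", :job_id=>"d76040a1-367e-4e3b-854e-e05a74d5f773", ' \
       ':job_meta=>{}, :task_id=>"4e755d3f-6de0-426c-b4ac-51edd445c045"}'

# Pull both IDs out of a log line using the inspected-hash context format
def extract_ids(line)
  {
    job_id: line[/:job_id=>"([^"]+)"/, 1],
    task_id: line[/:task_id=>"([^"]+)"/, 1]
  }
end

extract_ids(LINE)
```

Structured logging backends (e.g. Stackdriver with fluentd) expose these fields directly as labels, in which case no parsing is needed.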
  ## Error Handling
 
- Jobs failing will automatically return an HTTP error to Cloud Task and trigger a retry at a later time. The number of Cloud Task retries Cloud Task will depend on the configuration of your queue in Cloud Tasks.
+ Job failures return an HTTP error to Cloud Tasks and trigger a retry at a later time. The number of retries depends on the configuration of your queue in Cloud Tasks.
 
  ### HTTP Error codes
 
@@ -513,6 +532,7 @@ Jobs failing will automatically return the following HTTP error code to Cloud Ta
 
  | Code | Description |
  |------|-------------|
+ | 204 | The job was processed successfully |
  | 205 | The job is dead and has been removed from the queue |
  | 404 | The job has specified an incorrect worker class |
  | 422 | An error happened during the execution of the worker (`perform` method) |
@@ -551,7 +571,7 @@ By default jobs are retried 25 times - using an exponential backoff - before bei
 
  Note that the number of retries set on your Cloud Task queue should be many times higher than the number of retries configured in Cloudtasker because Cloud Task also includes failures to connect to your application. Ideally set the number of retries to `unlimited` in Cloud Tasks.
 
- **Note**: The `X-CloudTasks-TaskExecutionCount` header sent by Google Cloud Tasks and providing the number of retries outside of `HTTP 503` (instance not reachable) is currently bugged and remains at `0` all the time. Starting with `0.10.rc3` Cloudtasker uses the `X-CloudTasks-TaskRetryCount` header to detect the number of retries. This header includes `HTTP 503` errors which means that if your application is down at some point, jobs will fail and these failures will be counted toward the maximum number of retries. A [bug report](https://issuetracker.google.com/issues/154532072) has been raised with GCP to address this issue. Once fixed we will revert to using `X-CloudTasks-TaskExecutionCount` to avoid counting `HTTP 503` as job failures.
+ **Note**: The `X-CloudTasks-TaskExecutionCount` header sent by Google Cloud Tasks and providing the number of retries outside of `HTTP 503` (instance not reachable) is currently bugged and remains at `0` all the time. Starting with `v0.10.rc3` Cloudtasker uses the `X-CloudTasks-TaskRetryCount` header to detect the number of retries. This header includes `HTTP 503` errors which means that if your application is down at some point, jobs will fail and these failures will be counted toward the maximum number of retries. A [bug report](https://issuetracker.google.com/issues/154532072) has been raised with GCP to address this issue. Once fixed we will revert to using `X-CloudTasks-TaskExecutionCount` to avoid counting `HTTP 503` as job failures.
 
  E.g. Set max number of retries globally via the cloudtasker initializer.
  ```ruby
@@ -586,6 +606,97 @@ class SomeErrorWorker
  end
  ```
 
+ ## Testing
+ Cloudtasker provides several options to test your workers.
+
+ ### Test helper setup
+ Require `cloudtasker/testing` in your `rails_helper.rb` (RSpec Rails), `spec_helper.rb` (RSpec) or test unit helper file, then enable one of the three modes:
+
+ ```ruby
+ require 'cloudtasker/testing'
+
+ # Mode 1 (default): Push jobs to Google Cloud Tasks (env != development) or Redis (env == development)
+ Cloudtasker::Testing.enable!
+
+ # Mode 2: Push jobs to an in-memory queue. Jobs will not be processed until you call
+ # Cloudtasker::Worker.drain_all (process all jobs) or MyWorker.drain (process jobs for a specific worker)
+ Cloudtasker::Testing.fake!
+
+ # Mode 3: Push jobs to an in-memory queue. Jobs will be processed immediately.
+ Cloudtasker::Testing.inline!
+ ```
+
+ You can query the current testing mode with:
+ ```ruby
+ Cloudtasker::Testing.enabled?
+ Cloudtasker::Testing.fake?
+ Cloudtasker::Testing.inline?
+ ```
+
+ Each testing mode accepts a block argument to temporarily switch to it:
+ ```ruby
+ # Enable fake mode for all tests
+ Cloudtasker::Testing.fake!
+
+ # Enable inline mode temporarily for a given test
+ Cloudtasker::Testing.inline! do
+   MyWorker.perform_async(1, 2)
+ end
+ ```
+
+ Note that extension middlewares - e.g. unique job, batch job etc. - run in test mode. You can disable middlewares in your tests by adding the following to your test helper:
+ ```ruby
+ # Remove all middlewares
+ Cloudtasker.configure do |c|
+   c.client_middleware.clear
+   c.server_middleware.clear
+ end
+
+ # Remove all unique job middlewares
+ Cloudtasker.configure do |c|
+   c.client_middleware.remove(Cloudtasker::UniqueJob::Middleware::Client)
+   c.server_middleware.remove(Cloudtasker::UniqueJob::Middleware::Server)
+ end
+ ```
+
+ ### In-memory queues
+ The `fake!` or `inline!` modes use in-memory queues, which can be queried and controlled using the following methods:
+
+ ```ruby
+ # Perform all jobs in queue
+ Cloudtasker::Worker.drain_all
+
+ # Remove all jobs in queue
+ Cloudtasker::Worker.clear_all
+
+ # Perform all jobs in queue for a specific worker type
+ MyWorker.drain
+
+ # Return the list of jobs in queue for a specific worker type
+ MyWorker.jobs
+ ```
+
+ ### Unit tests
+ Below are examples of RSpec tests. It is assumed that `Cloudtasker::Testing.fake!` has been set in the test helper.
+
+ **Example 1**: Testing that a job is scheduled
+ ```ruby
+ describe 'worker scheduling' do
+   subject(:enqueue_job) { MyWorker.perform_async(1, 2) }
+
+   it { expect { enqueue_job }.to change(MyWorker.jobs, :size).by(1) }
+ end
+ ```
+
+ **Example 2**: Testing job execution logic
+ ```ruby
+ describe 'worker calls api' do
+   subject { Cloudtasker::Testing.inline! { MyApiWorker.perform_async(1, 2) } }
+
+   before { expect(MyApi).to receive(:fetch).and_return([]) }
+   it { is_expected.to be_truthy }
+ end
+ ```
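In `fake!` mode the two approaches can also be combined in one spec: enqueue, assert the queue content, then drain. A sketch along the same lines as the examples above (`MyWorker` is assumed to be an application worker; only the documented helpers - `perform_async`, `jobs`, `drain`, `clear_all` - are used):

```ruby
describe 'worker run path' do
  # Start from an empty in-memory queue
  before { Cloudtasker::Worker.clear_all }

  it 'processes enqueued jobs when drained' do
    MyWorker.perform_async(1, 2)
    expect(MyWorker.jobs.size).to eq(1)

    # Draining runs the jobs and empties the queue
    MyWorker.drain
    expect(MyWorker.jobs).to be_empty
  end
end
```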
 
  ## Best practices building workers
 
@@ -51,19 +51,28 @@ module Cloudtasker
  end
 
  # Return content parsed as JSON and add job retries count
- JSON.parse(content).merge(job_retries: job_retries)
+ JSON.parse(content).merge(job_retries: job_retries, task_id: task_id)
  end
  end
 
  #
  # Extract the number of times this task failed at runtime.
  #
- # @return [Integer] The number of failures
+ # @return [Integer] The number of failures.
  #
  def job_retries
  request.headers[Cloudtasker::Config::RETRY_HEADER].to_i
  end
 
+ #
+ # Return the Google Cloud Task ID from headers.
+ #
+ # @return [String] The task ID.
+ #
+ def task_id
+ request.headers[Cloudtasker::Config::TASK_ID_HEADER]
+ end
+
  #
  # Authenticate incoming requests using a bearer token
  #
@@ -15,8 +15,6 @@ Gem::Specification.new do |spec|
  spec.homepage = 'https://github.com/keypup-io/cloudtasker'
  spec.license = 'MIT'
 
- # spec.metadata["allowed_push_host"] = "TODO: Set to 'http://mygemserver.com'"
-
  spec.metadata['homepage_uri'] = spec.homepage
  spec.metadata['source_code_uri'] = 'https://github.com/keypup-io/cloudtasker'
  spec.metadata['changelog_uri'] = 'https://github.com/keypup-io/cloudtasker/master/tree/CHANGELOG.md'
@@ -31,10 +29,12 @@ Gem::Specification.new do |spec|
  spec.require_paths = ['lib']
 
  spec.add_dependency 'activesupport'
+ spec.add_dependency 'connection_pool'
  spec.add_dependency 'fugit'
- spec.add_dependency 'google-cloud-tasks'
+ spec.add_dependency 'google-cloud-tasks', '~> 1.0'
  spec.add_dependency 'jwt'
  spec.add_dependency 'redis'
+ spec.add_dependency 'retriable'
 
  spec.add_development_dependency 'appraisal'
  spec.add_development_dependency 'bundler', '~> 2.0'
@@ -81,6 +81,68 @@ Below is the list of available conflict strategies can be specified through the
  | `raise` | All locks | A `Cloudtasker::UniqueJob::LockError` will be raised when a conflict occurs |
  | `reschedule` | `while_executing` | The job will be rescheduled 5 seconds later when a conflict occurs |
 
+ ## Lock Time To Live (TTL) & deadlocks
+ **Note**: Lock TTL was introduced in `v0.10.rc6`
+
+ To make jobs unique, Cloudtasker sets a lock key - a hash of class name + job arguments - in Redis. Certain crash situations may lead to lock keys not being cleaned up when jobs complete - e.g. a Redis crash with rollback from the last known state on disk. Situations like these may lead to a unique job deadlock: jobs with the same class and arguments stop being processed because they're unable to acquire a lock that will never be cleaned up.
+
+ In order to prevent deadlocks Cloudtasker configures lock keys to automatically expire in Redis after `job schedule time + lock_ttl (default: 10 minutes)`. This forced expiration ensures that deadlocks eventually get cleaned up shortly after the expected run time of a job.
+
+ The `lock_ttl (default: 10 minutes)` duration represents the expected max duration of the job. The default 10 minutes value was chosen because it's twice the default request timeout value in Cloud Run. This usually leaves enough room for queue lag (5 minutes) + job processing (5 minutes).
+
+ Queue lag is certainly the most unpredictable factor here. Job processing time is less of a factor. Jobs running for more than 5 minutes should be split into sub-jobs to limit invocation time over HTTP anyway. Cloudtasker [batch jobs](BATCH_JOBS.md) can help split big jobs into sub-jobs in an atomic way.
+
+ The default lock key expiration of `job schedule time + 10 minutes` may look aggressive but it is a better choice than having real-time jobs stuck for hours after a crash recovery.
+
+ We **strongly recommend** adapting the `lock_ttl` option either globally or for each worker based on expected queue lag and job duration.
+
+ **Example 1**: Global configuration
+ ```ruby
+ # config/initializers/cloudtasker.rb
+
+ # General Cloudtasker configuration
+ Cloudtasker.configure do |config|
+   # ...
+ end
+
+ # Unique job extension configuration
+ Cloudtasker::UniqueJob.configure do |config|
+   config.lock_ttl = 3 * 60 # 3 minutes
+ end
+ ```
+
+ **Example 2**: Worker-level - fast
+ ```ruby
+ # app/workers/realtime_worker_on_fast_queue.rb
+
+ class RealtimeWorkerOnFastQueue
+   include Cloudtasker::Worker
+
+   # Ensure lock is removed 30 seconds after schedule time
+   cloudtasker_options lock: :until_executing, lock_ttl: 30
+
+   def perform(arg1, arg2)
+     # ...
+   end
+ end
+ ```
+
+ **Example 3**: Worker-level - slow
+ ```ruby
+ # app/workers/non_critical_worker_on_slow_queue.rb
+
+ class NonCriticalWorkerOnSlowQueue
+   include Cloudtasker::Worker
+
+   # Ensure lock is removed 24 hours after schedule time
+   cloudtasker_options lock: :until_executing, lock_ttl: 3600 * 24
+
+   def perform(arg1, arg2)
+     # ...
+   end
+ end
+ ```
+
  ## Configuring unique arguments
 
  By default Cloudtasker considers all job arguments to evaluate the uniqueness of a job. This behaviour is configurable per worker by defining a `unique_args` method on the worker itself returning the list of args defining uniqueness.
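As a hedged sketch of the per-worker override described above (the worker name and argument layout are hypothetical):

```ruby
# Hypothetical worker: only the first argument (a user ID) defines
# uniqueness, so jobs for the same user are considered duplicates
# even if the remaining arguments differ.
class UserSyncWorker
  include Cloudtasker::Worker

  cloudtasker_options lock: :until_executing

  # Return the subset of job arguments used to evaluate uniqueness
  def unique_args(args)
    [args[0]]
  end

  def perform(user_id, options = {})
    # ...
  end
end
```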
@@ -1,5 +1,8 @@
  # frozen_string_literal: true
 
+ require 'google/cloud/tasks'
+ require 'retriable'
+
  module Cloudtasker
  module Backend
  # Manage tasks pushed to GCP Cloud Task
@@ -113,9 +116,10 @@ module Cloudtasker
  # @return [Cloudtasker::Backend::GoogleCloudTask, nil] The retrieved task.
  #
  def self.find(id)
- resp = client.get_task(id)
+ resp = with_gax_retries { client.get_task(id) }
  resp ? new(resp) : nil
- rescue Google::Gax::RetryError
+ rescue Google::Gax::RetryError, Google::Gax::NotFoundError, GRPC::NotFound
+ # The ID does not exist
  nil
  end
 
@@ -133,10 +137,8 @@ module Cloudtasker
  relative_queue = payload.delete(:queue)
 
  # Create task
- resp = client.create_task(queue_path(relative_queue), payload)
+ resp = with_gax_retries { client.create_task(queue_path(relative_queue), payload) }
  resp ? new(resp) : nil
- rescue Google::Gax::RetryError
- nil
  end
 
@@ -145,11 +147,21 @@ module Cloudtasker
  # @param [String] id The id of the task.
  #
  def self.delete(id)
- client.delete_task(id)
- rescue Google::Gax::NotFoundError, Google::Gax::RetryError, GRPC::NotFound, Google::Gax::PermissionDeniedError
+ with_gax_retries { client.delete_task(id) }
+ rescue Google::Gax::RetryError, Google::Gax::NotFoundError, GRPC::NotFound, Google::Gax::PermissionDeniedError
+ # The ID does not exist
  nil
  end
 
+ #
+ # Helper method encapsulating the retry strategy for GAX calls
+ #
+ def self.with_gax_retries
+ Retriable.retriable(on: [Google::Gax::UnavailableError], tries: 3) do
+ yield
+ end
+ end
+
  #
  # Build a new instance of the class.
  #
@@ -1,7 +1,5 @@
  # frozen_string_literal: true
 
- require 'cloudtasker/redis_client'
-
  module Cloudtasker
  module Backend
  # Manage local tasks pushed to memory.
@@ -10,6 +8,15 @@ module Cloudtasker
  attr_accessor :job_retries
  attr_reader :id, :http_request, :schedule_time, :queue
 
+ #
+ # Return true if we are in test inline execution mode.
+ #
+ # @return [Boolean] True if inline mode enabled.
+ #
+ def self.inline_mode?
+ defined?(Cloudtasker::Testing) && Cloudtasker::Testing.inline?
+ end
+
  #
  # Return the task queue. A worker class name
  #
@@ -59,7 +66,7 @@ module Cloudtasker
  queue << task
 
  # Execute task immediately if in testing and inline mode enabled
- task.execute if defined?(Cloudtasker::Testing) && Cloudtasker::Testing.inline?
+ task.execute if inline_mode?
 
  task
  end
@@ -153,13 +160,15 @@ module Cloudtasker
  #
  def execute
  # Execute worker
- resp = WorkerHandler.with_worker_handling(payload, &:execute)
+ worker_payload = payload.merge(job_retries: job_retries, task_id: id)
+ resp = WorkerHandler.with_worker_handling(worker_payload, &:execute)
 
  # Delete task
  self.class.delete(id)
  resp
- rescue StandardError
+ rescue StandardError => e
  self.job_retries += 1
+ raise(e) if self.class.inline_mode?
  end
 
  #
@@ -247,7 +247,8 @@ module Cloudtasker
  uri = URI(http_request[:url])
  req = Net::HTTP::Post.new(uri.path, http_request[:headers])
 
- # Add retries header
+ # Add task headers
+ req[Cloudtasker::Config::TASK_ID_HEADER] = id
  req[Cloudtasker::Config::RETRY_HEADER] = retries
 
  # Set job payload
@@ -5,7 +5,7 @@ module Cloudtasker
  module Middleware
  # Server middleware, invoked when jobs are executed
  class Server
- def call(worker)
+ def call(worker, **_kwargs)
  Job.for(worker).execute { yield }
  end
  end
@@ -25,6 +25,9 @@ module Cloudtasker
  #
  RETRY_HEADER = 'X-CloudTasks-TaskRetryCount'
 
+ # Cloud Task ID header
+ TASK_ID_HEADER = 'X-CloudTasks-TaskName'
+
  # Content-Transfer-Encoding header in Cloud Task responses
  ENCODING_HEADER = 'Content-Transfer-Encoding'
 
@@ -4,15 +4,10 @@ require 'fugit'
 
  module Cloudtasker
  module Cron
- # TODO: handle deletion of cron jobs
- #
  # Manage cron jobs
  class Job
  attr_reader :worker
 
- # Key Namespace used for object saved under this class
- SUB_NAMESPACE = 'job'
-
  #
  # Build a new instance of the class
  #
@@ -5,7 +5,7 @@ module Cloudtasker
  module Middleware
  # Server middleware, invoked when jobs are executed
  class Server
- def call(worker)
+ def call(worker, **_kwargs)
  Job.new(worker).execute { yield }
  end
  end
@@ -9,9 +9,6 @@ module Cloudtasker
  class Schedule
  attr_accessor :id, :cron, :worker, :task_id, :job_id, :queue, :args
 
- # Key Namespace used for object saved under this class
- SUB_NAMESPACE = 'schedule'
-
  #
  # Return the redis client.
  #
@@ -1,15 +1,28 @@
  # frozen_string_literal: true
 
  require 'redis'
+ require 'connection_pool'
 
  module Cloudtasker
  # A wrapper with helper methods for redis
  class RedisClient
  # Suffix added to cache keys when locking them
  LOCK_KEY_PREFIX = 'cloudtasker/lock'
+ LOCK_DURATION = 2 # seconds
+ LOCK_WAIT_DURATION = 0.03 # seconds
+
+ # Default pool size used for Redis
+ DEFAULT_POOL_SIZE = ENV.fetch('RAILS_MAX_THREADS') { 25 }
+ DEFAULT_POOL_TIMEOUT = 5
 
  def self.client
- @client ||= Redis.new(Cloudtasker.config.redis || {})
+ @client ||= begin
+ pool_size = Cloudtasker.config.redis&.dig(:pool_size) || DEFAULT_POOL_SIZE
+ pool_timeout = Cloudtasker.config.redis&.dig(:pool_timeout) || DEFAULT_POOL_TIMEOUT
+ ConnectionPool.new(size: pool_size, timeout: pool_timeout) do
+ Redis.new(Cloudtasker.config.redis || {})
+ end
+ end
  end
 
  #
@@ -29,7 +42,7 @@ module Cloudtasker
  # @return [Hash, Array] The content of the cache key, parsed as JSON.
  #
  def fetch(key)
- return nil unless (val = client.get(key.to_s))
+ return nil unless (val = get(key.to_s))
 
  JSON.parse(val, symbolize_names: true)
  rescue JSON::ParserError
@@ -45,12 +58,15 @@ module Cloudtasker
  # @return [String] Redis response code.
  #
  def write(key, content)
- client.set(key.to_s, content.to_json)
+ set(key.to_s, content.to_json)
  end
 
  #
  # Acquire a lock on a cache entry.
  #
+ # Locks are enforced to be short-lived (2s).
+ # The yielded block should limit its logic to short operations (e.g. redis get/set).
+ #
  # @example
  #   redis = RedisClient.new
  #   redis.with_lock('foo')
@@ -65,12 +81,14 @@ module Cloudtasker
 
  # Wait to acquire lock
  lock_key = [LOCK_KEY_PREFIX, cache_key].join('/')
- true until client.setnx(lock_key, true)
+ client.with do |conn|
+ sleep(LOCK_WAIT_DURATION) until conn.set(lock_key, true, nx: true, ex: LOCK_DURATION)
+ end
 
  # yield content
  yield
  ensure
- client.del(lock_key)
+ del(lock_key)
  end
 
  #
@@ -99,10 +117,12 @@ module Cloudtasker
  list = []
 
  # Scan and capture matching keys
- while cursor != 0
- scan = client.scan(cursor || 0, match: pattern)
- list += scan[1]
- cursor = scan[0].to_i
+ client.with do |conn|
+ while cursor != 0
+ scan = conn.scan(cursor || 0, match: pattern)
+ list += scan[1]
+ cursor = scan[0].to_i
+ end
  end
 
  list
@@ -118,8 +138,8 @@ module Cloudtasker
  # @return [Any] The method return value
  #
  def method_missing(name, *args, &block)
- if client.respond_to?(name)
- client.send(name, *args, &block)
+ if Redis.method_defined?(name)
+ client.with { |c| c.send(name, *args, &block) }
  else
  super
  end
@@ -134,7 +154,7 @@ module Cloudtasker
  # @return [Boolean] Return true if the class respond to this method.
  #
  def respond_to_missing?(name, include_private = false)
- client.respond_to?(name) || super
+ Redis.method_defined?(name) || super
  end
  end
  end
@@ -3,3 +3,30 @@
  require_relative 'unique_job/middleware'
 
  Cloudtasker::UniqueJob::Middleware.configure
+
+ module Cloudtasker
+ # UniqueJob configurator
+ module UniqueJob
+ # The maximum duration a lock can remain in place
+ # after schedule time.
+ DEFAULT_LOCK_TTL = 10 * 60 # 10 minutes
+
+ class << self
+ attr_writer :lock_ttl
+
+ # Configure the middleware
+ def configure
+ yield(self)
+ end
+
+ #
+ # Return the max TTL for locks
+ #
+ # @return [Integer] The lock TTL.
+ #
+ def lock_ttl
+ @lock_ttl || DEFAULT_LOCK_TTL
+ end
+ end
+ end
+ end
@@ -5,21 +5,19 @@ module Cloudtasker
  # Wrapper class for Cloudtasker::Worker delegating to lock
  # and conflict strategies
  class Job
- attr_reader :worker
+ attr_reader :worker, :call_opts
 
  # The default lock strategy to use. Defaults to "no lock".
  DEFAULT_LOCK = UniqueJob::Lock::NoOp
 
- # Key Namespace used for object saved under this class
- SUB_NAMESPACE = 'job'
-
  #
  # Build a new instance of the class.
  #
  # @param [Cloudtasker::Worker] worker The worker at hand
  #
- def initialize(worker)
+ def initialize(worker, **kwargs)
  @worker = worker
+ @call_opts = kwargs
  end
 
  #
@@ -31,6 +29,43 @@ module Cloudtasker
  worker.class.cloudtasker_options_hash
  end
 
+ #
+ # Return the Time To Live (TTL) that should be set in Redis for
+ # the lock key. Having a TTL on lock keys ensures that jobs
+ # do not end up stuck due to a dead lock situation.
+ #
+ # The TTL is calculated using schedule time + expected
+ # max job duration.
+ #
+ # The expected max job duration is set to 10 minutes by default.
+ # This value was chosen because it's twice the default request timeout
+ # value in Cloud Run. This leaves enough room for queue lag (5 minutes)
+ # + job processing (5 minutes).
+ #
+ # Queue lag is certainly the most unpredictable factor here.
+ # Job processing time is less of a factor. Jobs running for more than 5 minutes
+ # should be split into sub-jobs to limit invocation time over HTTP. Cloudtasker batch
+ # jobs can help achieve that if you need to make one big job split into sub-jobs "atomic".
+ #
+ # The default lock key expiration of "time_at + 10 minutes" may look aggressive but it
+ # is still a better choice than potentially having real-time jobs stuck for X hours.
+ #
+ # The expected max job duration can be configured via the `lock_ttl`
+ # option on the job itself.
+ #
+ # @return [Integer] The TTL in seconds
+ #
+ def lock_ttl
+ now = Time.now.to_i
+
+ # Get scheduled at and lock duration
+ scheduled_at = [call_opts[:time_at].to_i, now].compact.max
+ lock_duration = (options[:lock_ttl] || Cloudtasker::UniqueJob.lock_ttl).to_i
+
+ # Return TTL
+ scheduled_at + lock_duration - now
+ end
+
  #
  # Return the instantiated lock.
  #
@@ -121,7 +156,7 @@ module Cloudtasker
  raise(LockError, locked_id) if locked_id && locked_id != id
 
  # Take job lock if the lock is currently free
- redis.set(unique_gid, id) unless locked_id
+ redis.set(unique_gid, id, ex: lock_ttl) unless locked_id
  end
  end
 
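The TTL arithmetic described in the comments above can be sketched as a standalone function. The names `lock_ttl_for`, `DEFAULT_LOCK_TTL`, and `lock_ttl_option` below are illustrative, not Cloudtasker API:

```ruby
# Illustrative sketch of the lock TTL calculation, assuming the
# 10-minute (600s) default max job duration mentioned above.
DEFAULT_LOCK_TTL = 600

# time_at: optional Unix timestamp at which the job should run.
# lock_ttl_option: optional per-job override (the `lock_ttl` option).
def lock_ttl_for(time_at: nil, lock_ttl_option: nil)
  now = Time.now.to_i

  # Jobs scheduled in the past (or not scheduled at all) count as running now
  scheduled_at = [time_at.to_i, now].max

  # The per-job option takes precedence over the default duration
  lock_duration = (lock_ttl_option || DEFAULT_LOCK_TTL).to_i

  # TTL covers the wait until schedule time plus the job duration itself
  scheduled_at + lock_duration - now
end

# Immediate job: the lock expires after the max job duration
lock_ttl_for # => 600

# Job delayed by 5 minutes: the TTL also covers the 300s wait (~900s)
lock_ttl_for(time_at: Time.now.to_i + 300)
```

Subtracting `now` at the end converts the absolute expiry time into the relative TTL in seconds that Redis expects for key expiration.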
@@ -5,7 +5,7 @@ module Cloudtasker
  module Middleware
  # Client middleware, invoked when jobs are scheduled
  class Client
- def call(worker)
+ def call(worker, **_kwargs)
  Job.new(worker).lock_instance.schedule { yield }
  end
  end
@@ -5,7 +5,7 @@ module Cloudtasker
  module Middleware
  # Server middleware, invoked when jobs are executed
  class Server
- def call(worker)
+ def call(worker, **_kwargs)
  Job.new(worker).lock_instance.execute { yield }
  end
  end
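Client and server middlewares now receive keyword arguments (such as `time_at`) in addition to the worker, so custom middlewares should accept them. A minimal sketch of a compatible middleware — the `AuditMiddleware` class and its log are illustrative, not part of Cloudtasker:

```ruby
# Hypothetical middleware compatible with the new call signature.
# Extra call options arrive as keyword arguments; unused ones can be
# soaked up with **kwargs (or **_kwargs, as in the built-in middlewares).
class AuditMiddleware
  def initialize(log)
    @log = log
  end

  def call(worker, **kwargs)
    # Record which worker goes through the chain and its schedule time
    @log << [worker, kwargs[:time_at]]
    yield
  end
end

log = []
middleware = AuditMiddleware.new(log)

# Invoked the way the middleware chain now calls middlewares
result = middleware.call('MyWorker', time_at: 1_600_000_000) { :scheduled }

result # => :scheduled
log    # => [["MyWorker", 1600000000]]
```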
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Cloudtasker
- VERSION = '0.10.rc3'
+ VERSION = '0.10.rc8'
  end
@@ -8,7 +8,7 @@ module Cloudtasker
  base.extend(ClassMethods)
  base.attr_writer :job_queue
  base.attr_accessor :job_args, :job_id, :job_meta, :job_reenqueued, :job_retries,
- :perform_started_at, :perform_ended_at
+ :perform_started_at, :perform_ended_at, :task_id
  end
 
  #
@@ -47,7 +47,7 @@ module Cloudtasker
  return nil unless worker_klass.include?(self)
 
  # Return instantiated worker
- worker_klass.new(payload.slice(:job_queue, :job_args, :job_id, :job_meta, :job_retries))
+ worker_klass.new(payload.slice(:job_queue, :job_args, :job_id, :job_meta, :job_retries, :task_id))
  rescue NameError
  nil
  end
@@ -140,12 +140,13 @@ module Cloudtasker
  # @param [Array<any>] job_args The list of perform args.
  # @param [String] job_id A unique ID identifying this job.
  #
- def initialize(job_queue: nil, job_args: nil, job_id: nil, job_meta: {}, job_retries: 0)
+ def initialize(job_queue: nil, job_args: nil, job_id: nil, job_meta: {}, job_retries: 0, task_id: nil)
  @job_args = job_args || []
  @job_id = job_id || SecureRandom.uuid
  @job_meta = MetaStore.new(job_meta)
  @job_retries = job_retries || 0
  @job_queue = job_queue
+ @task_id = task_id
  end
 
  #
@@ -197,18 +198,36 @@ module Cloudtasker
  raise(e)
  end
 
+ #
+ # Return a unix timestamp specifying when to run the task.
+ #
+ # @param [Integer, nil] interval The time to wait.
+ # @param [Integer, nil] time_at The time at which the job should run.
+ #
+ # @return [Integer, nil] The Unix timestamp.
+ #
+ def schedule_time(interval: nil, time_at: nil)
+ return nil unless interval || time_at
+
+ # Generate the complete Unix timestamp
+ (time_at || Time.now).to_i + interval.to_i
+ end
+
  #
  # Enqueue a worker, with or without delay.
  #
  # @param [Integer] interval The delay in seconds.
- #
  # @param [Time, Integer] interval The time at which the job should run
  #
  # @return [Cloudtasker::CloudTask] The Google Task response
  #
- def schedule(interval: nil, time_at: nil)
- Cloudtasker.config.client_middleware.invoke(self) do
- WorkerHandler.new(self).schedule(interval: interval, time_at: time_at)
+ def schedule(**args)
+ # Evaluate when to schedule the job
+ time_at = schedule_time(args)
+
+ # Schedule job through client middlewares
+ Cloudtasker.config.client_middleware.invoke(self, time_at: time_at) do
+ WorkerHandler.new(self).schedule(time_at: time_at)
  end
  end
 
@@ -250,7 +269,8 @@ module Cloudtasker
  job_args: job_args,
  job_meta: job_meta.to_h,
  job_retries: job_retries,
- job_queue: job_queue
+ job_queue: job_queue,
+ task_id: task_id
  }
  end
 
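The `schedule_time` helper added above is pure arithmetic and can be exercised in isolation. A standalone copy behaves as follows:

```ruby
# Standalone copy of the schedule_time helper shown above.
def schedule_time(interval: nil, time_at: nil)
  return nil unless interval || time_at

  # time_at defaults to now; interval (in seconds) is added on top
  (time_at || Time.now).to_i + interval.to_i
end

# No option given: no schedule time, the job runs immediately
schedule_time # => nil

# Absolute schedule time, optionally offset by an interval
schedule_time(time_at: Time.at(1_600_000_000))               # => 1600000000
schedule_time(time_at: Time.at(1_600_000_000), interval: 30) # => 1600000030

# Pure delay: roughly now + 60 seconds
schedule_time(interval: 60)
```

Computing the timestamp once in the worker, before the middleware chain runs, is what lets client middlewares (such as the unique-job lock above) see the final `time_at` value.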
@@ -56,11 +56,6 @@ module Cloudtasker
  with_worker_handling(input_payload, &:execute)
  end
 
- # TODO: do not delete redis payload if job has been re-enqueued
- # worker.job_reenqueued
- #
- # Idea: change with_worker_handling to with_worker_handling and build the worker
- # inside the with_worker_handling block.
  #
  # Local middleware used to retrieve the job arg payload from cache
  # if a arg payload reference is present.
@@ -210,35 +205,17 @@ module Cloudtasker
  }.merge(worker_args_payload)
  end
 
- #
- # Return a protobuf timestamp specifying how to wait
- # before running a task.
- #
- # @param [Integer, nil] interval The time to wait.
- # @param [Integer, nil] time_at The time at which the job should run.
- #
- # @return [Integer, nil] The Unix timestamp.
- #
- def schedule_time(interval: nil, time_at: nil)
- return nil unless interval || time_at
-
- # Generate the complete Unix timestamp
- (time_at || Time.now).to_i + interval.to_i
- end
-
  #
  # Schedule the task on GCP Cloud Task.
  #
- # @param [Integer, nil] interval How to wait before running the task.
+ # @param [Integer, nil] time_at A unix timestamp specifying when to run the job.
  # Leave to `nil` to run now.
  #
  # @return [Cloudtasker::CloudTask] The Google Task response
  #
- def schedule(interval: nil, time_at: nil)
+ def schedule(time_at: nil)
  # Generate task payload
- task = task_payload.merge(
- schedule_time: schedule_time(interval: interval, time_at: time_at)
- ).compact
+ task = task_payload.merge(schedule_time: time_at).compact
 
  # Create and return remote task
  CloudTask.create(task)
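Since the handler now receives a precomputed `time_at`, a `nil` value is simply dropped from the task payload by `compact` and the task runs immediately. A minimal illustration — the payload keys below are simplified, not the full Cloudtasker task payload:

```ruby
# Simplified stand-in for task_payload; the real payload has more fields
payload = { http_request: { url: '/cloudtasker/run' }, queue: 'default' }

# nil schedule_time is removed by compact, so the task runs immediately
task_now = payload.merge(schedule_time: nil).compact

# A Unix timestamp is kept and tells Cloud Tasks when to run the task
task_later = payload.merge(schedule_time: 1_600_000_000).compact

task_now.key?(:schedule_time) # => false
task_later[:schedule_time]    # => 1600000000
```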
@@ -11,7 +11,7 @@ module Cloudtasker
  end
 
  # Only log the job meta information by default (exclude arguments)
- DEFAULT_CONTEXT_PROCESSOR = ->(worker) { worker.to_h.slice(:worker, :job_id, :job_meta, :job_queue) }
+ DEFAULT_CONTEXT_PROCESSOR = ->(worker) { worker.to_h.slice(:worker, :job_id, :job_meta, :job_queue, :task_id) }
 
  #
  # Build a new instance of the class.
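The default context processor keeps only meta information, so `job_args` never reaches the logs. A sketch of that behavior, applied to an illustrative hash standing in for `worker.to_h` rather than an actual worker instance:

```ruby
# Same slice as the default context processor above, applied to a
# plain hash standing in for worker.to_h
context_processor = lambda do |worker_hash|
  worker_hash.slice(:worker, :job_id, :job_meta, :job_queue, :task_id)
end

# Illustrative worker hash; values are made up
worker_hash = {
  worker: 'MyWorker',
  job_id: 'abc-123',
  job_args: ['secret-value'], # excluded from the log context
  job_meta: {},
  job_queue: 'default',
  task_id: 'task-789'
}

ctx = context_processor.call(worker_hash)
ctx.key?(:job_args) # => false
```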
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: cloudtasker
  version: !ruby/object:Gem::Version
- version: 0.10.rc3
+ version: 0.10.rc8
  platform: ruby
  authors:
  - Arnaud Lachaume
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2020-04-21 00:00:00.000000000 Z
+ date: 2020-06-24 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: activesupport
@@ -25,7 +25,7 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
- name: fugit
+ name: connection_pool
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
@@ -39,7 +39,7 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
- name: google-cloud-tasks
+ name: fugit
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
@@ -52,6 +52,20 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
+ - !ruby/object:Gem::Dependency
+ name: google-cloud-tasks
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: '1.0'
  - !ruby/object:Gem::Dependency
  name: jwt
  requirement: !ruby/object:Gem::Requirement
@@ -80,6 +94,20 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
+ - !ruby/object:Gem::Dependency
+ name: retriable
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: '0'
  - !ruby/object:Gem::Dependency
  name: appraisal
  requirement: !ruby/object:Gem::Requirement