cloudtasker 0.7.0 → 0.8.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 92abbab630f4e60c7ae8a6b2e249a5e19d49ad9730788381790463cfda136aea
- data.tar.gz: c71fcf022efd34d3bba1e11341583b18581d1ae93bcd128b769b2e871e5e76f6
+ metadata.gz: e2a110c5354118a009e8620c887eb0c8bb9ff5c7aa63fefcf242ac9a649bd0e1
+ data.tar.gz: 58772d851865727f326bc30dfa3b94f170fe602eff264ce8661f2abd175033e9
  SHA512:
- metadata.gz: 4a0f52436da444c75530ceb49ac16ce4c76392383011774accb31418447b6d0b2677e1289cb5768be39bb51b2665b46f347e0496c98c9552bf8c1a82a324f6e1
- data.tar.gz: 5597abc655cef50010bbb38850cc26a5f440404aa8e3d0b645d49fe6bf34f4222d8402e1d61d3e99cf4a0093f7ccd66f561258a3b02595de7a7774cb21cda643
+ metadata.gz: af8b0e59a08d7e65bcc46f03135c17932b69816c2282bc651b1ae60887bde3ea0e52af051accb5606f1c2d042dd15dc224313a6aa7b8e6222aeef01806147464
+ data.tar.gz: 8b9d7e921aca496913a36357a43e8e3c0f77e059ac31439b3ccbe749afde946117a9faca70de11ee50ed6a9ff97b3439889b90bdb30db534ba4ac7690a0e3bfb
data/CHANGELOG.md CHANGED
@@ -1,5 +1,9 @@
  # Changelog
 
+ ## [v0.7.0](https://github.com/keypup-io/cloudtasker/tree/v0.7.0) (2019-11-25)
+
+ [Full Changelog](https://github.com/keypup-io/cloudtasker/compare/v0.6.0...v0.7.0)
+
  ## [v0.6.0](https://github.com/keypup-io/cloudtasker/tree/v0.6.0) (2019-11-25)
 
  [Full Changelog](https://github.com/keypup-io/cloudtasker/compare/v0.5.0...v0.6.0)
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
  remote: .
  specs:
- cloudtasker (0.7.0)
+ cloudtasker (0.8.0)
  activesupport
  fugit
  google-cloud-tasks
@@ -110,7 +110,7 @@ GEM
  googleauth (~> 0.9)
  grpc (~> 1.24)
  rly (~> 0.2.3)
- google-protobuf (3.10.1-universal-darwin)
+ google-protobuf (3.11.0-universal-darwin)
  googleapis-common-protos (1.3.9)
  google-protobuf (~> 3.0)
  googleapis-common-protos-types (~> 1.0)
data/README.md CHANGED
@@ -18,18 +18,21 @@ A local processing server is also available in development. This local server pr
  1. [Cloud Tasks authentication & permissions](#cloud-tasks-authentication--permissions)
  2. [Cloudtasker initializer](#cloudtasker-initializer)
  4. [Enqueuing jobs](#enqueuing-jobs)
- 5. [Extensions](#extensions)
- 6. [Working locally](#working-locally)
+ 5. [Managing worker queues](#managing-worker-queues)
+ 1. [Creating queues](#creating-queues)
+ 2. [Assigning queues to workers](#assigning-queues-to-workers)
+ 6. [Extensions](#extensions)
+ 7. [Working locally](#working-locally)
  1. [Option 1: Cloudtasker local server](#option-1-cloudtasker-local-server)
  2. [Option 2: Using ngrok](#option-2-using-ngrok)
- 7. [Logging](#logging)
+ 8. [Logging](#logging)
  1. [Configuring a logger](#configuring-a-logger)
  2. [Logging context](#logging-context)
- 8. [Error Handling](#error-handling)
+ 9. [Error Handling](#error-handling)
  1. [HTTP Error codes](#http-error-codes)
  2. [Error callbacks](#error-callbacks)
  3. [Max retries](#max-retries)
- 9. [Best practices building workers](#best-practices-building-workers)
+ 10. [Best practices building workers](#best-practices-building-workers)
 
  ## Installation
 
@@ -157,13 +160,31 @@ Cloudtasker.configure do |config|
  # config.secret = 'some-long-token'
 
  #
- # Specify the details of your Google Cloud Task queue.
+ # Specify the details of your Google Cloud Task location.
  #
  # This is not required in development when using the Cloudtasker local server.
  #
  config.gcp_location_id = 'us-central1' # defaults to 'us-east1'
  config.gcp_project_id = 'my-gcp-project'
- config.gcp_queue_id = 'my-queue'
+
+ #
+ # Specify the namespace for your Cloud Task queues.
+ #
+ # The gem assumes that at least a default queue named 'my-app-default'
+ # exists in Cloud Tasks. You can create this default queue using the
+ # gcloud SDK or via the `rake cloudtasker:setup_queue` task if you use Rails.
+ #
+ # Workers can be scheduled on different queues. The name of the queue
+ # in Cloud Tasks is always assumed to be prefixed with the prefix below.
+ #
+ # E.g.
+ # Setting `cloudtasker_options queue: 'critical'` on a worker means that
+ # the worker will be pushed to 'my-app-critical' in Cloud Tasks.
+ #
+ # Specific queues can be created in Cloud Tasks using the gcloud SDK or
+ # via the `rake cloudtasker:setup_queue name=<queue_name>` task.
+ #
+ config.gcp_queue_prefix = 'my-app'
 
  #
  # Specify the publicly accessible host for your application
@@ -215,7 +236,7 @@ Cloudtasker.configure do |config|
  end
  ```
 
- If your queue does not exist in Cloud Tasks you should [create it using the gcloud sdk](https://cloud.google.com/tasks/docs/creating-queues).
+ If the default queue `<gcp_queue_prefix>-default` does not exist in Cloud Tasks you should [create it using the gcloud sdk](https://cloud.google.com/tasks/docs/creating-queues).
 
  Alternatively with Rails you can simply run the following rake task if you have queue admin permissions (`cloudtasks.queues.get` and `cloudtasks.queues.create`).
  ```bash
@@ -235,10 +256,15 @@ MyWorker.perform_in(5 * 60, arg1, arg2)
  # or with Rails
  MyWorker.perform_in(5.minutes, arg1, arg2)
 
- # Worker will be processed on specific date
+ # Worker will be processed on a specific date
  MyWorker.perform_at(Time.parse('2025-01-01 00:50:00Z'), arg1, arg2)
  # also with Rails
  MyWorker.perform_at(3.days.from_now, arg1, arg2)
+
+ # With all options, including which queue to run the worker on.
+ MyWorker.schedule(args: [arg1, arg2], time_at: Time.parse('2025-01-01 00:50:00Z'), queue: 'critical')
+ # or
+ MyWorker.schedule(args: [arg1, arg2], time_in: 5 * 60, queue: 'critical')
  ```
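
For readers skimming this hunk, here is a minimal self-contained sketch tying the scheduling helpers above together. The `ReportWorker` class and its argument are hypothetical illustrations, not part of the gem:

```ruby
# Hypothetical worker used only to illustrate the scheduling helpers shown above
class ReportWorker
  include Cloudtasker::Worker

  def perform(report_id)
    logger.info("Generating report #{report_id}")
  end
end

# Enqueue immediately, with a delay, or on a specific queue at runtime
ReportWorker.perform_async(123)
ReportWorker.perform_in(10 * 60, 123)
ReportWorker.schedule(args: [123], time_in: 10 * 60, queue: 'critical')
```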
 
  Cloudtasker also provides a helper for re-enqueuing jobs. Re-enqueued jobs keep the same worker id. Some middlewares may rely on this to track the fact that a job didn't actually complete (e.g. Cloudtasker batch). This is optional and you can always fall back to using exception management (raise an error) to retry/re-enqueue jobs.
@@ -262,6 +288,52 @@ class FetchResourceWorker
  end
  ```
 
+ ## Managing worker queues
+
+ Cloudtasker allows you to manage several queues and distribute workers across them based on job priority. By default jobs are pushed to the `default` queue, which is `<gcp_queue_prefix>-default` in Cloud Tasks.
+
+ ### Creating queues
+
+ More queues can be created using the gcloud sdk or the `cloudtasker:setup_queue` rake task.
+
+ E.g. Create a `critical` queue with a concurrency of 5 via the gcloud SDK
+ ```bash
+ gcloud tasks queues create <gcp_queue_prefix>-critical --max-concurrent-dispatches=5
+ ```
+
+ E.g. Create a `real-time` queue with a concurrency of 15 via the rake task (Rails only)
+ ```bash
+ rake cloudtasker:setup_queue name=real-time concurrency=15
+ ```
+
+ When running the Cloudtasker local processing server, you can specify the concurrency for each queue using:
+ ```bash
+ cloudtasker -q critical,5 -q important,4 -q default,3
+ ```
+
+ ### Assigning queues to workers
+
+ Queues can be assigned to workers via the `cloudtasker_options` directive on the worker class:
+
+ ```ruby
+ # app/workers/critical_worker.rb
+
+ class CriticalWorker
+ include Cloudtasker::Worker
+
+ cloudtasker_options queue: :critical
+
+ def perform(some_arg)
+ logger.info("This is a critical job run with arg=#{some_arg}.")
+ end
+ end
+ ```
+
+ Queues can also be assigned at runtime when scheduling a job:
+ ```ruby
+ CriticalWorker.schedule(args: [1], queue: :important)
+ ```
+
  ## Extensions
  Cloudtasker comes with three optional features:
  - Cron Jobs [[docs](docs/CRON_JOBS.md)]: Run jobs at fixed intervals.
@@ -303,6 +375,11 @@ web: rails s
  worker: cloudtasker
  ```
 
+ Note that the local development server runs with `5` concurrent threads by default. You can tune the number of threads per queue by running `cloudtasker` with the following options:
+ ```bash
+ cloudtasker -q critical,5 -q important,4 -q default,3
+ ```
+
  ### Option 2: Using ngrok
 
  Want to test your application end to end with Google Cloud Task? Then [ngrok](https://ngrok.io) is the way to go.
@@ -318,9 +395,9 @@ Take note of your ngrok domain and configure Cloudtasker to use Google Cloud Tas
 
  Cloudtasker.configure do |config|
  # Specify your Google Cloud Task queue configuration
- # config.gcp_location_id = 'us-central1'
- # config.gcp_project_id = 'my-gcp-project'
- # config.gcp_queue_id = 'my-queue'
+ config.gcp_location_id = 'us-central1'
+ config.gcp_project_id = 'my-gcp-project'
+ config.gcp_queue_prefix = 'my-app'
 
  # Use your ngrok domain as the processor host
  config.processor_host = 'https://your-tunnel-id.ngrok.io'
@@ -585,6 +662,45 @@ Rails.cache.write(payload_id, data)
  BigPayloadWorker.perform_async(payload_id)
  ```
 
+ ### Sizing the concurrency of your queues
+
+ When defining the max concurrency of your queues (`max_concurrent_dispatches` in Cloud Tasks) you must keep in mind the maximum number of threads that your application provides. Otherwise your application threads may eventually get exhausted and your users will experience outages if all your web threads are busy running jobs.
+
+ #### With server-based applications
+
+ Let's consider an application deployed in production with 3 instances, each having `RAILS_MAX_THREADS` set to `20`. This gives us a total of `60` threads available.
+
+ Now let's say that we distribute jobs across two queues: `default` and `critical`. We can set the concurrency of each queue depending on the profile of the application:
+
+ E.g. 1: The application serves requests from web users and runs background jobs in a balanced way
+ ```
+ concurrency for default queue: 20
+ concurrency for critical queue: 10
+
+ Total threads consumed by jobs at most: 30
+ Total threads always available to web users at worst: 30
+ ```
+
+ E.g. 2: The application is a micro-service API heavily focused on running jobs (e.g. data processing)
+ ```
+ concurrency for default queue: 35
+ concurrency for critical queue: 15
+
+ Total threads consumed by jobs at most: 50
+ Total threads always available to web users at worst: 10
+ ```
+
+ Also always ensure that your total number of threads does not exceed the available number of database connections (if you use any).
+
+ #### With serverless applications
+
+ In a serverless context your application will be scaled up/down based on traffic. When we say 'traffic' this includes requests from Cloud Tasks to run jobs.
+
+ Because your application is auto-scaled - and assuming you haven't set a maximum - your job processing capacity is theoretically unlimited. The main limiting factor in a serverless context becomes external constraints such as the number of database connections available.
+
+ To size the concurrency of your queues you should therefore take the most limiting factor - which is often the database connection pool size for RDBMS databases - and use the calculations of the previous section with this limiting factor as the capping parameter instead of the number of threads.
+
+
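
As a back-of-the-envelope check, the arithmetic from the first example above can be written out directly (all figures are taken from that example):

```ruby
# Thread budget from the server-based example above
instances          = 3
threads_per_server = 20 # RAILS_MAX_THREADS
total_threads      = instances * threads_per_server # => 60

queue_concurrency   = { default: 20, critical: 10 }
job_threads_at_most = queue_concurrency.values.sum # => 30

web_threads_at_worst = total_threads - job_threads_at_most
puts web_threads_at_worst # => 30
```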
  ## Development
 
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
@@ -18,7 +18,7 @@ module Cloudtasker
  def run
  # Build payload
  payload = request.params
- .slice(:worker, :job_id, :job_args, :job_meta)
+ .slice(:worker, :job_id, :job_args, :job_meta, :job_queue)
  .merge(job_retries: job_retries)
 
  # Process payload
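
The effect of the widened `slice` above can be illustrated with plain Ruby hashes. The sample params and retry count below are made up for illustration:

```ruby
# Illustrative payload filtering, mirroring the controller change above
params = {
  worker: 'CriticalWorker',
  job_id: 'b91e3c74',
  job_args: [1],
  job_meta: {},
  job_queue: 'critical', # now preserved thanks to the added :job_queue key
  unrelated: 'dropped'
}
job_retries = 0

payload = params
          .slice(:worker, :job_id, :job_args, :job_meta, :job_queue)
          .merge(job_retries: job_retries)

p payload.keys
# => [:worker, :job_id, :job_args, :job_meta, :job_queue, :job_retries]
```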
data/cloudtasker.gemspec CHANGED
@@ -10,8 +10,8 @@ Gem::Specification.new do |spec|
  spec.authors = ['Arnaud Lachaume']
  spec.email = ['arnaud.lachaume@keypup.io']
 
- spec.summary = 'Background jobs for Ruby using Google Cloud Tasks (alpha)'
- spec.description = 'Background jobs for Ruby using Google Cloud Tasks (alpha)'
+ spec.summary = 'Background jobs for Ruby using Google Cloud Tasks (beta)'
+ spec.description = 'Background jobs for Ruby using Google Cloud Tasks (beta)'
  spec.homepage = 'https://github.com/keypup-io/cloudtasker'
  spec.license = 'MIT'
 
data/docs/BATCH_JOBS.md CHANGED
@@ -59,8 +59,33 @@ The following callbacks are available on your workers to track the progress of t
  | `on_child_dead` | `The child job` | Invoked when a child has exhausted all of its retries |
  | `on_batch_complete` | none | Invoked when all children have finished or died |
 
+ ## Queue management
+
+ Jobs added to a batch inherit the queue of the parent. It is possible to specify a different queue when adding a job to a batch using the `add_to_queue` batch method.
+
+ E.g.
+
+ ```ruby
+ def perform
+ batch.add_to_queue(:critical, SubWorker, arg1, arg2, arg3)
+ end
+ ```
+
  ## Batch completion
 
  Batches complete when all children have successfully completed or died (all retries exhausted).
 
- Jobs that fail in a batch will be retried based on the `max_retries` setting configured globally or on the worker itself. The batch will be considered `pending` while workers retry. Therefore it may be a good idea to reduce the number of retries on your workers using `cloudtasker_options max_retries: 5` to ensure your batches don't hang for too long.
+ Jobs that fail in a batch will be retried based on the `max_retries` setting configured globally or on the worker itself. The batch will be considered `pending` while workers retry. Therefore it may be a good idea to reduce the number of retries on your workers using `cloudtasker_options max_retries: 5` to ensure your batches don't hang for too long.
+
+ ## Batch progress tracking
+
+ You can access progress statistics in callbacks using `batch.progress`. See the [BatchProgress](../lib/cloudtasker/batch/batch_progress.rb) class for more details.
+
+ E.g.
+ ```ruby
+ def on_batch_node_complete(_child_job)
+ logger.info("Total: #{batch.progress.total}")
+ logger.info("Completed: #{batch.progress.completed}")
+ logger.info("Progress: #{batch.progress.percent.to_i}%")
+ end
+ ```
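
Putting the queue management and progress tracking additions together, a parent/child batch might look like the sketch below. The worker names are hypothetical and the batch extension is assumed to be enabled as described in this document:

```ruby
# Hypothetical batch workers illustrating add_to_queue and batch.progress
class ImportWorker
  include Cloudtasker::Worker

  def perform(file_ids)
    # Enqueue each child on the 'critical' queue instead of inheriting the parent queue
    file_ids.each { |id| batch.add_to_queue(:critical, ImportFileWorker, id) }
  end

  def on_batch_node_complete(_child_job)
    logger.info("Import progress: #{batch.progress.percent.to_i}%")
  end

  def on_batch_complete
    logger.info('All files imported')
  end
end

class ImportFileWorker
  include Cloudtasker::Worker

  def perform(file_id)
    logger.info("Importing file #{file_id}")
  end
end
```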
data/docs/CRON_JOBS.md CHANGED
@@ -28,7 +28,11 @@ unless Rails.env.test?
  # Run job every hour on the fifteenth minute
  other_cron_schedule: {
  worker: 'OtherCronWorker',
- cron: '15 * * * *'
+ cron: '15 * * * *',
+ queue: 'critical',
+ args: ['foo', 'bar']
  }
  )
  end
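
For reference, a worker matching the schedule entry above would run on the `critical` queue and receive the configured `args`. This is a hedged sketch assuming the args are passed positionally to `perform`:

```ruby
# Hypothetical implementation of the OtherCronWorker referenced above
class OtherCronWorker
  include Cloudtasker::Worker

  def perform(arg1, arg2)
    logger.info("Cron run with #{arg1} and #{arg2}") # 'foo' and 'bar'
  end
end
```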
data/exe/cloudtasker CHANGED
@@ -3,9 +3,21 @@
 
  require 'bundler/setup'
  require 'cloudtasker/cli'
+ require 'optparse'
+
+ options = {}
+ OptionParser.new do |opts|
+ opts.banner = 'Usage: cloudtasker [options]'
+
+ opts.on('-q QUEUE', '--queue=QUEUE', 'Queue to process and number of threads. ' \
+ "Examples: '-q critical' | '-q critical,2' | '-q critical,3 -q defaults,2'") do |o|
+ options[:queues] ||= []
+ options[:queues] << o.split(',')
+ end
+ end.parse!
 
  begin
- Cloudtasker::CLI.run
+ Cloudtasker::CLI.run(options)
  rescue StandardError => e
  raise e if $DEBUG
 
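
The `-q` parsing added above can be exercised in isolation. This standalone sketch reproduces the option handling with a sample argument list:

```ruby
require 'optparse'

options = {}
argv = ['-q', 'critical,5', '-q', 'important,4', '-q', 'default,3']

OptionParser.new do |opts|
  opts.on('-q QUEUE', '--queue=QUEUE') do |o|
    options[:queues] ||= []
    options[:queues] << o.split(',')
  end
end.parse!(argv)

p options[:queues]
# => [["critical", "5"], ["important", "4"], ["default", "3"]]
```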
data/lib/cloudtasker.rb CHANGED
@@ -47,5 +47,4 @@ module Cloudtasker
  end
  end
 
- require 'cloudtasker/railtie' if defined?(Rails)
  require 'cloudtasker/engine' if defined?(::Rails::Engine)
@@ -9,15 +9,28 @@ module Cloudtasker
  #
  # Create the queue configured in Cloudtasker if it does not already exist.
  #
+ # @param [String] queue_name The relative name of the queue.
+ #
  # @return [Google::Cloud::Tasks::V2beta3::Queue] The queue
  #
- def self.setup_queue
- client.get_queue(queue_path)
+ def self.setup_queue(**opts)
+ # Build full queue path
+ queue_name = opts[:name] || Cloudtasker::Config::DEFAULT_JOB_QUEUE
+ full_queue_name = queue_path(queue_name)
+
+ # Try to get existing queue
+ client.get_queue(full_queue_name)
  rescue Google::Gax::RetryError
+ # Extract options
+ concurrency = (opts[:concurrency] || Cloudtasker::Config::DEFAULT_QUEUE_CONCURRENCY).to_i
+ retries = (opts[:retries] || Cloudtasker::Config::DEFAULT_QUEUE_RETRIES).to_i
+
+ # Create queue on 'not found' error
  client.create_queue(
  client.location_path(config.gcp_project_id, config.gcp_location_id),
- name: queue_path,
- retry_config: { max_attempts: -1 }
+ name: full_queue_name,
+ retry_config: { max_attempts: retries },
+ rate_limits: { max_concurrent_dispatches: concurrency }
  )
  end
 
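
A pure-Ruby sketch of the option defaulting performed by `setup_queue` above. The default constants are inlined with assumed values for illustration; they are not taken from this diff:

```ruby
# Assumed stand-ins for the Cloudtasker::Config constants referenced above
DEFAULT_JOB_QUEUE = 'default'
DEFAULT_QUEUE_CONCURRENCY = 10 # assumed value
DEFAULT_QUEUE_RETRIES = -1     # assumed value

opts = { name: 'real-time', concurrency: 15 }

queue_name  = opts[:name] || DEFAULT_JOB_QUEUE
concurrency = (opts[:concurrency] || DEFAULT_QUEUE_CONCURRENCY).to_i
retries     = (opts[:retries] || DEFAULT_QUEUE_RETRIES).to_i

p [queue_name, concurrency, retries] # => ["real-time", 15, -1]
```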
@@ -42,13 +55,15 @@ module Cloudtasker
  #
  # Return the fully qualified path for the Cloud Task queue.
  #
+ # @param [String] queue_name The relative name of the queue.
+ #
  # @return [String] The queue path.
  #
- def self.queue_path
+ def self.queue_path(queue_name)
  client.queue_path(
  config.gcp_project_id,
  config.gcp_location_id,
- config.gcp_queue_id
+ [config.gcp_queue_prefix, queue_name].join('-')
  )
  end
 
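
The prefix/name join introduced above maps relative queue names to Cloud Tasks queue names as follows (the prefix value is the illustrative one used in the README):

```ruby
prefix = 'my-app' # config.gcp_queue_prefix
%w[default critical real-time].each do |queue|
  puts [prefix, queue].join('-')
end
# Output:
#   my-app-default
#   my-app-critical
#   my-app-real-time
```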
@@ -94,8 +109,11 @@ module Cloudtasker
  schedule_time: format_schedule_time(payload[:schedule_time])
  ).compact
 
+ # Extract relative queue name
+ relative_queue = payload.delete(:queue)
+
  # Create task
- resp = client.create_task(queue_path, payload)
+ resp = client.create_task(queue_path(relative_queue), payload)
  resp ? new(resp) : nil
  rescue Google::Gax::RetryError
  nil
@@ -121,6 +139,20 @@ module Cloudtasker
  @gcp_task = gcp_task
  end
 
+ #
+ # Return the relative queue (queue name minus prefix) the task is in.
+ #
+ # @return [String] The relative queue name
+ #
+ def relative_queue
+ gcp_task
+ .name
+ .match(%r{/queues/([^/]+)})
+ &.captures
+ &.first
+ &.sub("#{self.class.config.gcp_queue_prefix}-", '')
+ end
+
  #
  # Return a hash description of the task.
  #
@@ -131,7 +163,8 @@ module Cloudtasker
  id: gcp_task.name,
  http_request: gcp_task.to_h[:http_request],
  schedule_time: gcp_task.to_h.dig(:schedule_time, :seconds).to_i,
- retries: gcp_task.to_h[:response_count]
+ retries: gcp_task.to_h[:response_count],
+ queue: relative_queue
  }
  end
  end
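
The `relative_queue` extraction above can be checked in isolation with a sample task name. The task name and prefix below are made up:

```ruby
# Hypothetical fully qualified task name and queue prefix
name   = 'projects/my-gcp-project/locations/us-east1/queues/my-app-critical/tasks/1234567890'
prefix = 'my-app'

relative_queue = name
                 .match(%r{/queues/([^/]+)})
                 &.captures
                 &.first
                 &.sub("#{prefix}-", '')

puts relative_queue # => critical
```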
@@ -7,7 +7,7 @@ module Cloudtasker
  # Manage local tasks pushed to memory.
  # Used for testing.
  class MemoryTask
- attr_reader :id, :http_request, :schedule_time
+ attr_reader :id, :http_request, :schedule_time, :queue
 
  #
  # Return the task queue. A worker class name
@@ -116,10 +116,11 @@ module Cloudtasker
  # @param [Hash] http_request The HTTP request content.
  # @param [Integer] schedule_time When to run the task (Unix timestamp)
  #
- def initialize(id:, http_request:, schedule_time: nil)
+ def initialize(id:, http_request:, schedule_time: nil, queue: nil)
  @id = id
  @http_request = http_request
  @schedule_time = Time.at(schedule_time || 0)
+ @queue = queue
  end
 
  #
@@ -149,7 +150,8 @@ module Cloudtasker
  {
  id: id,
  http_request: http_request,
- schedule_time: schedule_time.to_i
+ schedule_time: schedule_time.to_i,
+ queue: queue
  }
  end