plines 0.5.0

data/Gemfile ADDED
@@ -0,0 +1,12 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in plines.gemspec
4
+ gemspec
5
+
6
+ gem 'qless', git: 'git://github.com/seomoz/qless.git', branch: 'unified'
7
+
8
+ group :extras do
9
+ gem 'debugger', platform: :mri
10
+ end
11
+
12
+ gem 'rspec-fire', git: 'git://github.com/xaviershay/rspec-fire.git'
data/LICENSE ADDED
@@ -0,0 +1,22 @@
+ Copyright (c) 2012 Myron Marston
+
+ MIT License
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,420 @@
+ # Plines
+
+ Plines creates job pipelines out of a complex set of step dependencies.
+ It's intended to maximize the efficiency and throughput of the jobs
+ (ensuring jobs are run as soon as their dependencies have been met)
+ while minimizing the amount of "glue" code you have to write to make it
+ work.
+
+ Plines is built on top of [Qless](https://github.com/seomoz/qless) and
+ [Redis](http://redis.io/).
+
+ ## Installation
+
+ Add this line to your application's Gemfile:
+
+     gem 'plines'
+
+ And then execute:
+
+     $ bundle
+
+ Or install it yourself as:
+
+     $ gem install plines
+
+ ## Getting Started
+
+ First, create a pipeline using the `Plines::Pipeline` module:
+
+ ``` ruby
+ module MyProcessingPipeline
+   extend Plines::Pipeline
+
+   configure do |config|
+     # configuration goes here; see below for available options
+   end
+ end
+ ```
+
+ `MyProcessingPipeline` will function both as the namespace for your
+ pipeline steps and as a singleton holding some state for your
+ pipeline.
+
+ Next, define some pipeline steps. Your steps should be simple Ruby
+ classes that extend the `Plines::Step` module and define a `perform`
+ method:
+
+ ``` ruby
+ module MyProcessingPipeline
+   class CountWidgets
+     extend Plines::Step
+
+     def perform
+       # do some work
+     end
+   end
+ end
+ ```
+
+ The `Plines::Step` module makes available some class-level
+ macros for declaring step dependency relationships. See the **Step Class
+ DSL** section below for more details.
+
+ Once you've defined all your steps, you can enqueue jobs for them:
+
+ ``` ruby
+ MyProcessingPipeline.enqueue_jobs_for("some" => "data", "goes" => "here")
+ ```
+
+ `MyProcessingPipeline.enqueue_jobs_for` will enqueue a full set of Qless
+ jobs (or a `JobBatch` in Plines terminology) for the given batch data
+ based on your step classes' macro declarations.
+
+ ## Configuring a Pipeline
+
+ Plines supports configuration at the pipeline level:
+
+ ``` ruby
+ module MyProcessingPipeline
+   extend Plines::Pipeline
+
+   configure do |config|
+     # Determines how job batches are identified. Plines provides an API
+     # to find the most recent existing job batch based on this key.
+     config.batch_list_key { |batch_data| batch_data.fetch(:user_id) }
+
+     # Sets the Qless client to use. If you have only one Qless server,
+     # have the block return a client for it. If you're sharding your
+     # Qless usage, you can have the block return a client based on the
+     # given batch list key.
+     config.qless_client do |user_id|
+       Qless::Client.new(redis: RedisShard.for(user_id))
+     end
+
+     # Determines how long the Plines job batch data will be kept around
+     # in Redis after the batch reaches a final state (cancelled or
+     # completed). By default, this is set to 6 months, but you
+     # will probably want to set it to something shorter (like 2 weeks).
+     config.data_ttl_in_seconds = 14 * 24 * 60 * 60
+
+     # Provides a hook that gets called when job batches are cancelled.
+     # Use this to perform any cleanup in your system.
+     config.after_job_batch_cancellation do |job_batch|
+       # do some cleanup
+     end
+
+     # Use this callback to set additional global Qless job
+     # options (such as queue, tags and priority). You can also set
+     # options on an individual step class (see below).
+     config.qless_job_options do |job|
+       { tags: [job.data[:user_id]] }
+     end
+   end
+ end
+ ```
+
+ ## The Step Class DSL
+
+ An example will help illustrate the Step class DSL. (Note that this
+ example omits the `perform` method declarations for brevity.)
+
+ ``` ruby
+ module MakeThanksgivingDinner
+   extend Plines::Pipeline
+
+   class BuyGroceries
+     extend Plines::Step
+
+     # Indicates that the BuyGroceries step must run before all other steps.
+     # Essentially creates an implicit dependency of all steps on this one.
+     # You can have only one step declare `depended_on_by_all_steps`.
+     # Doing this relieves you of the burden of having to add
+     # `depends_on :BuyGroceries` to all step definitions.
+     depended_on_by_all_steps
+   end
+
+   # This step depends on BuyGroceries automatically due to the
+   # depended_on_by_all_steps declaration above.
+   class MakeStuffing
+     extend Plines::Step
+
+     # qless_options lets you set Qless job options for this step.
+     qless_options do |qless|
+       # By default, jobs are enqueued to the :plines queue, but you can
+       # override it. Options set here on a step override any set in the
+       # pipeline configuration.
+       qless.queue = :make_stuffing
+       qless.tags = [:foo, :bar]
+       qless.priority = -10
+       qless.retries = 7
+     end
+   end
+
+   class PickupTurkey
+     extend Plines::Step
+
+     # External dependencies are named things that must be resolved
+     # before this step is allowed to proceed. They are intended for
+     # use when a step has a dependency on data from an external
+     # asynchronous system that operates on its own schedule.
+     has_external_dependencies do |deps, job_data|
+       deps.add "await_turkey_is_ready_for_pickup_notice", wait_up_to: 12.hours
+     end
+   end
+
+   class PrepareTurkey
+     extend Plines::Step
+
+     # Declares that the PrepareTurkey job cannot run until the
+     # PickupTurkey job has run. Note that the step class name
+     # is relative to the pipeline module namespace.
+     depends_on :PickupTurkey
+   end
+
+   class MakePie
+     extend Plines::Step
+
+     # By default, a single instance of a step will get enqueued in a
+     # pipeline job batch. The `fan_out` macro can be used to get multiple
+     # instances of the same step in a single job batch, each with
+     # different arguments.
+     #
+     # In this example, we will have multiple `MakePie` steps--one for
+     # each pie type, each with a different pie type argument.
+     fan_out do |batch_data|
+       batch_data['pie_types'].map do |type|
+         { 'pie_type' => type, 'family' => batch_data['family'] }
+       end
+     end
+
+     # Makes each instance of this step depend on the prior one,
+     # to ensure no two instances run in parallel. This isn't usually
+     # needed, but is occasionally useful to prevent resource contention
+     # when these jobs operate on a common resource.
+     run_jobs_in_serial
+   end
+
+   class AddWhipCreamToPie
+     extend Plines::Step
+
+     fan_out do |batch_data|
+       batch_data['pie_types'].map do |type|
+         { 'pie_type' => type, 'family' => batch_data['family'] }
+       end
+     end
+
+     # By default, `depends_on` makes all instances of this step depend on all
+     # instances of the named step. If you only want it to depend on some
+     # instances of the named step, pass a block. Here, each instance of this
+     # step depends only on the MakePie jobs that have the same pie_type.
+     depends_on :MakePie do |add_whip_cream_data, make_pie_data|
+       add_whip_cream_data['pie_type'] == make_pie_data['pie_type']
+     end
+   end
+
+   class SetTable
+     extend Plines::Step
+
+     # Indicates that this step should run last. This relieves you
+     # of the burden of having to add an extra `depends_on` declaration
+     # for each new step you create.
+     depends_on_all_steps
+   end
+ end
+ ```
+
+ ## Enqueuing Jobs
+
+ To enqueue a job batch, use `#enqueue_jobs_for`:
+
+ ``` ruby
+ MakeThanksgivingDinner.enqueue_jobs_for(
+   "family" => "Smith",
+   "pie_types" => %w[ apple pumpkin pecan ]
+ )
+ ```
+
+ The argument given to `enqueue_jobs_for` _must_ be a hash. This
+ hash will be yielded to the `fan_out` blocks. In addition, this hash
+ (or the one returned by a `fan_out` block) will be available as
+ `#job_data` in a step's `#perform` method.
+
+ Based on the `MakeThanksgivingDinner` example above, the following jobs
+ will be enqueued in this batch:
+
+ * 1 BuyGroceries job
+ * 1 MakeStuffing job
+ * 1 PickupTurkey job
+ * 1 PrepareTurkey job
+ * 3 MakePie jobs, each with slightly different arguments (1 each with
+   "apple", "pumpkin" and "pecan")
+ * 3 AddWhipCreamToPie jobs, each with slightly different arguments (1
+   each with "apple", "pumpkin" and "pecan")
+ * 1 SetTable job
+
+ The declared dependencies will be honored as well:
+
+ * BuyGroceries is guaranteed to run first.
+ * MakeStuffing and the 3 MakePie jobs will be available for processing
+   immediately after the BuyGroceries job has finished.
+ * The 3 AddWhipCreamToPie jobs will be available for processing once
+   their corresponding MakePie jobs have completed.
+ * PickupTurkey will not run until the
+   `"await_turkey_is_ready_for_pickup_notice"` external dependency is
+   fulfilled (see below for more details).
+ * PrepareTurkey will be available for processing once the PickupTurkey
+   job has finished.
+ * SetTable will wait to be processed until all other jobs are complete.
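
The job counts and the per-pie dependencies listed above can be checked with a small plain-Ruby model. This is only a sketch: it does not load Plines, and the lambdas `fan_out` and `same_pie` are stand-ins that copy the blocks from the `MakePie` and `AddWhipCreamToPie` step definitions.

```ruby
# Plain-Ruby model of the batch above; it does not use Plines itself.
batch_data = { 'family' => 'Smith', 'pie_types' => %w[apple pumpkin pecan] }

# Copy of the fan_out block shared by MakePie and AddWhipCreamToPie.
fan_out = lambda do |data|
  data['pie_types'].map do |type|
    { 'pie_type' => type, 'family' => data['family'] }
  end
end

make_pie_jobs   = fan_out.call(batch_data)
whip_cream_jobs = fan_out.call(batch_data)

# Copy of the block given to `depends_on :MakePie`.
same_pie = ->(whip, pie) { whip['pie_type'] == pie['pie_type'] }

# For each AddWhipCreamToPie job, list the MakePie jobs it waits on.
dependencies = whip_cream_jobs.each_with_object({}) do |whip, deps|
  matching = make_pie_jobs.select { |pie| same_pie.call(whip, pie) }
  deps[whip['pie_type']] = matching.map { |pie| pie['pie_type'] }
end

# BuyGroceries, MakeStuffing, PickupTurkey and PrepareTurkey are single
# jobs, as is SetTable; the two fanned-out steps contribute 3 jobs each.
total_jobs = 4 + make_pie_jobs.size + whip_cream_jobs.size + 1
```

In this model each whipped-cream job ends up depending on exactly one pie job, which is the effect of the filtering block described in the DSL example.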
+
+ ## Working With Job Batches
+
+ Plines stores data about the batch in Redis. It also provides a
+ first-class `JobBatch` object that allows you to work with job batches.
+
+ First, you need to configure the pipeline so that it knows how your
+ batches are identified:
+
+ ``` ruby
+ MakeThanksgivingDinner.configure do |config|
+   config.batch_list_key do |batch_data|
+     batch_data["family"]
+   end
+ end
+ ```
+
+ Once this is in place, you can find a particular job batch:
+
+ ``` ruby
+ job_batch = MakeThanksgivingDinner.most_recent_job_batch_for("family" => "Smith")
+ ```
+
+ The `batch_list_key` config option above means the job batch will be
+ keyed by the "family" entry in the batch data hash. Thus, you can easily
+ look up a job batch by giving it a hash with the same "family" entry.
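
To make the keying idea concrete, here is a toy in-memory model of a batch list. It is an illustration only: Plines keeps this data in Redis, and `toy_enqueue` and `toy_most_recent` are hypothetical helpers that merely stand in for `enqueue_jobs_for` and `most_recent_job_batch_for`.

```ruby
# Toy in-memory model; Plines keeps this data in Redis, not in a Hash.
batch_list_key = ->(batch_data) { batch_data["family"] }
batch_lists = Hash.new { |hash, key| hash[key] = [] }

# Hypothetical stand-ins for enqueue_jobs_for and most_recent_job_batch_for.
toy_enqueue = lambda do |batch_data|
  batch_lists[batch_list_key.call(batch_data)] << batch_data
end
toy_most_recent = lambda do |lookup|
  batch_lists.fetch(batch_list_key.call(lookup), []).last
end

toy_enqueue.call({ "family" => "Smith", "pie_types" => %w[apple] })
toy_enqueue.call({ "family" => "Jones", "pie_types" => %w[pecan] })
toy_enqueue.call({ "family" => "Smith", "pie_types" => %w[pumpkin pecan] })

# Only the "family" entry matters for the lookup.
latest_smith = toy_most_recent.call({ "family" => "Smith" })
```

The lookup hash needs only the entries that the key block reads; the rest of the original batch data is not required to find the batch.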
+
+ Once you have a job batch, there are several things you can do with it:
+
+ ``` ruby
+ # returns whether or not the job batch is finished.
+ job_batch.complete?
+
+ # returns the data hash that was used to enqueue the job batch
+ job_batch.data
+
+ # cancels all remaining jobs in this batch
+ job_batch.cancel!
+
+ # Resolves the named external dependency. For the example above,
+ # calling this will allow the PickupTurkey job to proceed.
+ job_batch.resolve_external_dependency "await_turkey_is_ready_for_pickup_notice"
+ ```
+
+ Plines sets expiration on the Redis keys it uses to track job batches as
+ soon as the job batch is completed or cancelled. By default, the
+ expiration is set to 6 months. You can configure it if you wish to
+ shorten it:
+
+ ``` ruby
+ MakeThanksgivingDinner.configure do |config|
+   config.data_ttl_in_seconds = 14 * 24 * 60 * 60 # 2 weeks
+ end
+ ```
+
+ ## External Dependency Timeouts
+
+ Under normal configuration, no job will run until all of its
+ dependencies have been met. However, Plines provides support
+ for timing out an external dependency:
+
+ ``` ruby
+ module MyPipeline
+   class MyStep
+     extend Plines::Step
+     has_external_dependencies do |deps, job_data|
+       deps.add "my_async_service", wait_up_to: 3.hours
+     end
+   end
+ end
+ ```
+
+ With this configuration, Plines will schedule a Qless job to run in
+ 3 hours that will time out the `"my_async_service"` external dependency,
+ allowing the `MyStep` job to run without the dependency being resolved.
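
The timeout rule can be modelled in a few lines of plain Ruby. This is a sketch under stated assumptions, not how Plines implements it: `ToyExternalDependency` and its methods are hypothetical names, and the wait is given in plain seconds because `3.hours` is not a core Ruby method (it is commonly provided by ActiveSupport; where Plines gets it from is not stated here).

```ruby
# Toy model only; the class and method names are hypothetical, not Plines APIs.
class ToyExternalDependency
  attr_reader :name

  def initialize(name, wait_up_to_seconds, created_at)
    @name = name
    @deadline = created_at + wait_up_to_seconds
    @resolved = false
  end

  def resolve!
    @resolved = true
  end

  def resolved?
    @resolved
  end

  # A dependency counts as timed out once its wait has elapsed unresolved.
  def timed_out?(now)
    !@resolved && now >= @deadline
  end

  # The job is held back only while the dependency is neither resolved
  # nor timed out.
  def blocks_job?(now)
    !resolved? && !timed_out?(now)
  end
end

start = Time.at(0)
three_hours = 3 * 60 * 60
dep = ToyExternalDependency.new("my_async_service", three_hours, start)

blocked_after_1h = dep.blocks_job?(start + 60 * 60)
blocked_after_3h = dep.blocks_job?(start + three_hours)

# After the timeout the job may run, but the dependency is still unresolved.
unresolved_after_3h = dep.resolved? ? [] : [dep.name]
```

The last line mirrors why a step may want to check for unresolved dependencies at run time: a timed-out dependency lets the job proceed without ever having been resolved.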
+
+ ## Performing Work
+
+ When a job gets run, the `#perform` instance method of your step class
+ will be called. The return value of your perform method is ignored.
+ The perform method will have access to a few helper methods:
+
+ ``` ruby
+ module MakeThanksgivingDinner
+   class MakeStuffing
+     extend Plines::Step
+
+     def perform
+       # job_data gives you a struct-like object that is built off of
+       # your job_data hash
+       job_data.family # => returns "Smith" for our example
+
+       # The job_batch instance this job is a part of is available as
+       # well, so you can do things like cancel the batch.
+       job_batch.cancel!
+
+       # The underlying qless job is available as `qless_job`
+       qless_job.heartbeat
+
+       # An external dependency may be unresolved if it timed out (see above).
+       # #unresolved_external_dependencies returns an array of symbols,
+       # listing the external dependencies that are unresolved.
+       #
+       # Note that this does not necessarily indicate whether or not an
+       # external dependency timed out; it may have timed out, but then
+       # got resolved before this job ran.
+       # In addition, pending external dependencies are included (e.g.
+       # if the job was manually moved into the processing queue).
+       if unresolved_external_dependencies.any?
+         # do something different because there's an unresolved dependency
+       end
+     end
+   end
+ end
+ ```
+
+ Plines also supports a middleware stack that wraps your `perform` method.
+ To create a middleware, define a module with an `around_perform` method:
+
+ ``` ruby
+ module TimeWork
+   def around_perform
+     start_time = Time.now
+
+     # Use super at the point the work should occur...
+     super
+
+     end_time = Time.now
+     log_time(end_time - start_time)
+   end
+ end
+ ```
+
+ Then, include the module in your step class:
+
+ ``` ruby
+ module MakeThanksgivingDinner
+   class MakeStuffing
+     include TimeWork
+   end
+ end
+ ```
+
+ You can include as many middleware modules as you like.
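
When several middleware modules are included, the order in which they wrap `perform` follows Ruby's ordinary method lookup. The sketch below shows that behaviour with plain Ruby only. `ToyStep` is a hypothetical stand-in for whatever Plines provides as the innermost `around_perform`; the other module and class names are likewise made up for illustration.

```ruby
# Hypothetical stand-in for the innermost layer that runs the work.
module ToyStep
  def around_perform
    perform
  end

  def run
    around_perform
  end
end

module ToyLogging
  def around_perform
    events << :log_start
    super
    events << :log_end
  end
end

module ToyTiming
  def around_perform
    events << :timer_start
    super
    events << :timer_end
  end
end

class ToyMakeStuffing
  include ToyStep
  include ToyLogging
  include ToyTiming # included last, so it is looked up first and wraps outermost

  def events
    @events ||= []
  end

  def perform
    events << :perform
  end
end

job = ToyMakeStuffing.new
job.run
recorded = job.events
```

Because the most recently included module comes first in the ancestor chain, each `super` call hands control to the module included before it, and the chain ends at the layer that calls `perform`.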
+
+ ## Contributing
+
+ 1. Fork it
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
+ 3. Commit your changes (`git commit -am 'Added some feature'`)
+ 4. Push to the branch (`git push origin my-new-feature`)
+ 5. Create new Pull Request
+