batch_processor 0.2.6 → 0.3.0

Files changed (45)
  1. checksums.yaml +4 -4
  2. data/README.md +1046 -11
  3. data/lib/batch_processor/batch/controller.rb +1 -1
  4. data/lib/batch_processor/batch/core.rb +1 -1
  5. data/lib/batch_processor/batch/job.rb +1 -1
  6. data/lib/batch_processor/batch/job_controller.rb +1 -1
  7. data/lib/batch_processor/batch/predicates.rb +3 -1
  8. data/lib/batch_processor/batch/processor.rb +16 -3
  9. data/lib/batch_processor/batch_base.rb +1 -0
  10. data/lib/batch_processor/batch_details.rb +1 -1
  11. data/lib/batch_processor/batch_job.rb +3 -1
  12. data/lib/batch_processor/collection.rb +1 -0
  13. data/lib/batch_processor/processor/execute.rb +1 -1
  14. data/lib/batch_processor/processor/process.rb +1 -1
  15. data/lib/batch_processor/processor_base.rb +1 -0
  16. data/lib/batch_processor/processors/parallel.rb +1 -0
  17. data/lib/batch_processor/processors/sequential.rb +1 -0
  18. data/lib/batch_processor/rspec/custom_matchers/set_processor_option.rb +1 -1
  19. data/lib/batch_processor/rspec/custom_matchers/use_default_job_class.rb +1 -1
  20. data/lib/batch_processor/rspec/custom_matchers/use_default_processor.rb +1 -1
  21. data/lib/batch_processor/rspec/custom_matchers/use_job_class.rb +1 -1
  22. data/lib/batch_processor/rspec/custom_matchers/use_parallel_processor.rb +1 -1
  23. data/lib/batch_processor/rspec/custom_matchers/use_sequential_processor.rb +1 -1
  24. data/lib/batch_processor/version.rb +1 -1
  25. data/lib/generators/batch_processor/USAGE +9 -0
  26. data/lib/generators/batch_processor/application_batch/USAGE +9 -0
  27. data/lib/generators/batch_processor/application_batch/application_batch_generator.rb +15 -0
  28. data/lib/generators/batch_processor/application_batch/templates/application_batch.rb +3 -0
  29. data/lib/generators/batch_processor/application_job/USAGE +0 -0
  30. data/lib/generators/batch_processor/application_job/application_job_generator.rb +15 -0
  31. data/lib/generators/batch_processor/application_job/templates/application_job.rb +4 -0
  32. data/lib/generators/batch_processor/batch_processor_generator.rb +15 -0
  33. data/lib/generators/batch_processor/install/USAGE +9 -0
  34. data/lib/generators/batch_processor/install/install_generator.rb +12 -0
  35. data/lib/generators/batch_processor/templates/batch.rb.erb +21 -0
  36. data/lib/generators/rspec/application_batch/USAGE +9 -0
  37. data/lib/generators/rspec/application_batch/application_batch_generator.rb +14 -0
  38. data/lib/generators/rspec/application_batch/templates/application_batch_spec.rb +8 -0
  39. data/lib/generators/rspec/application_job/USAGE +9 -0
  40. data/lib/generators/rspec/application_job/application_job_generator.rb +14 -0
  41. data/lib/generators/rspec/application_job/templates/application_job_spec.rb +8 -0
  42. data/lib/generators/rspec/batch_processor/USAGE +8 -0
  43. data/lib/generators/rspec/batch_processor/batch_processor_generator.rb +13 -0
  44. data/lib/generators/rspec/batch_processor/templates/batch_spec.rb.erb +29 -0
  45. metadata +22 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: e67aa3da97a88da1106578b16d93065d9cbcb8c1ad7045e4d1ca685a201bbb02
4
- data.tar.gz: 52017c9ea65ebcec0092d3fd6740aa3aa13df2d89561470a58f28088fa182177
3
+ metadata.gz: b654cd0471188921afcfc40b5f2fafd7017dba9fb23fbc809f52e9d629c16943
4
+ data.tar.gz: c76be590236d18b036f6d108b34b157527e5fe4063466e4c1413d1aad05dd0d3
5
5
  SHA512:
6
- metadata.gz: 725655af2c8944fcbcf837a2cf80e01992310c5c37352bd10da0ef6efdc60192ddc1224437fd6e3008be48383673d101a6d43680235c4b41d4aa2ed2d7d02441
7
- data.tar.gz: ce203c992d02fa528a5bab0971a5e38d29390de28312aba53ba0173681f5dbe5e3f5c45da95bbbe351e5c0ba34144cd5c7cab077fff8ab60fec9ab4417a67110
6
+ metadata.gz: db14cc81563997580581a31f1c96c75c85fa08aa3f9c98be060ac41a14c7b1a59bb4cc6cae9fcb065fea181af9c96c17ad9b546a92b533240ed7c07cfa2d3c51
7
+ data.tar.gz: 1f7f8df80928fde8adf155a707be309fc468a48f7abd0502d039bcd6f15c1b9fd94be7fee4d09126bb26f926186ec2c67bc37c0e249bd6e76113012e37336490
data/README.md CHANGED
@@ -7,13 +7,44 @@ Define your collection, job, and callbacks all in one clear and concise object
7
7
  [![Maintainability](https://api.codeclimate.com/v1/badges/fbdaeaf118a16a55ab7d/maintainability)](https://codeclimate.com/github/Freshly/batch_processor/maintainability)
8
8
  [![Test Coverage](https://api.codeclimate.com/v1/badges/fbdaeaf118a16a55ab7d/test_coverage)](https://codeclimate.com/github/Freshly/batch_processor/test_coverage)
9
9
 
10
- * [BatchProcessor](#batchprocessor)
11
- * [Installation](#installation)
12
- * [Usage](#usage)
10
+ * [Installation](#installation)
11
+ * [Getting Started](#getting-started)
12
+ * [What is BatchProcessor?](#what-is-batchprocessor)
13
+ * [How It Works](#how-it-works)
14
+ * [Batches](#batches)
15
+ * [Collection](#collection)
16
+ * [Input](#input)
17
+ * [Validations](#validations)
18
+ * [ActiveJob](#activejob)
19
+ * [Retries](#retries)
20
+ * [Details](#details)
21
+ * [Detail Methods](#detail-methods)
22
+ * [Status](#status)
23
+ * [Status Methods](#status-methods)
24
+ * [Callbacks](#callbacks)
25
+ * [Callback Methods](#callback-methods)
26
+ * [Processors](#processors)
27
+ * [Parallel Processor](#parallel-processor)
28
+ * [Sequential Processor](#sequential-processor)
29
+ * [Processor Options](#processor-options)
30
+ * [Jobs](#jobs)
31
+ * [Handling Errors](#handling-errors)
32
+ * [Troubleshooting](#troubleshooting)
33
+ * [Best Practice](#best-practice)
34
+ * [Aborting](#aborting)
35
+ * [Clearing](#clearing)
36
+ * [Monitor Job](#monitor-job)
37
+ * [Monitor Cron](#monitor-cron)
38
+ * [Testing](#testing)
39
+ * [Testing Setup](#testing-setup)
40
+ * [Testing Batches](#testing-batches)
41
+ * [Testing Jobs](#testing-jobs)
42
+ * [Integration Testing](#integration-testing)
43
+ * [Custom Processors](#custom-processors)
44
+ * [Testing Processors](#testing-processors)
45
+ * [Contributing](#contributing)
13
46
  * [Development](#development)
14
- * [Contributing](#contributing)
15
- * [License](#license)
16
-
47
+ * [License](#license)
17
48
 
18
49
  ## Installation
19
50
 
@@ -31,20 +62,1024 @@ Or install it yourself as:
31
62
 
32
63
  $ gem install batch_processor
33
64
 
34
- ## Usage
65
+ ## Getting Started
35
66
 
36
- TODO: Write usage instructions here
67
+ BatchProcessor comes with some nice Rails generators. You are encouraged to use them!
37
68
 
38
- ## Development
69
+ ```bash
70
+ $ rails g batch_processor foo
71
+ invoke rspec
72
+ create spec/batches/foo_batch_spec.rb
73
+ create app/batches/foo_batch.rb
74
+ ```
39
75
 
40
- After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
76
+ ## What is BatchProcessor?
41
77
 
42
- To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
78
+ BatchProcessor is a framework for the sequential or parallel processing of jobs in Ruby on Rails.
79
+
80
+ BatchProcessor helps monitor, control, and orchestrate the work done by `ActiveJob`.
81
+
82
+ 💁‍ This requires [Redis](https://github.com/redis/redis-rb) and a properly configured `ActiveJob` queue adapter (like [Sidekiq](https://github.com/mperham/sidekiq)).
83
+
84
+ ## How It Works
85
+
86
+ ![BatchProcessor](docs/images/batch_processor.png)
87
+
88
+ There are three key concepts to distinguish here: [Batches](#Batches), [Processors](#Processors), and [Jobs](#Jobs).
89
+
90
+ ### Batches
91
+
92
+ A **Batch** defines, controls, and monitors the processing of a collection of items with an `ActiveJob`.
93
+
94
+ All Batches should be named with the `Batch` suffix (ex: `FooBatch`).
95
+
96
+ ```ruby
97
+ class PodSprintCalculationBatch < ApplicationBatch
98
+ set_callback(:batch_started, :before) { raise CalculationsNotRunning unless Calculator.busy? }
99
+
100
+ on_batch_finished { Calculator.done! }
101
+
102
+ class Collection < BatchCollection
103
+ argument :sprint, allow_nil: false
104
+ option :recalculate, default: false
105
+
106
+ def items
107
+ recalculate ? items_for_recalculation : items_for_calculation
108
+ end
109
+
110
+ def items_for_calculation
111
+ items_for_recalculation.without_performance_metrics
112
+ end
113
+
114
+ def items_for_recalculation
115
+ sprint.pod_sprints.with_performance_plans
116
+ end
117
+ end
118
+ end
119
+ ```
120
+
121
+ A batch is a synthesis of five concepts: a [Collection](#Collection), an [ActiveJob](#ActiveJob), granular [Details](#Details), a summary [Status](#Status), and some [Callbacks](#Callbacks).
122
+
123
+ #### Collection
124
+
125
+ A `Collection` takes input to validate and build a (possibly ordered) list of items to process with the Batch's job.
126
+
127
+ Batches accept a unique identifier and input representing the arguments and options which define its collection.
128
+
129
+ ```ruby
130
+ batch_id = SecureRandom.hex
131
+ PodSprintCalculationBatch.process(batch_id: batch_id, sprint: Sprint.last)
132
+ ```
133
+
134
+ You can supply any unique value you want for a `batch_id`:
135
+
136
+ ```ruby
137
+ attempt_number = 1
138
+ current_date = Date.today
139
+ batch_id = "daily-charge-batch:#{current_date}:#{attempt_number}"
140
+
141
+ ChargeBatch.process(batch_id: batch_id, date: current_date)
142
+ ```
143
+
144
+ Which you can then pass to `ApplicationBatch.find` to load:
145
+
146
+ ```ruby
147
+ batch = ApplicationBatch.find("daily-charge-batch:#{Date.today}:1")
148
+ batch.class.name # => ChargeBatch
149
+ batch.batch_id # => "daily-charge-batch:2019-07-25:1"
150
+ ```
151
+
152
+ If you do not specify a `batch_id` one will be randomly generated.
153
+
154
+ ```ruby
155
+ batch = ChargeBatch.process(date: Date.today)
156
+ batch.batch_id # => "XP-f-G23bNFwww"
157
+ ```
158
+
159
+ ##### Input
160
+
161
+ A collection accepts input represented by arguments and options which initialize it.
162
+
163
+ Arguments describe input required to define the initial state.
164
+
165
+ If any arguments are missing, an ArgumentError is raised.
166
+
167
+ ```ruby
168
+ class ExampleJob < BatchProcessor::BatchJob
169
+ def perform(arg)
170
+ "OK #{arg}"
171
+ end
172
+ end
173
+
174
+ class ExampleBatch < ApplicationBatch
175
+ class Collection < BatchCollection
176
+ argument :foo
177
+ argument :bar
178
+
179
+ def items
180
+ [ foo, bar ]
181
+ end
182
+ end
183
+ end
184
+
185
+ ExampleBatch.process # => ArgumentError (Missing arguments: foo, bar)
186
+ ExampleBatch.process(foo: "foo") # => ArgumentError (Missing argument: bar)
187
+ ExampleBatch.process(foo: "foo", bar: "bar") # => #<ExampleBatch batch_id="XPf--GzdbRLyww">
188
+ ```
189
+
190
+ By default, nil is a valid argument:
191
+
192
+ ```ruby
193
+ ExampleBatch.process(foo: nil, bar: nil) # => #<ExampleBatch batch_id="f-GzXP-dbn3yxw">
194
+ ```
195
+
196
+ If you want to require a non-nil value for your argument, set the `allow_nil` option to `false` (it is `true` by default):
197
+
198
+ ```ruby
199
+ class ExampleBatch < ApplicationBatch
200
+ class Collection < BatchCollection
201
+ argument :foo
202
+ argument :bar, allow_nil: false
203
+
204
+ def items
205
+ [ foo, bar ]
206
+ end
207
+ end
208
+ end
209
+
210
+ ExampleBatch.process(foo: nil, bar: nil) # => ArgumentError (Missing argument: bar)
211
+ ```
212
+
213
+ Options describe input which may be provided to define or override the initial state.
214
+
215
+ Options can optionally define a default value.
216
+
217
+ If no default is specified, the value will be nil.
218
+
219
+ If the default value is static, it can be specified in the class definition.
220
+
221
+ If the default value is dynamic, you may provide a block to compute the default value.
222
+
223
+ ⚠️‍ Heads Up: The default value blocks DO NOT provide access to the state or its other variables!
224
+
225
+ ```ruby
226
+ class ExampleBatch < ApplicationBatch
227
+ class Collection < BatchCollection
228
+ option :attribution_source
229
+ option :favorite_foods, default: %w[pizza ice_cream gluten]
230
+ option(:favorite_color) { SecureRandom.hex(3) }
231
+
232
+ def items
233
+ [ attribution_source, favorite_foods, favorite_color ]
234
+ end
235
+ end
236
+ end
237
+
238
+ batch = ExampleBatch.process(favorite_foods: %w[avocado hummus nutritional_yeast])
239
+ collection = batch.collection
240
+
241
+ collection.attribution_source # => nil
242
+ collection.favorite_color # => "1a1f1e"
243
+ collection.favorite_foods # => ["avocado", "hummus", "nutritional_yeast"]
244
+ ```
245
+
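One way to picture how static and block defaults could be resolved is the toy macro below. It is a minimal sketch in plain Ruby, not the gem's implementation: `SketchCollection` and `FoodCollection` are hypothetical names, and the real `BatchCollection` may resolve defaults differently.

```ruby
require "securerandom"

# Toy stand-in for a collection's `option` macro (hypothetical, not the gem's code).
class SketchCollection
  def self.option(name, default: nil, &block)
    # Store every default as a callable so static and dynamic defaults share one path.
    defaults[name] = block || -> { default }
    attr_reader name
  end

  def self.defaults
    @defaults ||= {}
  end

  def initialize(**input)
    self.class.defaults.each do |name, default|
      # A provided value wins; otherwise the default is evaluated
      # with no access to the instance being built.
      value = input.key?(name) ? input[name] : default.call
      instance_variable_set("@#{name}", value)
    end
  end
end

class FoodCollection < SketchCollection
  option :attribution_source
  option :favorite_foods, default: %w[pizza ice_cream gluten]
  option(:favorite_color) { SecureRandom.hex(3) }
end

collection = FoodCollection.new(favorite_foods: %w[avocado hummus])
collection.attribution_source # => nil
collection.favorite_foods     # => ["avocado", "hummus"]
```

Because the block is called without the instance, it cannot read other options, which matches the warning above.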
246
+ ##### Validations
247
+
248
+ Collections are `ActiveModels` which means they have access to [ActiveModel::Validations](https://api.rubyonrails.org/classes/ActiveModel/Validations.html).
249
+
250
+ It is considered a best practice to write validations in your collections.
251
+
252
+ Batches which have an invalid collection will NOT start and therefore will not process any Jobs, so it is inherently the safest and clearest way to proactively communicate about missed expectations.
253
+
254
+ 💁‍ Pro Tip: There is a `process!` method on Batches that will raise any errors (which are normally silenced). Invalid states are one such example!
255
+
256
+ ```ruby
257
+ class ExampleBatch < ApplicationBatch
258
+ class Collection < BatchCollection
259
+ argument :first_name
260
+
261
+ validates :first_name, length: { minimum: 2 }
262
+
263
+ def items
264
+ [ first_name ]
265
+ end
266
+ end
267
+ end
268
+
269
+ ExampleBatch.process!(first_name: "a") # => raises BatchProcessor::CollectionInvalidError
270
+
271
+ batch = ExampleBatch.process(first_name: "a")
272
+ batch.started? # => false
273
+ batch.collection_valid? # => false
274
+ batch.collection.errors.messages # => {:first_name=>["is too short (minimum is 2 characters)"]}
275
+ ```
276
+
277
+ #### ActiveJob
278
+
279
+ When `.process` is called on a Batch, `.execute` is called on the `Processor` specified in the Batch's definition.
280
+
281
+ Unless otherwise specified a **Batch** assumes its Job class shares a common name.
282
+
283
+ Ex: `FooBarBazBatch` assumes there is a defined `FooBarBazJob`.
284
+
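The naming convention can be sketched in plain Ruby. This only illustrates the convention and is not how the gem necessarily resolves the class; `default_job_class_name` is a hypothetical helper.

```ruby
# Hypothetical helper illustrating the Batch -> Job naming convention.
# It is not part of BatchProcessor's API.
def default_job_class_name(batch_class_name)
  # Swap a trailing "Batch" suffix for "Job"; names without the suffix are left untouched.
  batch_class_name.sub(/Batch\z/, "Job")
end

default_job_class_name("FooBarBazBatch") # => "FooBarBazJob"
```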
285
+ If you want to customize this behavior, define the job class explicitly:
286
+
287
+ ```ruby
288
+ class ExampleBatch < ApplicationBatch
289
+ process_with_job SomeOtherJob
290
+ end
291
+ ```
292
+
293
+ ##### Retries
294
+
295
+ BatchProcessor is designed to work with ActiveJob's built-in retries.
296
+
297
+ Any job with a valid retry strategy will be allowed to exhaust all of its attempts before it is considered failed.
298
+
299
+ When a job raises with retries remaining, the batch essentially "ignores" it ever ran, which allows it to be retried.
300
+
301
+ To keep track of how often these handled failures are happening, the batch keeps a running tally of total retries.
302
+
303
+ ```ruby
304
+ batch = ApplicationBatch.find(batch_id)
305
+ batch.details.total_retries_count # => 15
306
+ ```
307
+
308
+ In this example the `15` count could mean any number of things:
309
+
310
+ 1. A single job raised, and was retried, 15 times.
311
+ 2. 3 jobs raised and were retried their maximum of 5 times before failing.
312
+ 3. 5 jobs raised and were retried their maximum of 3 times before failing.
313
+ 4. 15 different jobs all raised once and were retried, all of which were successful.
314
+ 5. 13 different jobs all raised once, and one of them failed twice more on top of that, before finishing successfully.
315
+
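The ambiguity is easy to see with a little arithmetic. In the sketch below, each array holds made-up per-job retry counts for one of the five scenarios; every scenario produces the same total, so the total alone cannot tell them apart.

```ruby
# Each array lists hypothetical per-job retry counts for one scenario above.
scenarios = {
  "one job retried 15 times"                 => [15],
  "3 jobs retried 5 times each"              => [5] * 3,
  "5 jobs retried 3 times each"              => [3] * 5,
  "15 jobs retried once each"                => [1] * 15,
  "12 jobs retried once, one job three times" => [1] * 12 + [3]
}

# Summing per scenario mimics what a single running tally would record.
totals = scenarios.transform_values(&:sum)
totals.values.uniq # => [15]
```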
316
+ Because of the wide variety of cases this covers, the batch cannot, and does not try to, make decisions based on this count.
317
+
318
+ Instead, this information is tracked to provide developers with some introspection as to the behavior of the batch.
319
+
320
+ Ideally, the final state of the batch combined with the retry information and server logs should allow you to determine what happened.
321
+
322
+ 💡 **Note**: Batch failure is only triggered after **all** retries are exhausted for the job.
323
+
324
+ #### Details
325
+
326
+ The **Details** of a batch are the times of critical lifecycle events and the summary counts of processed jobs.
327
+
328
+ ```ruby
329
+ batch = ExampleBatch.process
330
+ details = batch.details
331
+
332
+ details.started_at # => 2019-07-25 12:13:44 UTC
333
+ details.size # => 1
334
+ details.pending_jobs_count # => 1
335
+ details.to_h # => {"class_name"=>"ExampleBatch", "started_at"=>"2019-07-25 08:13:44 -0400", "size"=>"1", "pending_jobs_count"=>"1"}
336
+ ```
337
+
338
+ The details object is built with [RedisHash](https://github.com/Freshly/spicerack/tree/master/redis_hash), which works like a plain old Ruby Hash but automatically makes calls to fetch its data.
339
+
340
+ ⚠️ **Warning**: This hash is **NOT** cached so each method call makes a `Redis` call! `#FeatureNotABug`
341
+
342
+ ```ruby
343
+ batch = ExampleBatch.process
344
+ details = batch.details
345
+
346
+ details.pending_jobs_count # => 3
347
+
348
+ # rake resque:work in another window...
349
+
350
+ details.pending_jobs_count # => 2
351
+ details.pending_jobs_count # => 1
352
+ ```
353
+
354
+ ##### Detail Methods
355
+
356
+ | Name | Type | Description |
357
+ | --------------------- | -------- | ------------------------------------------ |
358
+ | batch_id | String | The unique ID of the batch's instance. |
359
+ | class_name | String | The name of the batch's class. |
360
+ | started_at | DateTime | When processing began on the batch. |
361
+ | enqueued_at | DateTime | `[Parallel]` When all jobs were enqueued. |
362
+ | aborted_at | DateTime | When `#abort!` was called on the batch. |
363
+ | cleared_at | DateTime | When `#clear!` was called on the batch. |
364
+ | finished_at | DateTime | When processing finished on the batch. |
365
+ | size | Number | Count of items in the batch's collection. |
366
+ | enqueued_jobs_count | Number | `[Parallel]` Count of the jobs enqueued. |
367
+ | pending_jobs_count | Number | Count of jobs waiting to be performed. |
368
+ | running_jobs_count | Number | Count of jobs currently being performed. |
369
+ | successful_jobs_count | Number | Count of jobs performed successfully. |
370
+ | failed_jobs_count | Number | Count of jobs which raised errors. |
371
+ | canceled_jobs_count | Number | Count of jobs NOT performed from `abort`. |
372
+ | cleared_jobs_count | Number | Count of missing jobs flushed by `clear`. |
373
+ | total_retries_count | Number | Total count of retry attempts by all jobs. |
374
+ | unfinished_jobs_count | Number | Current count of jobs pending and running. |
375
+ | finished_jobs_count | Number | Current count of jobs already performed. |
376
+ | total_jobs_count | Number | Count of jobs (which should equal `size`). |
377
+
378
+ #### Status
379
+
380
+ The **Status** of a batch is manifested by a collection of predicates which track certain lifecycle events.
381
+
382
+ ```ruby
383
+ batch = ExampleBatch.process
384
+ batch.started? # => true
385
+ batch.enqueued? # => false
386
+ batch.aborted? # => false
387
+ batch.finished? # => true
388
+
389
+ batch.enqueued_jobs? # => false
390
+ batch.finished_jobs? # => true
391
+ ```
392
+
393
+ ##### Status Methods
394
+
395
+ | Name | Description |
396
+ | ----------------- | ----------------------------------------------- |
397
+ | started? | True if `started_at` is defined for the batch. |
398
+ | enqueued? | True if `enqueued_at` is defined for the batch. |
399
+ | aborted? | True if `aborted_at` is defined for the batch. |
400
+ | cleared? | True if `cleared_at` is defined for the batch. |
401
+ | finished? | True if `finished_at` is defined for the batch. |
402
+ | enqueued_jobs? | True if `enqueued_jobs_count > 0`. |
403
+ | pending_jobs? | True if `pending_jobs_count > 0`. |
404
+ | running_jobs? | True if `running_jobs_count > 0`. |
405
+ | failed_jobs? | True if `failed_jobs_count > 0`. |
406
+ | canceled_jobs? | True if `canceled_jobs_count > 0`. |
407
+ | unfinished_jobs? | True if `unfinished_jobs_count > 0`. |
408
+ | finished_jobs? | True if `finished_jobs_count > 0`. |
409
+ | collection_valid? | True if all the Collection's validations pass. |
410
+ | processing? | True if started, unfinished, and not aborted. |
411
+
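The table can be read as a set of rules over the details. The sketch below models a few of them with a plain Hash in place of Redis. It is not the gem's implementation: `SketchStatus` is a hypothetical class, and it assumes "unfinished" in `processing?` means `finished_at` has not been set.

```ruby
# Toy model of some predicates above, driven by a plain Hash instead of Redis.
class SketchStatus
  def initialize(details)
    @details = details
  end

  # Timestamp predicates: true once the lifecycle event has been recorded.
  %i[started enqueued aborted cleared finished].each do |event|
    define_method("#{event}?") { !@details["#{event}_at"].nil? }
  end

  # Counter predicates: true while the matching count is above zero.
  # Values are converted with to_i because stored counts may be strings.
  %i[pending running failed].each do |kind|
    define_method("#{kind}_jobs?") { @details.fetch("#{kind}_jobs_count", 0).to_i > 0 }
  end

  # Assumption: "unfinished" means finished_at has not been set.
  def processing?
    started? && !finished? && !aborted?
  end
end

status = SketchStatus.new("started_at" => "2019-07-25 12:13:44 UTC", "pending_jobs_count" => "2")
status.started?      # => true
status.pending_jobs? # => true
status.processing?   # => true
```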
412
+ #### Callbacks
413
+
414
+ Batches have a status which is driven by the jobs it is processing. Callbacks are fired in response to status changes.
415
+
416
+ ```ruby
417
+ class ExampleBatch < ApplicationBatch
418
+ class Collection < BatchCollection
419
+ def items
420
+ [ SecureRandom.hex ]
421
+ end
422
+ end
423
+
424
+ on_batch_started { SlackClient.send_message("Batch started!") }
425
+ on_batch_finished { SlackClient.send_message("Batch finished!") }
426
+
427
+ on_batch_aborted :handle_batch_aborted, unless: -> { Business.during_business_hours? }
428
+ on_batch_cleared :handle_batch_cleared, if: :important?
429
+
430
+ def important?
431
+ batch_id.include?("vip")
432
+ end
433
+
434
+ def handle_batch_aborted
435
+ EmailClient.send_email("management@business.engineering", "Unexpected batch abort!", batch_id)
436
+ end
437
+
438
+ def handle_batch_cleared
439
+ EmailClient.send_email("developers@business.engineering", "Crazy stuff happened!", details.to_h)
440
+ end
441
+ end
442
+ ```
443
+
444
+ ##### Callback Methods
445
+
446
+ | Name | Triggered when... |
447
+ | ----------------- | --------------------------------------------------- |
448
+ | on_batch_started | The batch is started. |
449
+ | on_batch_enqueued | `[Parallel]` All batch jobs are enqueued. |
450
+ | on_batch_aborted | The batch is aborted. |
451
+ | on_batch_cleared | The batch is cleared. |
452
+ | on_batch_finished | The batch is finished. |
453
+ | on_job_enqueued | A batch job is enqueued. |
454
+ | on_job_running | A batch job begins performing. |
455
+ | on_job_success | A batch job is successfully performed. |
456
+ | on_job_failure | A batch job raises an error being performed. |
457
+ | on_job_retried | A batch job is retried rather than failing. |
458
+ | on_job_canceled | A batch job skips perform after a batch is aborted. |
459
+
460
+ ### Processors
461
+
462
+ A **Processor** is a service object which determines how to perform a Batch's jobs to properly process its collection.
463
+
464
+ Unless otherwise specified a **Batch** uses the `default` **Parallel** Processor.
465
+
466
+ ```ruby
467
+ class DefaultBatch < ApplicationBatch; end
468
+ DefaultBatch.processor_class # => BatchProcessor::Processors::Parallel
469
+
470
+ class ExampleBatch < ApplicationBatch
471
+ with_sequential_processor
472
+ end
473
+ ExampleBatch.processor_class # => BatchProcessor::Processors::Sequential
474
+
475
+ class OtherBatch < ApplicationBatch
476
+ with_parallel_processor
477
+ end
478
+ OtherBatch.processor_class # => BatchProcessor::Processors::Parallel
479
+ ```
480
+
481
+ The default processors can be redefined and new [custom processors](#custom-processors) can be added as well.
482
+
483
+ Create a `config/initializers/batch_processor.rb` to define these:
484
+
485
+ ```ruby
486
+ # Make sequential processor the default
487
+ ApplicationBatch::PROCESSOR_CLASS_BY_STRATEGY[:default] = BatchProcessor::Processors::Sequential
488
+ ```
489
+
490
+ Certain processors have configurable options; this configuration is specified in the Batch's definition.
491
+
492
+ ```ruby
493
+ class ExampleBatch < ApplicationBatch
494
+ with_sequential_processor
495
+ processor_option :continue_after_exception, true
496
+ end
497
+ ```
498
+
499
+ BatchProcessor comes with two standard processors: **Parallel** and **Sequential**.
500
+
501
+ #### Parallel Processor
502
+
503
+ ![parallel](docs/images/parallel-processor.png)
504
+
505
+ The Parallel Processor enqueues jobs to be performed later.
506
+
507
+ #### Sequential Processor
508
+
509
+ ![sequential](docs/images/sequential-processor.png)
510
+
511
+ The Sequential Processor uses `.perform_now` to procedurally process each job within the current thread.
512
+
513
+ ⚠️ **WARNING**: Using a sequential processor disables job retries in a batch **even if they are defined and valid**!
514
+
515
+ ##### Processor Options
516
+
517
+ | Name | Description |
518
+ | -------------------------- | ------------------------------------------- |
519
+ | `continue_after_exception` | If true, batch continues after job error. |
520
+ | `sorted`* | If true, `#find_each` will **not** be used. |
521
+
522
+ 💁 **HEADS UP**: `find_each` is used when possible, which ignores `order`; the flag only forces `#each`.
523
+
524
+ ### Jobs
525
+
526
+ BatchProcessor depends on ActiveJob for handling the processing of individual items in a collection.
527
+
528
+ Only a **BatchJob** can be used to perform work, but it can be run outside of a batch as well.
529
+
530
+ Therefore, the recommendation is to make `ApplicationJob` inherit from `BatchJob`.
531
+
532
+ The `rails g batch_processor:install` generator does this for you:
533
+
534
+ ```ruby
535
+ class ApplicationJob < BatchProcessor::BatchJob; end
536
+ ```
537
+
538
+ A BatchJob calls into the Batch to report on its lifecycle from start to finish, including on success and failure.
539
+
540
+ #### Handling Errors
541
+
542
+ When an error occurs in a BatchJob it will be tracked as a failure within a batch.
543
+
544
+ This is true even if a `rescue_from` handler is defined for the batch.
545
+
546
+ Intentionality is very difficult to ascertain in a topic as nuanced as error handling, so batches make some assumptions.
547
+
548
+ 1. If you define a `rescue_from`, you want to treat that exception as a batch failure BUT NOT a job failure.
549
+ 1. If you define a `rescue` in the `perform` block, you want to treat the exception as NEITHER a batch NOR job failure.
550
+ 1. If you define no rescue of any kind, you want to treat that exception as BOTH a batch AND a job failure.
551
+
552
+ Because `BatchProcessor` cannot speculate on your intent, it does not attempt to control your application's error handling.
553
+
554
+ Instead, it only brings this incredibly dire warning:
555
+
556
+ ⚠️ **WARNING**: You should never **EVER** "manually retry" a batch job! This can mess up the counter!
557
+
558
+ Defining a valid retry strategy within the job is the **ONLY** way to handle retries of a batch job!
559
+
560
+ If you attempt to manually re-enqueue a batch job from your queue processor's failed queue, you **WILL** have a bad time.
561
+
562
+ Instead, you should always follow the [Troubleshooting](#troubleshooting) guide to handle exceptional failures.
563
+
564
+ 👍 **NOTE**: It is considered a "best practice" to define error handling for all your jobs, batchable or otherwise!
565
+
566
+ ## Troubleshooting
567
+
568
+ Sometimes, `"weird stuff"` (this is a technical term) happens on the internet.
569
+
570
+ One example is a vanishing job:
571
+
572
+ - A job is picked off the queue and usually takes 18 seconds to process.
573
+ - 5 seconds into performing, the worker receives a `SIGTERM`.
574
+ - The worker, being Resque, decides to dirty exit instead of graceful shutdown.
575
+ - The job never completes, never is retried, never enters the queue again, and never reports status.
576
+ - The `running_jobs_count` of your batch will contain a count that never goes down.
577
+ - Because one of the jobs has not reported in, the batch will never complete.
578
+
579
+ ⚠️ **Warning**: This kind of "weird stuff" can always happen, and at scale **WILL** always happen! Be prepared!
580
+
581
+ ### Best Practice
582
+
583
+ Troubleshooting this issue will be very similar to troubleshooting any batch issues, but no two issues are fully alike.
584
+
585
+ What follows is therefore the generic "best practice" for handling any class of batch issue.
586
+
587
+ 1. Abort the Batch. This stops any new jobs from being performed and allows any enqueued jobs to flush from the workers.
588
+ 2. Damage Report. Figure out what went wrong and what needs to be cleaned up.
589
+ 3. Cleanup Fallout. Perform all the cleanup as determined in step 2.
590
+ 4. Wait. Allow time for the workers to chew through and cancel the pending jobs in your aborted batch.
591
+ 5. Clear the Batch. Manually flush any lost jobs, forcing the batch to run its completion events.
592
+
593
+ **Abort the Batch**
594
+
595
+ ```ruby
596
+ batch = ApplicationBatch.find(batch_id)
597
+ batch.abort!
598
+ ```
599
+
600
+ **Damage Report**
601
+
602
+ 💡 **Note**: By the nature of async processing your jobs can (and likely will, given enough workers) fail at every line:
603
+
604
+ ```ruby
605
+ class ExampleJob < ApplicationJob
606
+ def perform(order)
607
+ raise NotProcessing unless order.payment_processing?
608
+
609
+ order.mark_charge_starting!
610
+
611
+ charge_service = ChargeService.new(order)
612
+ charge_result = charge_service.charge!
613
+
614
+ if charge_result.success?
615
+ order.mark_charge_success!
616
+ else
617
+ order.mark_payment_failed!
618
+ end
619
+ end
620
+ end
621
+ ```
622
+
623
+ In this example, if you had, say, 30 workers processing your batch, you could expect to see the following issues:
624
+
625
+ - Orders which were taken off the queue, marked as running, and then never passed the guard clause.
626
+ - Orders which were marked that the charge was starting, but the service was never instantiated.
627
+ - 😱 Orders which were submitted and a customer's money was taken, but your application has no record of that!
628
+ - Orders submitted and a customer did not have funds available, but the application has no record of that EITHER!!
629
+ - We get the response, but are not capable of reporting success about the charge in the database.
630
+ - We actually record success in the database but the job cannot report itself as having completed to the batch!
631
+
632
+ 💁‍ **The Rule of Law**: For every `N` lines of code in your job, you create **at least** `N+2` unique problems. 😬
633
+
634
+ ### Aborting
635
+
636
+ Batches can be **Aborted**.
637
+
638
+ ![aborting](docs/images/aborting.png)
639
+
640
+ When aborted, processing *will continue* on enqueued jobs but **those jobs will not be performed**.
641
+
642
+ Abort only prevents new jobs from being performed, as this is less disruptive (and much easier) than queue flushing.
643
+
644
+ When a job is skipped because of an aborted batch, it reports itself as **canceled**.
645
+
646
+ ```ruby
647
+ batch = ApplicationBatch.find(some_batch_id)
648
+ details = batch.details
649
+
650
+ details.performed_jobs_count # => 7
651
+ details.performed_jobs_count # => 8
652
+ details.canceled_jobs_count # => 0
653
+
654
+ batch.abort!
655
+
656
+ details.performed_jobs_count # => 8
657
+ details.canceled_jobs_count # => 1
658
+ details.canceled_jobs_count # => 2
659
+ ```
660
+
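The cancel-on-abort behavior can be modeled with a toy batch. This is a sketch under stated assumptions rather than the gem's code: `SketchBatch` and its `run_job` hook are hypothetical, and the counter names simply mirror the example above.

```ruby
# Toy batch that only tracks the two counters relevant to aborting.
class SketchBatch
  attr_reader :performed_jobs_count, :canceled_jobs_count

  def initialize
    @aborted = false
    @performed_jobs_count = 0
    @canceled_jobs_count = 0
  end

  def abort!
    @aborted = true
  end

  # Hypothetical hook: a worker calls this when it picks a job off the queue.
  def run_job
    if @aborted
      @canceled_jobs_count += 1 # skipped jobs report themselves as canceled
    else
      yield
      @performed_jobs_count += 1
    end
  end
end

batch = SketchBatch.new
2.times { batch.run_job { :work } }
batch.abort!
3.times { batch.run_job { :work } }

batch.performed_jobs_count # => 2
batch.canceled_jobs_count  # => 3
```

Nothing is removed from the queue in this model; the jobs are still picked up, they just skip their work.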
661
+ 💡 **Note**: Running jobs will complete normally if `#abort!` was called after perform began on them.
662
+
663
+ #### Clearing
664
+
665
+ Because clearing is a manual process only to be used in exceptional circumstances, it **requires** the batch be aborted.
666
+
667
+ In these cases, after a developer intervenes to assess the impact of the failure, the batch can be manually cleared.
668
+
669
+ ```ruby
670
+ batch = ApplicationBatch.find(some_batch_id)
671
+ details = batch.details
672
+
673
+ details.size # => 10
674
+ details.pending_jobs_count # => 2
675
+ details.running_jobs_count # => 2
676
+ details.finished_jobs_count # => 6
677
+ details.cleared_jobs_count # => 0
678
+
679
+ batch.clear!
680
+
681
+ details.running_jobs_count # => 0
682
+ details.pending_jobs_count # => 0
683
+ details.cleared_jobs_count # => 4
684
+ ```
685
+
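The counter movement in the example above can be reproduced with a small Struct. This is only a sketch: `SketchCounts` is hypothetical, and it assumes clearing moves every pending and running job into the cleared tally, which is what the numbers in the example suggest.

```ruby
# Toy counters mirroring the example above; not the gem's implementation.
SketchCounts = Struct.new(:pending, :running, :finished, :cleared, keyword_init: true) do
  # Assumption: clearing moves every pending and running job into the cleared tally.
  def clear!
    self.cleared += pending + running
    self.pending = 0
    self.running = 0
    self
  end

  # All four buckets together should still account for every item in the batch.
  def total
    pending + running + finished + cleared
  end
end

counts = SketchCounts.new(pending: 2, running: 2, finished: 6, cleared: 0)
counts.clear!
counts.cleared # => 4
counts.total   # => 10
```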
686
+ 💡 **Note**: Calling `#clear!` on a batch will trigger the batch completion events and finish the batch.
687
+
688
+ There is no use case to `#clear!` an in-flight batch; doing so is incredibly disruptive and will corrupt the counts.
689
+
690
+ ### Monitor Job
691
+
692
+ Because of the nature of `"weird stuff"` that can happen in processing, it's highly encouraged you add monitoring.
693
+
694
+ The most common kind of monitoring that supports operations like this well is the "dead man's switch".
695
+
696
+ [Dead Man's Snitch](https://deadmanssnitch.com) is an example of a monitoring service that can help!
697
+
698
+ You could also construct a batch monitor job that works for any / all batches and can be enqueued alongside:
699
+
700
+ ```ruby
701
+ class BatchMonitorJob < ApplicationJob
702
+ queue_as :a_higher_priority_than_any_jobs
703
+
704
+ def perform(batch_id)
705
+ batch = ApplicationBatch.find(batch_id)
706
+ batch.abort! unless batch.finished?
707
+ end
708
+ end
709
+ ```
710
+
711
+ With a job like this, you can add a monitor to any batch:
712
+
713
+ ```ruby
714
+ batch_id = SecureRandom.hex
715
+ SomeImportantBatch.process(batch_id: batch_id)
716
+ BatchMonitorJob.set(wait: 20.minutes).perform_later(batch_id)
717
+ ```
718
+
719
+ There are *several* important notes here.
720
+
721
+ **Delayed Jobs**
722
+
723
+ You must have configured ActiveJob with a queue processor that supports delayed jobs, and any extra hoops therein.
724
+
725
+ 💁 **Example**: On Resque, a second worker and suite of gems is required to support `.set(wait: 20.minutes)`!
726
+
727
+ Please make sure to consult your given ActiveJob queue processor's own documentation for supporting delayed jobs.
728
+
729
+ **Delay**
730
+
731
+ There is no such thing as "normal" for what to expect for processing time. You can set expectations using the following:
732
+
733
+ ```text
734
+ good_timeout = ((number_of_jobs * average_time_per_job) / average_number_of_workers_available) + rand(5).minutes
735
+ ```
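The formula above can be sketched in plain Ruby. This is only an illustration: `estimated_timeout_seconds` and its keyword arguments are hypothetical names that are not part of BatchProcessor, and the result is in seconds rather than an `ActiveSupport::Duration`.

```ruby
# Hypothetical helper (not part of BatchProcessor): estimates how long to wait,
# in seconds, before a monitor job should check on a batch.
def estimated_timeout_seconds(number_of_jobs:, average_seconds_per_job:, average_workers:, jitter_minutes: rand(5))
  raise ArgumentError, "average_workers must be positive" unless average_workers.positive?

  # Total work divided across the workers expected to be available.
  processing_seconds = (number_of_jobs * average_seconds_per_job).fdiv(average_workers)

  # Add the random padding of 0 to 4 minutes from the formula, then round up.
  (processing_seconds + (jitter_minutes * 60)).ceil
end

# 1,000 jobs at 3 seconds each across 5 workers, with 2 minutes of padding:
estimated_timeout_seconds(number_of_jobs: 1_000, average_seconds_per_job: 3, average_workers: 5, jitter_minutes: 2)
# => 720
```

Passing `jitter_minutes` explicitly keeps the estimate deterministic, which is convenient in tests. Leaving it out falls back to `rand(5)` as in the formula.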
736
+
737
+ For some batches, you may expect 10 minutes to be more than sufficient to process the whole workload.
738
+
739
+ For other batches, you may expect 2 hours to be a very conservative and aggressive estimate for completion.
740
+
741
+ Sometimes, the 10-minute batch could take over 2 hours if it is enqueued while other jobs are consuming resources.
742
+
743
+ Because of the great variability here, you need to assess the differences of your own environment and adapt accordingly.
744
+
745
+ **Queue Priority**
746
+
747
+ Delayed jobs get enqueued after their delay at the REAR of the queue.
748
+
749
+ As such, your monitor **must** have a higher priority than any of the jobs it's meant to monitor!
750
+
751
+ Otherwise it will be enqueued and processed after all the jobs finish, which defeats the purpose of this!
752
+
753
+ Because queueing theory is hard, and there are several approaches but no great standard, you're on your own here.
754
+
755
+ Sorry, and godspeed!
756
+
757
+ **Notifications**
758
+
759
+ This automation is only a "best practice" for keeping things more or less clean.
760
+
761
+ Anytime an abort happens, developers likely need to look into why, as it should be considered an exceptional event.
762
+
763
+ As such, you are encouraged to set up notifications and alerting to communicate these events to your staff.
764
+
765
+ There are lots of solutions for how to set up this kind of monitoring, like using an internal mailer:
766
+
767
+ ```ruby
768
+ class ApplicationBatch < BatchProcessor::BatchBase
769
+ on_batch_aborted { EngineeringMailer.batch_timeout(batch: self) }
770
+ end
771
+ ```
772
+
773
+ ### Monitor Cron
774
+
775
+ Another alternative, which inherits different constraints, is setting up a cron task to monitor your batches.
776
+
777
+ Again, depending on the specifics of your environment and needs, there are several different solutions here as well.
778
+
779
+ Let's assume you already have invested in a cron solution like the [clockwork gem](https://rubygems.org/gems/clockwork).
780
+
781
+ If you know, for instance, that your charge batch was certainly supposed to have started by 8PM, write that:
782
+
783
+ ```ruby
784
+ every(1.day, 'charge_start.job', at: '20:00') do
785
+ batch_id = "charge-batch-for-#{Date.today}"
786
+ begin
787
+ batch = ApplicationBatch.find(batch_id)
788
+ EngineeringMailer.charge_batch_not_started(batch_id) unless batch.started?
789
+ rescue StandardError => exception
790
+ EngineeringMailer.charge_batch_not_found(exception)
791
+ end
792
+ end
793
+ ```
794
+
795
+ Likewise, if by 10PM it needs to be finished or you want it aborted, write that:
796
+
797
+ ```ruby
798
+ every(1.day, 'charge_finish.job', at: '22:00') do
799
+ batch_id = "charge-batch-for-#{Date.today}"
800
+ begin
801
+ batch = ApplicationBatch.find(batch_id)
802
+ # Assuming you have some kind of generic `on_batch_aborted` that handles notifications
803
+ batch.abort! unless batch.completed?
804
+ rescue StandardError => exception
805
+ EngineeringMailer.charge_batch_not_found(exception)
806
+ end
807
+ end
808
+ ```
809
+
810
+ ## Testing
811
+
812
+ If you plan on writing `RSpec` tests, `BatchProcessor` comes packaged with some custom matchers.
813
+
814
+ ### Testing Setup
815
+
816
+ Add the following to your spec/rails_helper.rb file:
817
+
818
+ ```ruby
819
+ require "batch_processor/spec_helper"
820
+ ```
821
+
822
+ BatchProcessor works best with [shoulda-matchers](https://github.com/thoughtbot/shoulda-matchers) and [rspice](https://github.com/Freshly/spicerack/tree/master/rspice).
823
+
824
+ Add them to the development and test group of your Gemfile:
825
+
826
+ ```ruby
827
+ group :development, :test do
828
+ gem "shoulda-matchers", git: "https://github.com/thoughtbot/shoulda-matchers.git", branch: "rails-5"
829
+ gem "rspice"
830
+ end
831
+ ```
832
+
833
+ Then run `bundle install` and add the following into `spec/rails_helper.rb`:
834
+
835
+ ```ruby
836
+ require "rspec/rails"
837
+ require "rspice"
838
+ require "batch_processor/spec_helper"
839
+
840
+ # Configuration for the shoulda-matchers gem
841
+ Shoulda::Matchers.configure do |config|
842
+ config.integrate do |with|
843
+ with.test_framework :rspec
844
+ with.library :rails
845
+ end
846
+ end
847
+ ```
848
+
849
+ This will allow you to use the following custom matchers:
850
+
851
+ * [set_processor_option](lib/batch_processor/rspec/custom_matchers/set_processor_option.rb) tests usages of batches specifying processor options.
852
+ * [use_default_job_class](lib/batch_processor/rspec/custom_matchers/use_default_job_class.rb) tests usages of batches which do not explicitly specify a job.
853
+ * [use_default_processor](lib/batch_processor/rspec/custom_matchers/use_default_processor.rb) tests usages of batches which do not explicitly specify a processor.
854
+ * [use_job_class](lib/batch_processor/rspec/custom_matchers/use_job_class.rb) tests usages of batches which explicitly specify a job.
855
+ * [use_parallel_processor](lib/batch_processor/rspec/custom_matchers/use_parallel_processor.rb) tests usages of `.with_parallel_processor`.
856
+ * [use_sequential_processor](lib/batch_processor/rspec/custom_matchers/use_sequential_processor.rb) tests usages of `.with_sequential_processor`.
857
+
858
+ There are also some internal matchers added:
859
+
860
+ * [use_batch_processor_strategy](lib/batch_processor/rspec/custom_matchers/use_batch_processor_strategy.rb) is used to DRY out the similarities between the other batch processor matchers.
861
+
862
+ ### Testing Batches
863
+
864
+ The best way to test a Batch is with an integration test.
865
+
866
+ The easiest way to test a Batch is with a unit test.
867
+
868
+ Batches are generated with the following RSpec template:
869
+
870
+ ```ruby
871
+ # frozen_string_literal: true
872
+
873
+ require "rails_helper"
874
+
875
+ RSpec.describe FooBatch, type: :batch do
876
+ subject { described_class }
877
+
878
+ it { is_expected.to inherit_from BatchProcessor::BatchBase }
879
+
880
+ # it { is_expected.to use_sequential_processor }
881
+ # it { is_expected.to use_parallel_processor }
882
+
883
+ # it { is_expected.to be_allow_empty }
884
+
885
+ # it { is_expected.to use_default_job_class }
886
+ # it { is_expected.to use_job_class OtherJob }
887
+
888
+ # it { is_expected.to set_processor_option :continue_after_exception, true }
889
+ # it { is_expected.to set_processor_option :sorted, true }
890
+ # it { is_expected.not_to be_allow_empty }
891
+
892
+ describe FooBatch::Collection, type: :batch_collection do
893
+ subject { described_class.new }
894
+
895
+ it { is_expected.to inherit_from BatchProcessor::BatchBase::BatchCollection }
896
+ # it { is_expected.to define_argument :arg, allow_nil: false }
897
+ # it { is_expected.to define_option :opt, default: 3 }
898
+ end
899
+ end
900
+ ```
901
+
902
+ #### Testing Collections
903
+
904
+ If your Collections are complicated enough that you want to put them into a separate file, they are **too** complicated.
905
+
906
+ Collections are expected to be incredibly straightforward objects with minimal validations and logic.
907
+
908
+ Highly maintainable collections should essentially be input sanity checks around something like an `ActiveRecord` scope:
909
+
910
+ ```ruby
911
+ class Collection < BatchCollection
912
+ argument :charge_date
913
+
914
+ validate :charge_date, date: { is_today_or_future: true }
915
+
916
+ def items
917
+ Orders.to_charge_on(charge_date).payment_pending
918
+ end
919
+ end
920
+ ```
921
+
922
+ This will allow you to keep this class slim, which is the intent!
923
+
924
+ ### Testing Jobs
925
+
926
+ BatchProcessor, though heavily reliant on jobs, does not include anything special or specific to test them.
927
+
928
+ Any job descending from `BatchProcessor::BatchableJob` (which has `ActiveJob::Base` as its parent) is batchable.
929
+
930
+ You can **and should** test your jobs, but there is nothing "special" about them.
931
+
932
+ Ideally, if you're starting with a collection of well-built jobs, they should work nearly effortlessly here.
933
+
934
+ Admittedly "well-built" is subjective, but taken here to mean "clearly defined error handling and one responsibility".
935
+
936
+ Generally speaking, the same suggestions for testing collections apply to `ActiveJob`.
937
+
938
+ 🤓 **HUMBLE OBSERVATION**: The best jobs are slim wrappers around clearly defined services.
939
+
940
+ ```ruby
941
+ class ExampleJob < ApplicationJob
942
+ def perform(id)
943
+ the_thing = Thing.find(id)
944
+ raise Thing::Locked if the_thing.locked?
945
+
946
+ the_thing.lock!
947
+ info :locked_thing, the_thing: the_thing
948
+
949
+ the_stuff = Stuff.for(the_thing)
950
+ info :got_stuff, the_stuff: the_stuff
951
+
952
+ the_thing.make_do(the_stuff)
953
+ info :made_the_thing_do_the_stuff
954
+ end
955
+ end
956
+
957
+ # Executed with...
958
+ ExampleJob.perform_later(the_thing.id)
959
+ ```
960
+
961
+ You should always move all that stuff into a service object:
962
+
963
+ ```ruby
964
+ class ExampleJob < ApplicationJob
965
+ def perform(the_thing)
966
+ DoTheStuffService.for(the_thing)
967
+ end
968
+ end
969
+
970
+ # Let GlobalID take care of this! Don't reinvent wheels!
971
+ ExampleJob.perform_later(the_thing)
972
+ ```
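As a rough sketch of what such a service object could look like, here is a plain-Ruby version that runs without Rails. Everything in it is an assumption made for illustration: `Thing` is a minimal stand-in for the model used above, the private `stuff` method stands in for `Stuff.for(the_thing)`, and the `info` logging calls are omitted.

```ruby
# Hypothetical stand-in for the model, so the sketch runs without Rails.
class Thing
  Locked = Class.new(StandardError)

  attr_reader :done_with

  def initialize
    @locked = false
  end

  def locked?
    @locked
  end

  def lock!
    @locked = true
  end

  def make_do(stuff)
    @done_with = stuff
  end
end

# The job's former body now lives in one small, directly testable class.
class DoTheStuffService
  def self.for(the_thing)
    new(the_thing).call
  end

  def initialize(the_thing)
    @the_thing = the_thing
  end

  def call
    raise Thing::Locked if @the_thing.locked?

    @the_thing.lock!
    @the_thing.make_do(stuff)
  end

  private

  # Stand-in for `Stuff.for(the_thing)` in the original job.
  def stuff
    :the_stuff
  end
end
```

With the logic in a service, the error handling (here, `Thing::Locked`) can be unit tested without enqueuing anything, and the job spec only needs to confirm that the service is called.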
973
+
974
+ If this feels right to you, check out [flow](https://github.com/Freshly/flow). You'll like what you see, I guarantee it.
975
+
976
+ ### Integration Testing
977
+
978
+ Effective integration testing for batches requires you to configure `ActiveJob` for testing.
979
+
980
+ There are lots of solutions to this puzzle, so you're expected to pick your own poison in that regard.
981
+
982
+ Once you have everything configured to effectively unit test jobs, you can confirm the behavior of your batch.
983
+
984
+ A comprehensive suite of integration tests for a batch will cover three contexts:
985
+
986
+ 1) When the batch finishes successfully.
987
+ 2) When the batch finishes with some errors.
988
+ 3) When you manually intervene with the batch.
989
+
990
+ These are basically the only circumstances that you will actually encounter in the real world, so you should test them.
991
+
992
+ Writing general purpose handlers for batch aborts and clears will save you a lot of trouble and excess testing!
993
+
994
+ Pretty much every manual intervention in a batch will elicit a "stuff is on fire" response from the team anyway.
995
+
996
+ ## Custom Processors
997
+
998
+ You are able to define your own custom processors and use them with the batch processor.
999
+
1000
+ The following example is incredibly contrived for the purposes of demonstration:
1001
+
1002
+ Let's say you wanted a `NoBobProcessor` which enqueued jobs for anyone unless they had `Bob` in their name.
1003
+
1004
+ First, create an `app/batch_processors` directory. (really it can be in any folder, but why not be explicit?)
1005
+
1006
+ Then create your new `NoBobProcessor` class which is a descendant of `BatchProcessor::ProcessorBase`.
1007
+
1008
+ Generally speaking, when defining a processor you only need to define one method: `#process_collection_item`
1009
+
1010
+ This method is called with each and every item from a Batch's `Collection` and the processor decides what to do with it.
1011
+
1012
+ In our example, we will add a string-matching guard clause to exclude the `Bob`s of the world from processing.
1013
+
1014
+ When writing processors, it's always best to assume a generic case.
1015
+
1016
+ For now, let's assume each item is either a string representing a name or an object with a `#name` property to check.
1017
+
1018
+ ```ruby
1019
+ class NoBobProcessor < BatchProcessor::ProcessorBase
1020
+ # Required for parallel processors to keep accurate and expected reporting
1021
+ set_callback(:collection_processed, :after) { batch.enqueued }
1022
+
1023
+ def process_collection_item(item)
1024
+ return if for_a_bob?(item)
1025
+
1026
+ job = batch.job_class.new(item)
1027
+ job.batch_id = batch.batch_id
1028
+ job.enqueue
1029
+ end
1030
+
1031
+ private
1032
+
1033
+ def for_a_bob?(item)
1034
+ name = item.name if item.respond_to?(:name)
1035
+ name ||= item if item.is_a?(String)
1036
+ raise ArgumentError, "Unknown item: #{item}" if name.nil?
1037
+
1038
+ name.include?("Bob")
1039
+ end
1040
+ end
1041
+ ```
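The guard clause can be exercised on its own, outside the processor. The sketch below copies the `for_a_bob?` logic into a top-level method and uses a hypothetical `Person` struct as the "object with a `#name` property". Neither the struct nor the top-level method is part of BatchProcessor.

```ruby
# Hypothetical item type with a `#name` property.
Person = Struct.new(:name)

# Same logic as the private `for_a_bob?` method above, extracted for illustration.
def for_a_bob?(item)
  name = item.name if item.respond_to?(:name)
  name ||= item if item.is_a?(String)
  raise ArgumentError, "Unknown item: #{item}" if name.nil?

  name.include?("Bob")
end

for_a_bob?("Bob Smith")             # => true
for_a_bob?(Person.new("Alice Doe")) # => false

begin
  for_a_bob?(42)
rescue ArgumentError => error
  error.message # => "Unknown item: 42"
end
```

Raising on unknown items means a collection of an unexpected type fails loudly instead of silently enqueuing jobs for every `Bob`.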
1042
+
1043
+ Then, it needs to be registered as a processing strategy so batches can utilize it.
1044
+
1045
+ To define it, create or edit a `config/initializers/batch_processor.rb` file and add the following line:
1046
+
1047
+ ```ruby
1048
+ ApplicationBatch::PROCESSOR_CLASS_BY_STRATEGY[:no_bobs] = NoBobProcessor
1049
+ ```
1050
+
1051
+ This will enable you to specify this processor within your batch:
1052
+
1053
+ ```ruby
1054
+ class ChargeNoBobsBatch < ApplicationBatch
1055
+ with_no_bobs_processor
1056
+
1057
+ # ...
1058
+ end
1059
+ ```
1060
+
1061
+ Reference your new processor by the name you used to enter it into the `PROCESSOR_CLASS_BY_STRATEGY` hash.
1062
+
1063
+ You can refer to the existing [processors](lib/batch_processor/processors) for reference.
1064
+
1065
+ ### Testing Processors
1066
+
1067
+ Custom processors are best tested with unit tests and confirmed by integration tests.
1068
+
1069
+ There's a lot that can go wrong when batching lots of jobs is involved, and it really helps to have unit tests on this.
1070
+
1071
+ I can't offer more guidance on writing good unit tests for processors other than suggesting [riffing on these](spec/batch_processor/processors).
43
1072
 
44
1073
  ## Contributing
45
1074
 
46
1075
  Bug reports and pull requests are welcome on GitHub at https://github.com/Freshly/batch_processor.
47
1076
 
1077
+ ### Development
1078
+
1079
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
1080
+
1081
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
1082
+
48
1083
  ## License
49
1084
 
50
1085
  The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).