each_in_batches 0.1.0
- checksums.yaml +7 -0
- data/.gitignore +15 -0
- data/.rspec +3 -0
- data/.travis.yml +3 -0
- data/CODE_OF_CONDUCT.md +13 -0
- data/Gemfile +4 -0
- data/MIT-LICENSE +20 -0
- data/README.md +158 -0
- data/Rakefile +11 -0
- data/bin/console +14 -0
- data/bin/setup +7 -0
- data/each_in_batches.gemspec +25 -0
- data/lib/each_in_batches.rb +272 -0
- data/lib/each_in_batches/version.rb +3 -0
- metadata +112 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
+---
+SHA1:
+  metadata.gz: 28acf126d324e3a7c48d91337598cd629b8d543b
+  data.tar.gz: 2ae38eab8fd980bef8d56eebd47f24004832d67c
+SHA512:
+  metadata.gz: c23c43b348d6cc63bee83a1edf31002c56e17a5ede1adcb8a9fc546f20214b67480c9c5432f7464d0a49a0f733ed32f17d49dfbf17ab9b7ee2d2d718f143f954
+  data.tar.gz: 056a77ad490cf78fb6d1c3d847cf9a8f66fdca7794eca941becef1ffb9bf79d14c8fb0ce4a5b42742c89d6d3f8d215a68531de8afbb30ee8ded053800a73e056
data/.gitignore
ADDED
data/.rspec
ADDED
data/.travis.yml
ADDED
data/CODE_OF_CONDUCT.md
ADDED
@@ -0,0 +1,13 @@
+# Contributor Code of Conduct
+
+As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.
+
+We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, age, or religion.
+
+Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.
+
+Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team.
+
+Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers.
+
+This Code of Conduct is adapted from the [Contributor Covenant](http://contributor-covenant.org), version 1.0.0, available at [http://contributor-covenant.org/version/1/0/0/](http://contributor-covenant.org/version/1/0/0/)
data/Gemfile
ADDED
data/MIT-LICENSE
ADDED
@@ -0,0 +1,20 @@
+Copyright (c) 2008 Peter H. Boling
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,158 @@
+# EachInBatches
+
+(Originally BolingForBatches)
+
+### NOTE:
+I am resurrecting this code because I still have this recurring need, and Rails' native batching doesn't cut the mustard.
+It is some of my most ancient code, and it isn't pretty, but I hope to improve it over time.
+
+I often need to execute really large computations on really large data sets.
+I usually end up writing a rake task to do it, which calls methods in my models.
+But something about the process bugged me. Each time I had to re-implement my
+'batching code' that allowed me to not chew up GB after GB of memory due to
+klass.find(:all, :include => [:everything_under_the_sun]). Re-implementation of
+the same logic over and over across many projects is not very DRY, so I got out
+my blow torch and lit it up. The difficulty was that the part that was different
+each time I batched was at the center of the code, right in the middle of the
+batch loop. But I didn't let that stop me!
+
+## Why this plugin is way better than standard Rails batching
+1. I've been doing batching in Rails a lot longer than Rails has.
+2. Metrics. I measure stuff.
+3. I can batch from the top down (a.k.a. backwards), making it possible to DELETE things in batches.
+    A. If you've never tried using the built-in rails batching for deleting millions of records... don't start now. Use this gem instead.
+4. Exception Handling. Exceptions occurring within the batching can be rescued, in a customizable fashion, which means that the process doesn't need to die on batch 309,675 of 402,540.
+5. Merged in the EachInBatches fork (from Brian Kidd):
+    I needed to iterate over the results and perform more actions than a single
+    method would provide. I didn't want to write a method in my app that performed
+    the needed functionality as I felt the plugin should support this directly.
+    I modified the original plugin so that it takes a block instead of a method.
+    It will pass the object instance to the block. It works pretty much the same
+    as Class.find(:all).each {|x| do something}, except in batches of size n that you
+    specify with :batch_size.
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+```ruby
+gem 'each_in_batches'
+```
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install each_in_batches
+
+## Usage
+
+To create a new Batch, call `Batch.new` and pass it the class and any additional arguments (all as a hash).
+
+    batch = EachInBatches::Batch.new(:arel => Payment.canceled.order("transaction_id ASC"), :batch_size => 50)
+
+To process the batched data, pass a block to `Batch#run` the same way you would pass one to `Klass.all.each {|x| x.do_something }`.
+`Batch#run` will pass the data to your block, one record at a time, in batches set by the :batch_size argument.
+
+    batch.run {|x| puts x.id; puts x.transaction_id}
+
+Print the results!
+
+    batch.print_results
+
+Or...
+
+Consolidate your code if you prefer
+
+    EachInBatches::Batch.new(:arel => Payment.canceled.order("transaction_id ASC"), :batch_size => 50, :show_results => true).run{|x| puts x.id; puts x.transaction_id}
+
+## Configuration
+
+Arguments for the initializer (Batch.new) method are:
+
+Required:
+
+    :arel - Usage: :arel => Payment.canceled.order("transaction_id ASC")
+      Required, as this is the class that will be batched
+
+Optional:
+
+    :verbose - Usage: :verbose => true or false
+      Sets verbosity of output
+      Default: false (if not provided)
+
+    :batch_size - Usage: :batch_size => x
+      Where x is some number.
+      How many AR Objects should be processed at once?
+      Default: 50 (if not provided)
+
+    :last_batch - Usage: :last_batch => x
+      Where x is some number.
+      Only process up to and including batch #x.
+      Batch numbers start at 0 for the first batch.
+      Default: won't be used (no limit if not provided)
+
+    :first_batch - Usage: :first_batch => x
+      Where x is some number.
+      Begin processing at batch #x.
+      Batch numbers start at 0 for the first batch.
+      Default: won't be used (no offset if not provided)
+
+    :show_results - Usage: :show_results => true or false
+      Prints statistics about the results of Batch#run.
+      Default: true if verbose is set to true and :show_results is not provided, otherwise false
+
+## Output
+
+Interpreting the output:
+
+    '[O]' means the batch was skipped due to an offset.
+    '[L]' means the batch was skipped due to a limit.
+    '[P]' means the batch is processing.
+    '[C]' means the batch is complete.
+    and yes... it was a coincidence. This class is not affiliated with 'one laptop per child'
+
+## License
+
+Copyright ©2008-2015 Peter H. Boling, Brian Kidd, released under the MIT license
+
+## Development
+
+After checking out the repo, run `bin/setup` to install dependencies. Then, run `bin/console` for an interactive prompt that will allow you to experiment.
+
+To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release` to create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+
+## Maintenance
+
+To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release` to create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+
+## Versioning
+
+This library aims to adhere to [Semantic Versioning 2.0.0](http://semver.org/).
+Violations of this scheme should be reported as bugs. Specifically,
+if a minor or patch version is released that breaks backward
+compatibility, a new version should be immediately released that
+restores compatibility. Breaking changes to the public API will
+only be introduced with new major versions.
+
+As a result of this policy, you can (and should) specify a
+dependency on this gem using the [Pessimistic Version Constraint](http://docs.rubygems.org/read/chapter/16#page74) with two digits of precision.
+
+For example:
+
+    spec.add_dependency 'each_in_batches', '~> 0.0'
+
+## Contributing
+
+1. Fork it ( https://github.com/[my-github-username]/each_in_batches/fork )
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Add some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Make sure to add tests!
+6. Create a new Pull Request
+
+## Contributors
+
+See the [Network View](https://github.com/pboling/each_in_batches/network)
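The `:batch_size`, `:first_batch`, and `:last_batch` options described in the README's Configuration section can be pictured as a plan of zero-based batch numbers and SQL offsets. The following is a minimal sketch of that arithmetic in plain Ruby, not the gem's implementation; `plan_batches` and its status symbols are hypothetical names introduced only for illustration.

```ruby
# Hypothetical helper, not part of the gem: list each zero-based batch,
# its SQL offset, and whether it would be processed or skipped.
def plan_batches(total_records:, batch_size: 50, first_batch: 0, last_batch: nil)
  # Integer ceiling division: a partial final batch still counts as a batch.
  num_batches = (total_records + batch_size - 1) / batch_size
  last_batch ||= num_batches - 1

  (0...num_batches).map do |batch|
    status =
      if batch < first_batch
        :skipped_offset   # reported as [O] in verbose output
      elsif batch > last_batch
        :skipped_limit    # reported as [L] in verbose output
      else
        :processed        # reported as [P], then [C]
      end
    { batch: batch, offset: batch * batch_size, status: status }
  end
end

plan = plan_batches(total_records: 230, batch_size: 50, first_batch: 1, last_batch: 3)
plan.each { |row| puts row.inspect }
```

With 230 records and a batch size of 50 there are five batches (0 through 4). In this sketch batch 0 is skipped because of the offset, batches 1 to 3 are processed, and batch 4 is skipped because of the limit.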
data/Rakefile
ADDED
data/bin/console
ADDED
@@ -0,0 +1,14 @@
+#!/usr/bin/env ruby
+
+require "bundler/setup"
+require "each_in_batches"
+
+# You can add fixtures and/or initialization code here to make experimenting
+# with your gem easier. You can also use a different console, if you like.
+
+# (If you use this, don't forget to add pry to your Gemfile!)
+# require "pry"
+# Pry.start
+
+require "irb"
+IRB.start
data/bin/setup
ADDED
data/each_in_batches.gemspec
ADDED
@@ -0,0 +1,25 @@
+# coding: utf-8
+lib = File.expand_path('../lib', __FILE__)
+$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+require 'each_in_batches/version'
+
+Gem::Specification.new do |spec|
+  spec.name = "each_in_batches"
+  spec.version = EachInBatches::VERSION
+  spec.authors = ["Peter Boling"]
+  spec.email = ["peter.boling@gmail.com"]
+
+  spec.summary = "Batch Processing of Records with Blocks in Rails"
+  spec.description = "Batch Processing of Records with Blocks in Rails"
+  spec.homepage = "https://github.com/pboling/boling_for_batches"
+
+  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
+  spec.bindir = "exe"
+  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
+  spec.require_paths = ["lib"]
+
+  spec.add_dependency "activerecord", "~> 3.2"
+  spec.add_development_dependency "bundler", "~> 1.9"
+  spec.add_development_dependency "rake", "~> 10.0"
+  spec.add_development_dependency "rspec", "~> 3.2"
+end
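The gemspec's `~> 3.2` dependency and the README's suggested `~> 0.0` both use the pessimistic version constraint. As a small illustration of what such a constraint accepts, the sketch below asks RubyGems itself through `Gem::Requirement` and `Gem::Version`; the helper name `satisfies?` is an assumption made here for readability.

```ruby
require "rubygems" # normally loaded already; explicit so the sketch is self-contained

# Hypothetical helper: does `version` satisfy the constraint string?
def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts satisfies?("~> 3.2", "3.2.22") # an activerecord 3.2.x release is accepted
puts satisfies?("~> 3.2", "4.0.0")  # the next major version is rejected
puts satisfies?("~> 0.0", "0.1.0")  # this release matches the README's example
puts satisfies?("~> 0.0", "1.0.0")  # a future 1.0.0 would not
```

A two-segment constraint such as `~> 3.2` allows any version from 3.2 up to, but not including, 4.0.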
data/lib/each_in_batches.rb
ADDED
@@ -0,0 +1,272 @@
+# BolingForBatches
+# Copyright ©2008-2015 Peter H. Boling, Brian Kidd, released under the MIT license
+# Gem Plugin for Rails: A Better Way To Run Heavy Queries
+# License: MIT License
+# Labels: Ruby, Rails, Gem
+# Project owners:
+# Peter Boling, Brian Kidd
+require "each_in_batches/version"
+require "active_record"
+
+module EachInBatches
+
+  class Batch
+
+    attr_accessor :arel
+    attr_accessor :verbose
+    attr_accessor :batch_size
+    attr_accessor :backwards
+    attr_accessor :last_batch
+    attr_accessor :first_batch
+    attr_accessor :offset_array
+    attr_accessor :total_records
+    attr_accessor :size_of_last_run
+    attr_accessor :extra_run
+    attr_accessor :num_runs
+    attr_accessor :total_time
+    attr_accessor :elapsed_time
+    attr_accessor :start_time
+    attr_accessor :end_time
+    attr_accessor :overhead_time
+    attr_accessor :completion_times
+    attr_accessor :show_results
+
+    def print_debug
+      print "verbose: #{verbose}\nbatch_size: #{batch_size}\nbackwards: #{backwards}\nlast_batch: #{last_batch}\nfirst_batch: #{first_batch}\noffset_array: #{offset_array}\ntotal_records: #{total_records}\nsize_of_last_run: #{size_of_last_run}\nextra_run: #{extra_run}\nnum_runs: #{num_runs}\ntotal_time: #{total_time}\nelapsed_time: #{elapsed_time}\nstart_time: #{start_time}\nend_time: #{end_time}\noverhead_time: #{overhead_time}\ncompletion_times: #{completion_times.inspect}\nshow_results: #{show_results.inspect}\n"
+    end
+
+    def self.help_text
+      <<-HEREDOC
+      Arguments for the initializer (Batch.new) method are:
+
+      Required:
+
+        :arel - Usage: :arel => MyClass.some_scope.order("some_column ASC")
+          Required, as this is the class that will be batched
+
+      Optional:
+
+        :backwards - Usage: :backwards => true or false
+          Whether or not the batches should be processed in reverse order or not.
+          NOTE: deletions must be processed backwards or you eat the set as you process
+          and end the run half way through
+          Default: false (if not provided)
+
+        :verbose - Usage: :verbose => true or false
+          Sets verbosity of output
+          Default: false (if not provided)
+
+        :batch_size - Usage: :batch_size => x
+          Where x is some number.
+          How many AR Objects should be processed at once?
+          Default: 50 (if not provided)
+
+        :last_batch - Usage: :last_batch => x
+          Where x is some number.
+          Only process up to and including batch #x.
+          Batch numbers start at 0 for the first batch.
+          Default: won't be used (no limit if not provided)
+
+        :first_batch - Usage: first_batch => x
+          Where x is some number.
+          Begin processing batches beginning at batch #x.
+          Batch numbers start at 0 for the first batch.
+          Default: won't be used (no offset if not provided)
+
+        :show_results - Usage: :show_results => true or false
+          Prints statistics about the results of Batch#run.
+          Default: true if verbose is set to true and :show_results is not provided, otherwise false
+
+      EXAMPLE:
+
+      To create a new Batch, call Batch#new and pass it the class and any additional arguements (all as a hash).
+
+        batch = EachInBatches::Batch.new(:arel => Payment.canceled.order("transaction_id ASC"), :batch_size => 50)
+
+      To process the batched data, pass a block to Batch#run the same way you would to an object returned by
+
+        Klass.all.each {|x| x.method}
+
+      Batch#run will pass the data to your block, one at a time, in batches set by the :batch_size argument.
+
+        batch.run {|x| puts x.id; puts x.transaction_id}
+
+      Print the results!
+
+        batch.print_results
+
+      Or...
+
+      Consolidate your code if you prefer
+
+        EachInBatches::Batch.new(:arel => Payment.canceled.order("transaction_id ASC"), :batch_size => 50, :show_results => true).run{|x| puts x.id; puts x.transaction_id}
+
+      Interpreting the output:
+        '[O]' means the batch was skipped due to an offset.
+        '[L]' means the batch was skipped due to a limit.
+        '[P]' means the batch is processing.
+        '[C]' means the batch is complete.
+        and yes... it was a coincidence. This class is not affiliated with 'one laptop per child'
+      HEREDOC
+    end
+
+    def self.check(*args)
+      if args.empty?
+        puts self.help_text and return false
+      #Are the values of these parameters going to be valid integers?
+      elsif args.first[:batch_size] && (args.first[:batch_size].to_s.gsub(/\d/,'foo') == args.first[:batch_size].to_s)
+        puts self.help_text and return false
+      elsif args.first[:last_batch] && (args.first[:last_batch].to_s.gsub(/\d/,'foo') == args.first[:last_batch].to_s)
+        puts self.help_text and return false
+      elsif args.first[:first_batch] && (args.first[:first_batch].to_s.gsub(/\d/,'foo') == args.first[:first_batch].to_s)
+        puts self.help_text and return false
+      else
+        return true
+      end
+    end
+
+    def initialize(*args)
+      return false unless Batch.check(*args)
+      @arel = args.first[:arel]
+      @verbose = args.first[:verbose].blank? ? false : args.first[:verbose]
+      @backwards = args.first[:backwards].nil? ? false : !(args.first[:backwards] == 'false' || args.first[:backwards] == false)
+      @batch_size = args.first[:batch_size] ? args.first[:batch_size].is_a?(Integer) ? args.first[:batch_size] : args.first[:batch_size].to_i : 50
+      @last_batch = args.first[:last_batch] ? args.first[:last_batch].is_a?(Integer) ? args.first[:last_batch] : args.first[:last_batch].to_i : false
+      @first_batch = args.first[:first_batch] ? args.first[:first_batch].is_a?(Integer) ? args.first[:first_batch] : args.first[:first_batch].to_i : 0
+      @show_results = case
+        when args.first[:show_results].blank? && @verbose.blank?; false
+        when args.first[:show_results].blank? && @verbose == true; true
+        else args.first[:show_results]
+      end
+      @total_time = 0
+      @skipped_batches = []
+
+      puts "Counting Records..." if self.verbose
+      @total_records = @arel.count
+      @num_runs = @total_records / @batch_size
+      @size_of_last_run = @total_records.modulo(@batch_size)
+
+      if @size_of_last_run > 0
+        @num_runs += 1
+        @extra_run = true
+      else
+        @extra_run = false
+      end
+
+      puts "Records: #{@total_records}, Batches: #{@num_runs}" if @verbose
+
+      @last_batch = @num_runs - 1 unless @num_runs == 0 || @last_batch #because batch numbers start at 0 like array indexes, but only if it was not set in *args
+
+      current_batch = 0
+      @offset_array = Array.new
+      if @verbose
+        puts "Batch Numbering Begins With 0 (ZERO) and counts up"
+        puts "Batch Size (SQL Limit): #{@batch_size}" #This is the SQL Limit
+        puts "First Batch # to run: #{@first_batch}" #This is the number of the first batch to run
+        puts "Last Batch # to run: #{@last_batch}" # This is the number of the last batch to run
+        puts "Batches Before First and After Last will be skipped."
+        puts "Creating Batches:\n"
+      end
+      while current_batch < @num_runs
+        @offset_array << (current_batch * @batch_size)
+        print "." if @verbose
+        current_batch += 1
+      end
+      puts " #{@num_runs} Batches Created" if @verbose
+      #in order to use batching for record deletion, the offsets need to start with largest first
+      if @backwards
+        @offset_array.reverse!
+        puts "Backwards Mode:" if @verbose
+      else
+        puts "Normal Mode:" if @verbose
+      end
+      if @verbose
+        puts " First Offset: #{@offset_array.first}"
+        puts " Last Offset: #{@offset_array.last}"
+        # technically the last run doesn't need a limit, and we don't technically use a limit on the last run,
+        # but there are only that many records left to process,
+        # so the effect is the same as if a limit were applied.
+        # We do need the limit when running the batches backwards, however
+        if @extra_run
+          if @backwards
+            puts " Limit of first run: #{@size_of_last_run}"
+          else
+            puts " Size of Last Run: #{@size_of_last_run}"
+          end
+        end
+        puts " Limit of all #{@extra_run ? "other" : ""} runs: #{@batch_size}" #This is the SQL Limit
+      end
+    end
+
+    def is_first_run?
+      #if no batches have been completed then we are in a first run situation
+      self.completion_times.empty?
+    end
+
+    def run(&block)
+      return false unless block_given?
+      self.start_time = Time.current
+      puts "There are no batches to run" and return false unless self.num_runs > 0
+      self.total_time = 0
+      self.completion_times = Array.new
+      self.offset_array.each_with_index do |offset, current_batch|
+        if self.backwards && self.is_first_run?
+          limite = self.size_of_last_run
+        else
+          limite = self.batch_size
+        end
+        if self.first_batch > current_batch
+          print "[O] #{show_status(current_batch, limite)} skipped" if self.verbose
+          self.skipped_batches << current_batch
+        elsif self.last_batch && self.last_batch < current_batch
+          print "[L] #{show_status(current_batch, limite)} skipped" if self.verbose
+          self.skipped_batches << current_batch
+        else
+          print "[P] #{show_status(current_batch, limite)}" if self.verbose
+
+          #start the timer
+          beg_time = Time.current
+
+          self.arel.limit(limite).offset(offset).each {|obj| yield obj}
+
+          #stop the timer
+          fin_time = Time.current
+
+          this_time = fin_time.to_i - beg_time.to_i
+          self.total_time += this_time unless extra_run && current_batch == self.num_runs
+          puts "[C] #{show_status(current_batch, limite)} in #{this_time} seconds" if self.verbose
+          self.completion_times << [current_batch, {:elapsed => this_time, :begin_time => beg_time, :end_time => fin_time}]
+        end
+      end
+      self.num_runs -= 1 if self.extra_run
+      self.end_time = Time.current
+      self.elapsed_time = (self.end_time.to_i - self.start_time.to_i)
+      self.overhead_time = self.elapsed_time - self.total_time
+      print_results if self.show_results
+      return "Process Complete"
+    end
+
+    def show_status(current_batch, limite)
+      "{#{current_batch} / #{self.last_batch} / #{limite}}"
+    end
+
+    # Allow caller to override verbosity when called from console
+    def print_results(verbose = self.verbose)
+      printf "Results..."
+      printf "Average time per complete batch was %.1f seconds\n", (self.total_time/Float(self.num_runs)) unless self.num_runs < 1
+      printf "Total time elapsed was %.1f seconds, about #{self.elapsed_time/60} minute(s)\n", (self.elapsed_time)
+      if self.backwards # When backwards might be deleting records
+        puts "Total # of #{self.arel.table} - Before: #{self.total_records}"
+        puts "Total # of #{self.arel.table} - After : #{self.arel.count}"
+      end
+      # With a large number of batches this is far too verbose, but don't want to introduce a more complicated verbosity setting.
+      # if verbose
+      #   puts "Completion times for each batch:"
+      #   self.completion_times.each do |x|
+      #     puts "Batch #{x[0]}: Time Elapsed: #{x[1][:elapsed]}s, Begin: #{x[1][:begin_time].strftime("%m.%d.%Y %I:%M:%S %p")}, End: #{x[1][:end_time].strftime("%m.%d.%Y %I:%M:%S %p")}"
+      #   end
+      # end
+    end
+
+  end
+end
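The help text's note that deletions must be processed backwards ("or you eat the set as you process and end the run half way through") can be illustrated without a database. The sketch below is a simulation under stated assumptions, not the gem's code: an Array of ids stands in for the table, `table[offset, limit]` stands in for SQL `LIMIT`/`OFFSET` evaluated against the rows that still exist, and `delete_in_batches` is a hypothetical name.

```ruby
# Simulate OFFSET/LIMIT batching where every yielded record is deleted.
# `table` is an Array of ids standing in for the rows that still exist.
def delete_in_batches(table, batch_size:, backwards:)
  total = table.size
  remainder = total % batch_size
  # Offsets are computed once, up front, from the original record count.
  offsets = (0...total).step(batch_size).to_a
  offsets.reverse! if backwards

  offsets.each_with_index do |offset, index|
    # Going backwards, the first batch is the short trailing one (if any).
    limit = (backwards && index.zero? && remainder > 0) ? remainder : batch_size
    batch = table[offset, limit] || [] # nil when the offset is past the end
    batch.each { |id| table.delete(id) }
  end
  table
end

forward  = delete_in_batches((1..10).to_a, batch_size: 3, backwards: false)
backward = delete_in_batches((1..10).to_a, batch_size: 3, backwards: true)
puts "forward leaves:  #{forward.inspect}"
puts "backward leaves: #{backward.inspect}"
```

Running forwards, each deletion shifts the remaining rows toward offset 0, so later offsets point past records that were never visited. Running backwards, deletions only remove rows at or beyond the current offset, so earlier offsets stay valid. When the record count divides evenly, this sketch uses the full batch size for the first backwards batch; that fallback is a choice made here for the illustration.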
metadata
ADDED
@@ -0,0 +1,112 @@
+--- !ruby/object:Gem::Specification
+name: each_in_batches
+version: !ruby/object:Gem::Version
+  version: 0.1.0
+platform: ruby
+authors:
+- Peter Boling
+autorequire:
+bindir: exe
+cert_chain: []
+date: 2015-06-18 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: activerecord
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.2'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.2'
+- !ruby/object:Gem::Dependency
+  name: bundler
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.9'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '1.9'
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.0'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '10.0'
+- !ruby/object:Gem::Dependency
+  name: rspec
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.2'
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: '3.2'
+description: Batch Processing of Records with Blocks in Rails
+email:
+- peter.boling@gmail.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- ".gitignore"
+- ".rspec"
+- ".travis.yml"
+- CODE_OF_CONDUCT.md
+- Gemfile
+- MIT-LICENSE
+- README.md
+- Rakefile
+- bin/console
+- bin/setup
+- each_in_batches.gemspec
+- lib/each_in_batches.rb
+- lib/each_in_batches/version.rb
+homepage: https://github.com/pboling/boling_for_batches
+licenses: []
+metadata: {}
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+required_rubygems_version: !ruby/object:Gem::Requirement
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      version: '0'
+requirements: []
+rubyforge_project:
+rubygems_version: 2.4.2
+signing_key:
+specification_version: 4
+summary: Batch Processing of Records with Blocks in Rails
+test_files: []