in-parallel 0.1.10 → 0.1.11

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 616a0cfcbc49b019d69c306e9e919a86efdd9fee
- data.tar.gz: 4e7aba20c6ca158d124242c74ce44ca59fbf9179
+ metadata.gz: ecbae8756fd4310b2b2330b342f7ada3ee48dae6
+ data.tar.gz: c42d6e4ffb5d87d1418d9bccbc65c4ec3347f5b5
  SHA512:
- metadata.gz: d9e961955959052dd40268155c142deefa2950145519d9ba1d778ac18c7dfcf256e66328ec55ef08c1fb0e10ad41a81760154e86f986240082055a2c024b3363
- data.tar.gz: 03ea21bf0625a57e368595faf8bbdf8a5bfed798a4663e43e92da26f89e793b897b1a175824f61a355e84158a3abde04bae7fba10a35bbbecdc8772345e757a3
+ metadata.gz: 0510bb42cb5342dee946b5f383acd9da2c0277bc60c466c80f17ab10b7f243d34672868c2f95c36525781a6fe95c7d47e52bb4003c6d64d67886201b832aaf3c
+ data.tar.gz: 30d1528fa0ce04a39636f10c0e3e969315e5c679ada2b763f8b0046a802ee7a15c5d1bd6abcb61aa8a60b63a2e39cea44c706b60cb17d36f40e03f2d635d10f0
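The SHA512 entries above can be checked locally. What follows is a minimal sketch, assuming the `.gem` archive has already been unpacked so that `metadata.gz` and `data.tar.gz` exist on disk. `checksum_matches?` is a hypothetical helper introduced for illustration; it is not part of RubyGems or in-parallel.

```ruby
require 'digest'

# Hypothetical helper (an assumption, not a RubyGems API): compares a file's
# SHA512 hex digest with the value recorded for it in checksums.yaml.
def checksum_matches?(path, expected_sha512)
  Digest::SHA512.file(path).hexdigest == expected_sha512
end
```

Calling `checksum_matches?('data.tar.gz', expected)` with the new `data.tar.gz` value from the hunk above should return `true` for an untampered 0.1.11 archive.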
data/README.md CHANGED
@@ -1,16 +1,16 @@
  # in-parallel
- A lightweight Ruby library with very simple syntax, making use of process.fork for parallelization
+ A lightweight Ruby library with very simple syntax, making use of Process.fork to execute code in parallel.
 
- Other popular Ruby libraries that do parallel execution support one primary use case - crunching through a large queue of small tasks as quickly and efficiently as possible. This library primarily supports the use case of executing a few larger tasks in parallel and managing the stdout and return values to make it easy to understand which processes are logging what, and what the outcome of the execution was. This library was created to be used by Puppet's Beaker test framework to enable parallel execution of some of the framework's tasks, and allow people within thier tests to execute code in parallel when wanted. This solution does not check to see how many processors you have, it just forks as many processes as you ask for. That means that it will handle a handful of parallel processes well, but could definitely overload your system with ruby processes if you try to spin up a LOT of processes. If you're looking for something simple and light-weight and on either linux or mac (forking processes is not supported on Windows), then this solution could be what you want.
+ Other popular Ruby libraries that do parallel execution support one primary use case - crunching through a large queue of small tasks as quickly and efficiently as possible. This library primarily supports the use case of executing a few larger tasks in parallel and managing the stdout and return values to make it easy to understand which processes are logging what, and what the outcome of the execution was. This library was created to be used by Puppet's Beaker test framework to enable parallel execution of some of the framework's tasks, and allow users to execute code in parallel within their tests. This solution does not check to see how many processors you have, it just forks as many processes as you ask for. That means that it will handle a handful of parallel processes well, but could definitely overload your system with ruby processes if you try to spin up a LOT of processes. If you're looking for something intuitive and simple, but with a slight memory and processing overhead, and are on either Linux or Mac (forking processes is not supported on Windows), then this solution is what you want.
 
- If you are looking for something to support executing a lot of tasks in parallel as efficiently as possible, you should take a look at the [parallel](https://github.com/grosser/parallel) project.
+ If you are looking for something that excels at executing a large queue of tasks in parallel as efficiently as possible, you should take a look at the [parallel](https://github.com/grosser/parallel) project.
 
  ## Methods:
 
  ### run_in_parallel(timeout=nil, kill_all_on_error = false, &block)
- 1. You can put whatever methods you want to execute in parallel into a block, and each method will be executed in parallel (unless the method is defined in kernel).
+ 1. Each method in a block will be executed in parallel (unless the method is defined in Kernel or BaseObject).
  1. Any methods further down the stack won't be affected, only the ones directly within the block.
- 2. You can assign the results to instance variables and it just works, no dealing with an array or map of results.
+ 2. You can assign return values to instance variables and it 'just works'.
  3. Log STDOUT and STDERR chunked per process to the console so that it is easy to see what happened in which process.
  4. Waits for each process in realtime and logs immediately upon completion of each process
  5. If an exception is raised by a child process, it will optionally (kill_all_on_error) be re-raised in the primary process and kill all other still running child processes. The default will wait for all processes to complete execution before re-raising any unhandled exception from the child processes.
@@ -33,10 +33,10 @@ If you are looking for something to support executing a lot of tasks in parallel
  # Example:
  # will spawn 2 processes, (1 for each method) wait until they both complete, log chunked STDOUT/STDERR for
  # each process and assign the method return values to instance variables:
- InParallel.run_in_parallel {
+ run_in_parallel do
  @result_1 = method_with_param('world')
  @result_2 = method_without_param
- }
+ end
 
  puts "#{@result_1}, #{@result_2[:foo]}"
  ```
@@ -64,9 +64,7 @@ hello world, bar
  5. Times out by default at 30 minutes. Timeout default can be changed with InParallel::InParallelExecutor.parallel_default_timeout=X, or you can set the timeout param when calling the method
 
  ```ruby
- ["foo", "bar", "baz"].each_in_parallel { |item|
- puts item
- }
+ ["foo", "bar", "baz"].each_in_parallel { |item| puts item }
 
  ```
  STDOUT:
@@ -98,14 +96,12 @@ baz
 
  def create_file_with_delay(file_path)
  sleep 2
- File.open(file_path, 'w') { |f| f.write('contents')}
+ File.open(file_path, 'w') { |f| f.write('contents') }
  return true
  end
 
  # Example 1 - ignore results
- run_in_background{
- create_file_with_delay(TMP_FILE)
- }
+ run_in_background { create_file_with_delay(TMP_FILE) }
 
  # Should not exist immediately upon block completion
  puts(File.exists?(TMP_FILE)) # false
@@ -114,15 +110,11 @@ baz
  puts(File.exists?(TMP_FILE)) # true
 
  # Example 2 - delay results
- run_in_background(false){
- @result = create_file_with_delay(TMP_FILE)
- }
+ run_in_background(false) { @result = create_file_with_delay(TMP_FILE) }
 
  # Do something else
 
- run_in_background(false){
- @result2 = create_file_with_delay('/tmp/someotherfile.txt')
- }
+ run_in_background(false) { @result2 = create_file_with_delay('/tmp/someotherfile.txt') }
 
  # @result has not been assigned yet
  puts @result >> "unresolved_parallel_result_0"
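The `unresolved_parallel_result_0` string in the README example above is a placeholder that is later swapped for the real return value. Here is a minimal sketch of that idea, loosely modeled on the `result_lookup` method that appears in this diff. `Holder` and `resolve_placeholders` are hypothetical names used only for illustration, and the gem's real bookkeeping involves more state than shown here.

```ruby
# Hypothetical stand-in for an object whose instance variable was assigned
# inside a background block and so still holds a placeholder string.
class Holder
  attr_reader :result

  def initialize
    @result = 'unresolved_parallel_result_0'
  end
end

# results_map is an array of single-entry hashes: { placeholder => real_value }.
# Each placeholder found in an instance variable is replaced by its real value.
def resolve_placeholders(obj, results_map)
  results_map.each do |entry|
    placeholder, value = entry.first
    obj.instance_variables.each do |var|
      next unless obj.instance_variable_get(var) == placeholder
      obj.instance_variable_set(var, value)
      break
    end
  end
  obj
end

holder = Holder.new
resolve_placeholders(holder, [{ 'unresolved_parallel_result_0' => true }])
puts holder.result # prints "true"
```

Matching on the placeholder string is what lets the caller write an ordinary instance-variable assignment inside the block; it also explains why local variables cannot receive results this way.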
@@ -1,3 +1,3 @@
  module InParallel
- VERSION = Version = '0.1.10'
+ VERSION = Version = '0.1.11'
  end
@@ -11,13 +11,14 @@ module InParallel
  @@parallel_signal_interval = 30
  @@parallel_default_timeout = 1800
 
- @@process_infos = []
+ @@process_infos = []
+
  def self.process_infos
  @@process_infos
  end
 
  @@background_objs = []
- @@result_id = 0
+ @@result_id = 0
 
  @@pids = []
 
@@ -46,10 +47,10 @@ module InParallel
  # Runs all methods within the block in parallel and waits for them to complete
  #
  # Example - will spawn 2 processes, (1 for each method) wait until they both complete, and log STDOUT:
- # InParallel.run_in_parallel {
+ # InParallel.run_in_parallel do
  # @result_1 = method1
  # @result_2 = method2
- # }
+ # end
  # NOTE: Only supports assigning instance variables within the block, not local variables
  def self.run_in_parallel(timeout = @@parallel_default_timeout, kill_all_on_error = false, &block)
  if fork_supported?
@@ -64,10 +65,10 @@ module InParallel
  # Runs all methods within the block in parallel in the background
  #
  # Example - Will spawn a process in the background to run puppet agent on two agents and return immediately:
- # Parallel.run_in_background {
+ # Parallel.run_in_background do
  # @result_1 = method1
  # @result_2 = method2
- # }
+ # end
  # # Do something else here before waiting for the process to complete
  #
  # # Optionally wait for the processes to complete before continuing.
@@ -84,7 +85,7 @@ module InParallel
  Process.detach(@@process_infos.last[:pid])
  @@process_infos.pop
  else
- @@background_objs << {:proxy => proxy, :target => block.binding}
+ @@background_objs << { :proxy => proxy, :target => block.binding }
  return process_infos.last[:tmp_result]
  end
  return
@@ -102,7 +103,7 @@ module InParallel
  # @param [Boolean] kill_all_on_error Whether to wait for all processes to complete, or fail immediately - killing all other forked processes - when one process errors.
  def self.wait_for_processes(proxy = self, binding = nil, timeout = nil, kill_all_on_error = false)
  raise_error = nil
- timeout ||= @@parallel_default_timeout
+ timeout ||= @@parallel_default_timeout
  trap(:INT) do
  # Can't use logger inside of trap
  puts "Warning, recieved interrupt. Processing child results and exiting."
@@ -112,8 +113,8 @@ module InParallel
  # Custom process to wait so that we can do things like time out, and kill child processes if
  # one process returns with an error before the others complete.
  results_map = Array.new(@@process_infos.count)
- start_time = Time.now
- timer = start_time
+ start_time = Time.now
+ timer = start_time
  while !@@process_infos.empty? do
  if @@parallel_signal_interval > 0 && Time.now > timer + @@parallel_signal_interval
  @@logger.debug 'Waiting for child processes.'
@@ -123,7 +124,7 @@ module InParallel
  kill_child_processes
  raise_error = ::RuntimeError.new("Child process ran longer than timeout of #{timeout}")
  end
- @@process_infos.each {|process_info|
+ @@process_infos.each do |process_info|
  # wait up to half a second for each thread to see if it is complete, if not, check the next thread.
  # returns immediately if the process has completed.
  thr = process_info[:wait_thread].join(0.5)
@@ -136,7 +137,7 @@ module InParallel
  # So don't use logger, just use puts.
  puts " " + File.new(process_info[:std_out], 'r').readlines.join(" ")
  @@logger.info "------ Completed output for #{process_info[:method_sym]} - #{process_info[:pid]}"
- result = process_info[:result].read
+ result = process_info[:result].read
  marshalled_result = (result.nil? || result.empty?) ? result : Marshal.load(result)
  # Kill all other processes and let them log their stdout before re-raising
  # if a child process raised an error.
@@ -145,7 +146,7 @@ module InParallel
  kill_child_processes if kill_all_on_error
  marshalled_result = nil
  end
- results_map[process_info[:index]] = {process_info[:tmp_result] => marshalled_result}
+ results_map[process_info[:index]] = { process_info[:tmp_result] => marshalled_result }
  ensure
  File.delete(process_info[:std_out]) if File.exists?(process_info[:std_out])
  # close the read end pipe
@@ -153,7 +154,7 @@ module InParallel
  @@process_infos.delete(process_info)
  end
  end
- }
+ end
  end
 
  results = []
@@ -166,9 +167,7 @@ module InParallel
  # If there are background_objs AND results, don't return the background obj results
  # (which would mess up expected results from each_in_parallel),
  # but do process their results in case they are assigned to instance variables
- @@background_objs.each {|obj|
- result_lookup(obj[:proxy], obj[:target], results_map)
- }
+ @@background_objs.each { |obj| result_lookup(obj[:proxy], obj[:target], results_map) }
  @@background_objs.clear
 
  raise raise_error unless raise_error.nil?
@@ -178,10 +177,10 @@ module InParallel
 
  # private method to execute some code in a separate process and store the STDOUT and STDERR for later retrieval
  def self._execute_in_parallel(method_sym, obj = self, &block)
- ret_val = nil
+ ret_val = nil
  # Communicate the return value of the method or block
  read_result, write_result = IO.pipe
- pid = fork do
+ pid = fork do
  Dir.mkdir('tmp') unless Dir.exists? 'tmp'
  stdout_file = File.new("tmp/pp_#{Process.pid}", 'w')
  exit_status = 0
@@ -204,13 +203,13 @@ module InParallel
  ret_val = obj.instance_eval(&block)
  # Write the result to the write_result IO stream.
  # Have to serialize the value so it can be transmitted via IO
- if(!ret_val.nil? && ret_val.singleton_methods && ret_val.class != TrueClass && ret_val.class != FalseClass && ret_val.class != Fixnum)
+ if (!ret_val.nil? && ret_val.singleton_methods && ret_val.class != TrueClass && ret_val.class != FalseClass && ret_val.class != Fixnum)
  #in case there are other types that can't be duped
  begin
  ret_val = ret_val.dup
  rescue StandardError => err
  @@logger.warn "Warning: return value from child process #{ret_val} " +
- "could not be transferred to parent process: #{err.message}"
+ "could not be transferred to parent process: #{err.message}"
  end
  end
  # In case there are other types that can't be dumped
@@ -218,7 +217,7 @@ module InParallel
  Marshal.dump(ret_val, write_result) unless ret_val.nil?
  rescue StandardError => err
  @@logger.warn "Warning: return value from child process #{ret_val} " +
- "could not be transferred to parent process: #{err.message}"
+ "could not be transferred to parent process: #{err.message}"
  end
  rescue Exception => err
  @@logger.error "Error in process #{pid}: #{err.message}"
@@ -235,15 +234,15 @@ module InParallel
  write_result.close
  # Process.detach returns a thread that will be nil if the process is still running and thr if not.
  # This allows us to check to see if processes have exited without having to call the blocking Process.wait functions.
- wait_thread = Process.detach(pid)
+ wait_thread = Process.detach(pid)
  # store the IO object with the STDOUT and waiting thread for each pid
  process_info = { :wait_thread => wait_thread,
- :pid => pid,
- :method_sym => method_sym,
- :std_out => "tmp/pp_#{pid}",
- :result => read_result,
- :tmp_result => "unresolved_parallel_result_#{@@result_id}",
- :index => @@process_infos.count }
+ :pid => pid,
+ :method_sym => method_sym,
+ :std_out => "tmp/pp_#{pid}",
+ :result => read_result,
+ :tmp_result => "unresolved_parallel_result_#{@@result_id}",
+ :index => @@process_infos.count }
  @@process_infos.push(process_info)
  @@result_id += 1
  process_info
@@ -256,35 +255,37 @@ module InParallel
  end
 
  def self.kill_child_processes
- @@process_infos.each { |process_info|
+ @@process_infos.each do |process_info|
  # Send INT to each child process so it returns and can print stdout and stderr to console before exiting.
  begin
  Process.kill("INT", process_info[:pid])
  rescue Errno::ESRCH
  # If one of the other processes has completed in the very short time before we try to kill it, handle the exception
  end
- }
+ end
  end
+
  private_class_method :kill_child_processes
 
  # Private method to lookup results from the results_map and replace the
  # temp values with actual return values
  def self.result_lookup(proxy_obj, target_obj, results_map)
  target_obj = eval('self', target_obj)
- proxy_obj ||= target_obj
- vars = (proxy_obj.instance_variables)
- results = []
- results_map.each { |tmp_result|
+ proxy_obj ||= target_obj
+ vars = proxy_obj.instance_variables
+ results = []
+ results_map.each do |tmp_result|
  results << tmp_result.values[0]
- vars.each {|var|
+ vars.each do |var|
  if proxy_obj.instance_variable_get(var) == tmp_result.keys[0]
  target_obj.instance_variable_set(var, tmp_result.values[0])
  break
  end
- }
- }
+ end
+ end
  results
  end
+
  private_class_method :result_lookup
 
  # Proxy class used to wrap each method execution in a block and run it in parallel
@@ -294,14 +295,15 @@ module InParallel
  include ::Kernel
 
  def initialize(obj)
- @object = obj
+ @object = obj
  @result_id = 0
  end
 
  # All methods within the block should show up as missing (unless defined in :Kernel)
  def method_missing(method_sym, *args, &block)
  if InParallelExecutor.main_pid == ::Process.pid
- out = InParallelExecutor._execute_in_parallel("'#{method_sym.to_s}' #{caller_locations[0].to_s}", @object.eval('self')) {send(method_sym, *args, &block)}
+ out = InParallelExecutor._execute_in_parallel("'#{method_sym.to_s}' #{caller_locations[0].to_s}",
+ @object.eval('self')) { send(method_sym, *args, &block) }
  out[:tmp_result]
  end
  end
@@ -310,18 +312,24 @@ module InParallel
 
  InParallelExecutor.logger = @logger
 
+ # Gets how many seconds to wait between logging a 'Waiting for child processes.'
  def parallel_signal_interval
  InParallelExecutor.parallel_signal_interval
  end
 
+ # Sets how many seconds to wait between logging a 'Waiting for child processes.'
+ # @param [Int] value Time in seconds to wait before logging 'Waiting for child processes.'
  def parallel_signal_interval=(value)
  InParallelExecutor.parallel_signal_interval = value
  end
 
+ # Gets how many seconds to wait before timing out a forked child process and raising an exception
  def parallel_default_timeout
  InParallelExecutor.parallel_default_timeout
  end
 
+ # Sets how many seconds to wait before timing out a forked child process and raising an exception
+ # @param [Int] value Time in seconds to wait before timing out and raising an exception
  def parallel_default_timeout=(value)
  InParallelExecutor.parallel_default_timeout = value
  end
@@ -329,10 +337,10 @@ module InParallel
  # Executes each method within a block in a different process.
  #
  # Example - Will spawn a process in the background to execute each method
- # Parallel.run_in_parallel {
+ # Parallel.run_in_parallel do
  # @result_1 = method1
  # @result_2 = method2
- # }
+ # end
  # NOTE - Only instance variables can be assigned the return values of the methods within the block. Local variables will not be assigned any values.
  # @param [Int] timeout Time in seconds to wait before giving up on a child process
  # @param [Boolean] kill_all_on_error Whether to wait for all processes to complete, or fail immediately - killing all other forked processes - when one process errors.
@@ -346,17 +354,17 @@ module InParallel
  # Forks a process for each method within a block and returns immediately.
  #
  # Example 1 - Will fork a process in the background to execute each method and return immediately:
- # Parallel.run_in_background {
+ # Parallel.run_in_background do
  # @result_1 = method1
  # @result_2 = method2
- # }
+ # end
  #
  # Example 2 - Will fork a process in the background to execute each method, return immediately, then later
  # wait for the process to complete, printing it's STDOUT and assigning return values to instance variables:
- # Parallel.run_in_background(false) {
+ # Parallel.run_in_background(false) do
  # @result_1 = method1
  # @result_2 = method2
- # }
+ # end
  # # Do something else here before waiting for the process to complete
  #
  # wait_for_processes
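The executor hunks above combine four standard-library pieces: `fork`, `IO.pipe`, `Marshal`, and the thread returned by `Process.detach`. The following is a minimal, self-contained sketch of how they fit together for a single child, assuming a platform where `fork` is available and a result small enough to fit in the pipe buffer. `run_in_child` is a hypothetical name, not a method of the gem, and the sketch leaves out the STDOUT capture, logging, and error propagation that the real code performs.

```ruby
# Hypothetical helper (not part of in-parallel): runs the block in a forked
# child and hands its Marshal-serializable return value back over a pipe.
# Assumes the serialized result is small enough to fit in the pipe buffer,
# because the parent only reads after the child has exited.
def run_in_child(timeout = 10)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    Marshal.dump(yield, writer) # serialize the block's return value
    writer.close
    exit!(0)                    # leave without running at_exit handlers
  end
  writer.close
  # Process.detach returns a thread; join(0.5) returns nil while the child
  # is still running and the thread itself once the child has exited.
  waiter = Process.detach(pid)
  deadline = Time.now + timeout
  until waiter.join(0.5)
    next if Time.now < deadline
    Process.kill('INT', pid)
    raise "Child process ran longer than timeout of #{timeout}"
  end
  data = reader.read
  data.empty? ? nil : Marshal.load(data)
ensure
  reader.close if reader && !reader.closed?
end

puts run_in_child { 6 * 7 } # prints "42"
```

Polling with `join(0.5)` instead of calling a blocking `Process.wait` is what leaves room for the timeout check between polls.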
@@ -4,9 +4,7 @@ module Enumerable
  #
  # Example - Will execute each iteration in a separate process, in parallel, log STDOUT per process, and return an array of results.
  # my_array = [1,2,3]
- # my_array.each_in_parallel { |int|
- # my_method(int)
- # }
+ # my_array.each_in_parallel { |int| my_method(int) }
  # @param [String] identifier - Optional identifier for logging purposes only. Will use the block location by default.
  # @param [Int] timeout - Seconds to wait for a forked process to complete before timing out
  # @return [Array<Object>] results - the return value of each block execution.
@@ -14,7 +12,7 @@ module Enumerable
  if InParallel::InParallelExecutor.fork_supported? && count > 1
  identifier ||= "#{caller_locations[0]}"
  each do |item|
- out = InParallel::InParallelExecutor._execute_in_parallel(identifier) {block.call(item)}
+ InParallel::InParallelExecutor._execute_in_parallel(identifier) { block.call(item) }
  end
  # return the array of values, no need to look up from the map.
  return InParallel::InParallelExecutor.wait_for_processes(nil, block.binding, timeout, kill_all_on_error)
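The guard in the hunk above (`fork_supported? && count > 1`) suggests a sequential fallback when forking is unavailable or there is only one item. Below is a rough sketch of that pattern. `map_in_processes` is a hypothetical helper, and using `Process.respond_to?(:fork)` as the availability check is an assumption about how such a test could be written, not necessarily what the gem's `fork_supported?` does.

```ruby
# Hypothetical helper, not the gem's each_in_parallel: forks one child per
# item when fork is available and there is more than one item, otherwise
# falls back to a plain sequential map. Results keep the input order.
def map_in_processes(items)
  unless Process.respond_to?(:fork) && items.count > 1
    return items.map { |item| yield(item) }
  end

  children = items.map do |item|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      Marshal.dump(yield(item), writer)
      writer.close
      exit!(0)
    end
    writer.close # the parent keeps only the read end
    [pid, reader]
  end

  children.map do |pid, reader|
    data = reader.read # blocks until this child closes its end of the pipe
    reader.close
    Process.wait(pid)
    Marshal.load(data)
  end
end

puts map_in_processes(%w[foo bar baz]) { |item| item.upcase }.inspect
# prints ["FOO", "BAR", "BAZ"]
```

Collecting results in the order the children were started, rather than the order they finish, keeps the returned array aligned with the input.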
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: in-parallel
  version: !ruby/object:Gem::Version
- version: 0.1.10
+ version: 0.1.11
  platform: ruby
  authors:
  - samwoods1
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2016-06-27 00:00:00.000000000 Z
+ date: 2016-06-29 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler