resque-pool 0.2.0 → 0.3.0.beta.1

data/Changelog.md CHANGED
@@ -1,5 +1,19 @@
1
1
  ## unreleased
2
2
 
3
+ * enhancement: new callbacks for configuration
4
+ * `Resque::Pool.configure do |pool| ... end`
5
+ * `pool.after_manager_wakeup`
6
+ * `pool.to_calculate_worker_offset`
7
+ * `pool.after_prefork` (instance callback preferred over class callback)
8
+ * experimental: memory management
9
+ * experimental: check orphaned workers
10
+ * development: a good bit of code cleanup and rearrangement
11
+
12
+ See ExperimentalFeatures.md for more info. Thanks to Jason Haruska for these
13
+ features!
14
+
15
+ ## 0.2.0 (2011-03-15)
16
+
3
17
  * new feature: sending `HUP` to pool manager will reload the logfiles and
4
18
  gracefully restart all workers.
5
19
  * enhancement: logging now includes timestamp, process "name" (worker or
@@ -0,0 +1,48 @@
1
+ Experimental Features
2
+ ---------------------
3
+
4
+ Features listed here should not cause you problems if you don't use them... and
5
+ probably won't cause you problems if you *do* use them. Maybe. We hope. ;-)
6
+ Once these features have stood the test of time, can be trusted to work cross
7
+ platform, are well documented and tested, and have settled on a stable API,
8
+ then we'll stop calling them experimental and probably give them command line
9
+ options. Until then, be forewarned that it might not work for you and the API
10
+ might change between minor releases.
11
+
12
+ ### Memory management
13
+
14
+ A memory manager is provided which can check once a minute to see if any of
15
+ your workers are using too much memory. If a worker is over the soft limit it
16
+ will be sent a QUIT signal. If a worker is over the hard limit it will be
17
+ sent a TERM signal, and if it is still running a minute later it will be sent a
18
+ KILL signal. To use the memory manager, add something like the following to
19
+ your Rakefile config:
20
+
21
+ task "resque:pool:setup" do
22
+ Resque::Pool.configure do |pool|
23
+ # memory limits are in MB
24
+ hard_limit = 250
25
+ soft_limit = 200
26
+ mm = Resque::Pool::MemoryManager.new(pool, hard_limit, soft_limit)
27
+ pool.after_manager_wakeup { mm.monitor_memory_usage }
28
+ end
29
+ end
30
+
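A note for readers comparing this example with the class added further down in this diff: `MemoryManager#initialize` is declared there as `initialize(hard_limit=250, soft_limit=200)`, and `monitor_memory_usage` takes the pool manager as its argument. The snippet above passes arguments in a different shape, so it would likely raise `ArgumentError` as written. A minimal sketch of wiring that matches those signatures follows. `FakePool` and `FakeMemoryManager` are hypothetical stand-ins that copy only the method signatures shown in this diff so the sketch runs without the gem; they are not the gem's implementation.

```ruby
# Hypothetical stand-in for Resque::Pool: only the hook accessor and caller
# shown in this diff are reproduced.
class FakePool
  def after_manager_wakeup(&block)
    block ? (@after_manager_wakeup = block) : @after_manager_wakeup
  end

  # The pool passes itself to the hook, as call_after_manager_wakeup! does.
  def call_after_manager_wakeup!
    after_manager_wakeup && after_manager_wakeup.call(self)
  end
end

# Hypothetical stand-in for Resque::Pool::MemoryManager: same argument order
# as the class in this diff, but it only records what it was asked to check.
class FakeMemoryManager
  attr_reader :hard_limit, :soft_limit, :checked

  def initialize(hard_limit = 250, soft_limit = 200)
    @hard_limit = hard_limit
    @soft_limit = soft_limit
    @checked = []
  end

  def monitor_memory_usage(pool_manager)
    @checked << pool_manager
  end
end

pool = FakePool.new
mm = FakeMemoryManager.new(250, 200) # limits in MB, hard limit first
pool.after_manager_wakeup { |p| mm.monitor_memory_usage(p) }
pool.call_after_manager_wakeup!
```

With the real classes, the equivalent hook would presumably be `pool.after_manager_wakeup { |p| mm.monitor_memory_usage(p) }`, since `call_after_manager_wakeup!` passes the pool to the block.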
31
+ ### Wait for orphaned workers to quit
32
+
33
+ When restarting resque-pool, some orphaned workers may be left over from the
34
+ previous manager, even after the new manager has started up and forked its
35
+ workers. If you set the `RESQUE_WAIT_FOR_ORPHANS` environment variable, then
36
+ the new manager will not start up all of its configured workers until all of
37
+ the orphaned workers have finished. This might be useful for long running jobs
38
+ on memory constrained servers.
39
+
40
+ To use this, add something like the following to your Rakefile config:
41
+
42
+ task "resque:pool:setup" do
43
+ Resque::Pool.configure do |pool|
44
+ orphan_watcher = Resque::Pool::OrphanWatcher.new(pool)
45
+ pool.to_calculate_worker_offset { orphan_watcher.worker_offset }
46
+ end
47
+ end
48
+
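To make the effect of the offset concrete, the arithmetic used by `OrphanWatcher#worker_offset` and `WorkerTypeManager#maintain_worker_count` (both shown later in this diff) can be sketched as two small functions. The function names below are hypothetical and exist only for illustration; the formulas are copied from the diff.

```ruby
# Orphans are spread evenly across worker types using integer division,
# so the result rounds down.
def worker_offset(orphaned_count, worker_type_count)
  orphaned_count / worker_type_count
end

# Number of workers to start (positive) or quit (negative) for one worker
# type, after holding back `offset` slots for orphans that are still running.
def adjusted_delta(configured, running, offset)
  (configured - running) - offset
end

offset = worker_offset(5, 2)           # 5 orphans, 2 worker types
to_spawn = adjusted_delta(3, 0, offset) # 3 configured, none running yet
```

Because the division rounds down, a single orphan shared between two worker types yields an offset of 0, so in that case the new manager would not hold back any workers.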
data/README.md CHANGED
@@ -130,6 +130,11 @@ their current job) if the manager process disappears before them.
130
130
  You can specify an alternate config file by setting the `RESQUE_POOL_CONFIG` environment variable or
131
131
  with the `--config` command line option.
132
132
 
133
+ Experimental Features
134
+ ---------------------
135
+
136
+ See `ExperimentalFeatures.md` for some potentially useful works in progress.
137
+
133
138
  TODO
134
139
  -----
135
140
 
@@ -138,8 +143,8 @@ See [the TODO list](https://github.com/nevans/resque-pool/issues) at github issu
138
143
  Contributors
139
144
  -------------
140
145
 
146
+ * Jason Haruska from Backupify (pidfile management, memory management, wait for orphaned workers)
141
147
  * John Schult (config file can be split by environment)
142
148
  * Stephen Celis (increased gemspec sanity)
143
149
  * Vincent Agnello, Robert Kamunyori, Paul Kauders; for pairing with me at
144
- B'more on Rails Open Source Hack Nights. :)
145
-
150
+ [B'more on Rails](http://twitter.com/bmoreonrails) Open Source Hack Nights. :)
data/lib/resque/pool.rb CHANGED
@@ -2,7 +2,12 @@
2
2
  require 'resque'
3
3
  require 'resque/pool/version'
4
4
  require 'resque/pool/logging'
5
- require 'resque/pool/pooled_worker'
5
+ require 'resque/pool/manager'
6
+ require 'resque/pool/worker_type_manager'
7
+ # experimental features!
8
+ require 'resque/pool/memory_manager'
9
+ require 'resque/pool/orphan_watcher'
10
+
6
11
  require 'fcntl'
7
12
  require 'yaml'
8
13
 
@@ -23,26 +28,70 @@ module Resque
23
28
  procline "(initialized)"
24
29
  end
25
30
 
26
- # Config: after_prefork {{{
31
+ # Config: hooks {{{
32
+
33
+ # The +configure+ block will be run once during pool startup, after the
34
+ # internal initialization is complete, but before any workers are started
35
+ # up. This is for configuring any runtime callbacks that you want to
36
+ # customize (e.g. +after_manager_wakeup+, +to_calculate_worker_offset+)
37
+ #
38
+ # Call with a block to set the hook.
39
+ # Call with no arguments to return the hook.
40
+ def self.configure(&block)
41
+ block ? (@configure = block) : @configure
42
+ end
43
+
44
+ def call_configure!
45
+ self.class.configure && self.class.configure.call(self)
46
+ end
47
+
48
+ # The +after_manager_wakeup+ hook will be run in the pool manager every
49
+ # time the pool manager wakes up from IO.select (normally once a second).
50
+ # It will run immediately before the pool manager performs its normal
51
+ # worker maintenance (starting and stopping workers as necessary). If you
52
+ # have some special logic for killing workers that are taking too long or
53
+ # using too much memory, this is the place to put it.
54
+ #
55
+ # Call with a block to set the hook.
56
+ # Call with no arguments to return the hook.
57
+ def after_manager_wakeup(&block)
58
+ block ? (@after_manager_wakeup = block) : @after_manager_wakeup
59
+ end
60
+
61
+ def call_after_manager_wakeup!
62
+ after_manager_wakeup && after_manager_wakeup.call(self)
63
+ end
64
+
65
+ def to_calculate_worker_offset(&block)
66
+ block ? (@to_calculate_worker_offset = block) : @to_calculate_worker_offset
67
+ end
68
+
69
+ # deprecated by instance level +after_prefork+ hook
70
+ def self.after_prefork(&block)
71
+ #log '"Resque::Pool.after_prefork(&blk)" is deprecated.'
72
+ #log 'Please use "Resque::Pool.configure {|p| p.after_prefork(&blk) }" instead.'
73
+ block ? (@after_prefork = block) : @after_prefork
74
+ end
27
75
 
28
- # The `after_prefork` hook will be run in workers if you are using the
76
+ # The +after_prefork+ hook will be run in workers if you are using the
29
77
  # preforking master worker to save memory. Use this hook to reload
30
78
  # database connections and so forth to ensure that they're not shared
31
79
  # among workers.
32
80
  #
33
81
  # Call with a block to set the hook.
34
82
  # Call with no arguments to return the hook.
35
- def self.after_prefork(&block)
83
+ def after_prefork(&block)
36
84
  block ? (@after_prefork = block) : @after_prefork
37
85
  end
38
86
 
39
87
  # Set the after_prefork proc.
40
- def self.after_prefork=(after_prefork)
88
+ def after_prefork=(after_prefork)
41
89
  @after_prefork = after_prefork
42
90
  end
43
91
 
44
92
  def call_after_prefork!
45
- self.class.after_prefork && self.class.after_prefork.call
93
+ (after_prefork && after_prefork.call) ||
94
+ (self.class.after_prefork && self.class.after_prefork.call)
46
95
  end
47
96
 
48
97
  # }}}
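One subtlety in the new `call_after_prefork!` is worth spelling out: the instance hook and the deprecated class hook are joined with `||`, so the class hook is skipped only when the instance hook returns a truthy value. The sketch below reproduces just that expression in a hypothetical `HookDemo` class (not part of the gem) to show both outcomes.

```ruby
# Hypothetical class that copies the accessor idiom and the
# call_after_prefork! expression from this diff.
class HookDemo
  # Class-level (deprecated) hook accessor.
  def self.after_prefork(&block)
    block ? (@after_prefork = block) : @after_prefork
  end

  # Instance-level hook accessor.
  def after_prefork(&block)
    block ? (@after_prefork = block) : @after_prefork
  end

  def call_after_prefork!
    (after_prefork && after_prefork.call) ||
      (self.class.after_prefork && self.class.after_prefork.call)
  end
end

calls = []
HookDemo.after_prefork { calls << :class_hook }

demo = HookDemo.new
# This block ends with nil, so the || falls through to the class hook too.
demo.after_prefork { calls << :instance_hook; nil }
demo.call_after_prefork!
# calls is now [:instance_hook, :class_hook]
```

If only one of the two hooks should ever run, ending the instance block with a truthy value (or not defining the class-level hook at all) avoids the fall-through.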
@@ -93,7 +142,9 @@ module Resque
93
142
  end
94
143
 
95
144
  def environment
96
- if defined? RAILS_ENV
145
+ if defined? Rails
146
+ Rails.env
147
+ elsif defined? RAILS_ENV # keep compatibility with older versions of rails
97
148
  RAILS_ENV
98
149
  else
99
150
  ENV['RACK_ENV'] || ENV['RAILS_ENV'] || ENV['RESQUE_ENV']
@@ -144,10 +195,6 @@ module Resque
144
195
  end
145
196
  end
146
197
 
147
- def reset_sig_handlers!
148
- QUEUE_SIGS.each {|sig| trap(sig, "DEFAULT") }
149
- end
150
-
151
198
  def handle_sig_queue!
152
199
  case signal = sig_queue.shift
153
200
  when :USR1, :USR2, :CONT
@@ -185,6 +232,8 @@ module Resque
185
232
  # start, join, and master sleep {{{
186
233
 
187
234
  def start
235
+ procline("(configuring)")
236
+ call_configure!
188
237
  procline("(starting)")
189
238
  init_self_pipe!
190
239
  init_sig_handlers!
@@ -209,6 +258,7 @@ module Resque
209
258
  break if handle_sig_queue! == :break
210
259
  if sig_queue.empty?
211
260
  master_sleep
261
+ call_after_manager_wakeup!
212
262
  maintain_worker_count
213
263
  end
214
264
  procline("managing #{all_pids.inspect}")
@@ -242,6 +292,8 @@ module Resque
242
292
  # TODO: close any file descriptors connected to worker, if any
243
293
  log "Reaped resque worker[#{status.pid}] (status: #{status.exitstatus}) queues: #{worker.queues.join(",")}"
244
294
  end
295
+ rescue Errno::EINTR
296
+ retry
245
297
  rescue Errno::ECHILD, QuitNowException
246
298
  end
247
299
  end
@@ -265,64 +317,20 @@ module Resque
265
317
  end
266
318
 
267
319
  # }}}
268
- # ???: maintain_worker_count, all_known_queues {{{
320
+ # maintain_worker_count, all_known_worker_types, worker_offset {{{
269
321
 
270
322
  def maintain_worker_count
271
- all_known_queues.each do |queues|
272
- delta = worker_delta_for(queues)
273
- spawn_missing_workers_for(queues) if delta > 0
274
- quit_excess_workers_for(queues) if delta < 0
323
+ all_known_worker_types.each do |queues|
324
+ WorkerTypeManager.new(self, queues).maintain_worker_count(worker_offset)
275
325
  end
276
326
  end
277
327
 
278
- def all_known_queues
328
+ def all_known_worker_types
279
329
  config.keys | workers.keys
280
330
  end
281
331
 
282
- # }}}
283
- # methods that operate on a single grouping of queues {{{
284
- # perhaps this means a class is waiting to be extracted
285
-
286
- def spawn_missing_workers_for(queues)
287
- worker_delta_for(queues).times do |nr|
288
- spawn_worker!(queues)
289
- end
290
- end
291
-
292
- def quit_excess_workers_for(queues)
293
- delta = -worker_delta_for(queues)
294
- pids_for(queues)[0...delta].each do |pid|
295
- Process.kill("QUIT", pid)
296
- end
297
- end
298
-
299
- def worker_delta_for(queues)
300
- config.fetch(queues, 0) - workers.fetch(queues, []).size
301
- end
302
-
303
- def pids_for(queues)
304
- workers[queues].keys
305
- end
306
-
307
- def spawn_worker!(queues)
308
- worker = create_worker(queues)
309
- pid = fork do
310
- log_worker "Starting worker #{worker}"
311
- call_after_prefork!
312
- reset_sig_handlers!
313
- #self_pipe.each {|io| io.close }
314
- worker.work(ENV['INTERVAL'] || DEFAULT_WORKER_INTERVAL) # interval, will block
315
- end
316
- workers[queues] ||= {}
317
- workers[queues][pid] = worker
318
- end
319
-
320
- def create_worker(queues)
321
- queues = queues.to_s.split(',')
322
- worker = PooledWorker.new(*queues)
323
- worker.verbose = ENV['LOGGING'] || ENV['VERBOSE']
324
- worker.very_verbose = ENV['VVERBOSE']
325
- worker
332
+ def worker_offset
333
+ to_calculate_worker_offset && to_calculate_worker_offset.call || 0
326
334
  end
327
335
 
328
336
  # }}}
@@ -99,7 +99,7 @@ where [options] are:
99
99
  def setup_environment(opts)
100
100
  ENV["RACK_ENV"] = ENV["RAILS_ENV"] = ENV["RESQUE_ENV"] = opts[:environment] if opts[:environment]
101
101
  log "Resque Pool running in #{ENV["RAILS_ENV"] || "development"} environment"
102
- ENV["RESQUE_POOL_CONFIG"] = opts[:config] if opts[:config]
102
+ ENV["RESQUE_POOL_CONFIG"] = opts[:config].to_s if opts[:config]
103
103
  end
104
104
 
105
105
  def start_pool
@@ -0,0 +1,11 @@
1
+ # TODO: reorganize code that is currently in resque/pool.rb into this file
2
+ require 'resque/pool/logging'
3
+
4
+ module Resque
5
+ class Pool
6
+ class Manager
7
+ include Logging
8
+
9
+ end
10
+ end
11
+ end
@@ -0,0 +1,118 @@
1
+ require 'resque/pool/logging'
2
+
3
+ module Resque
4
+ class Pool
5
+ class MemoryManager
6
+ include Logging
7
+
8
+ attr_reader :hard_limit, :soft_limit
9
+
10
+ def initialize(hard_limit=250, soft_limit=200)
11
+ @hard_limit = hard_limit
12
+ @soft_limit = soft_limit
13
+ end
14
+
15
+ def monitor_memory_usage(pool_manager)
16
+ #only check every minute
17
+ if @last_mem_check.nil? || @last_mem_check < Time.now - 60
18
+ hard_kill_workers
19
+ pool_manager.all_pids.each do |pid|
20
+ total_usage = memory_usage(pid)
21
+ child_pid = find_child_pid(pid)
22
+ total_usage += memory_usage(child_pid) if child_pid
23
+
24
+ if total_usage > hard_limit
25
+ log "Terminating worker #{pid} for using #{total_usage}MB memory"
26
+ stop_worker(pid)
27
+ elsif total_usage > soft_limit
28
+ log "Gracefully shutting down worker #{pid} for using #{total_usage}MB memory"
29
+ stop_worker(pid, :QUIT)
30
+ end
31
+ end
32
+ @last_mem_check = Time.now
33
+ end
34
+ end
35
+
36
+ private
37
+
38
+ def add_killed_worker(pid)
39
+ @term_workers ||= []
40
+ @term_workers << pid if pid
41
+ end
42
+
43
+ def find_child_pid(parent_pid)
44
+ begin
45
+ p = `ps --ppid #{parent_pid} -o pid --no-header`.to_i
46
+ p == 0 ? nil : p
47
+ rescue Errno::EINTR
48
+ retry
49
+ end
50
+ end
51
+
52
+ def hard_kill_workers
53
+ @term_workers ||= []
54
+ #look for workers that didn't terminate
55
+ @term_workers.delete_if {|pid| !process_exists?(pid)}
56
+ #send the rest a -9
57
+ @term_workers.each {|pid| `kill -9 #{pid}`}
58
+ end
59
+
60
+ def hostname
61
+ begin
62
+ @hostname ||= `hostname`.strip
63
+ rescue Errno::EINTR
64
+ retry
65
+ end
66
+ end
67
+
68
+ def memory_usage(pid)
69
+ smaps_filename = "/proc/#{pid}/smaps"
70
+ #Grab actual memory usage from proc in MB
71
+ begin
72
+ mem_usage = `
73
+ if [ -f #{smaps_filename} ];
74
+ then
75
+ grep Private_Dirty #{smaps_filename} | awk '{s+=$2} END {printf("%d", s/1000)}'
76
+ else echo "0"
77
+ fi
78
+ `.to_i
79
+ rescue Errno::EINTR
80
+ retry
81
+ end
82
+ end
83
+
84
+ # TODO: DRY up this and the pidfile checker in Resque::Pool::CLI
85
+ def process_exists?(pid)
86
+ begin
87
+ ps_line = `ps -p #{pid} --no-header`
88
+ rescue Errno::EINTR
89
+ retry
90
+ end
91
+ !ps_line.nil? && ps_line.strip != ''
92
+ end
93
+
94
+ def stop_worker(pid, signal=:TERM)
95
+ begin
96
+ worker = Resque.working.find do |w|
97
+ host, worker_pid, queues = w.id.split(':')
98
+ w if worker_pid.to_i == pid.to_i && host == hostname
99
+ end
100
+ if worker
101
+ encoded_job = worker.job
102
+ verb = signal == :QUIT ? 'Graceful' : 'Forcing'
103
+ total_time = Time.now - Time.parse(encoded_job['run_at']) rescue 0
104
+ log "#{verb} shutdown while processing: #{encoded_job} -- ran for #{'%.2f' % total_time}s"
105
+ end
106
+ Process.kill signal, pid
107
+ if signal == :TERM
108
+ add_killed_worker(pid)
109
+ add_killed_worker(find_child_pid(pid))
110
+ end
111
+ rescue Errno::EINTR
112
+ retry
113
+ end
114
+ end
115
+
116
+ end
117
+ end
118
+ end
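The `memory_usage` method above shells out to `grep` and `awk` to total the `Private_Dirty` fields of `/proc/<pid>/smaps`. As an illustration of what that pipeline computes, here is a pure-Ruby sketch of the same sum. The method names are hypothetical, and this is an alternative formulation rather than the gem's code; it keeps the original's divide-by-1000 conversion from kB to MB.

```ruby
# Sum every Private_Dirty field (reported in kB) in an smaps dump and
# convert to whole MB, truncating like the awk printf("%d", s/1000).
def private_dirty_mb(smaps_text)
  total_kb = smaps_text.each_line.sum do |line|
    match = line.match(/\APrivate_Dirty:\s+(\d+)\s+kB/)
    match ? match[1].to_i : 0
  end
  total_kb / 1000
end

# Returns 0 when the smaps file is missing (for example on non-Linux
# systems, or when the process has exited) or cannot be read.
def memory_usage_mb(pid)
  path = "/proc/#{pid}/smaps"
  File.file?(path) ? private_dirty_mb(File.read(path)) : 0
rescue SystemCallError
  0
end
```

Reading the file directly avoids spawning a shell per worker on every check, at the cost of being just as Linux-specific as the original.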
@@ -0,0 +1,49 @@
1
+ require 'resque/pool/logging'
2
+
3
+ module Resque
4
+ class Pool
5
+ class OrphanWatcher
6
+ include Logging
7
+
8
+ attr_reader :pool_manager
9
+
10
+ def initialize(pool_manager)
11
+ @pool_manager = pool_manager
12
+ end
13
+
14
+ def worker_offset
15
+ orphaned_worker_count / pool_manager.all_known_worker_types.size
16
+ end
17
+
18
+ def orphaned_worker_count
19
+ if @last_orphaned_check.nil? || @last_orphaned_check < Time.now - 60
20
+ if @orphaned_pids.nil?
21
+ begin
22
+ pids_with_parents = `ps -Af | grep resque | grep -v grep | grep -v resque-web | grep -v master | awk '{printf("%d %d\\n", $2, $3)}'`.split("\n")
23
+ rescue Errno::EINTR
24
+ retry
25
+ end
26
+ pids = pids_with_parents.collect {|x| x.split[0].to_i}
27
+ parents = pids_with_parents.collect {|x| x.split[1].to_i}
28
+ pids.delete_if {|x| parents.include?(x)}
29
+ pids.delete_if {|x| pool_manager.all_pids.include?(x)}
30
+ @orphaned_pids = pids
31
+ elsif @orphaned_pids.size > 0
32
+ @orphaned_pids.delete_if do |pid|
33
+ begin
34
+ ps_out = `ps --no-heading p #{pid}`
35
+ ps_out.nil? || ps_out.strip == ''
36
+ rescue Errno::EINTR
37
+ retry
38
+ end
39
+ end
40
+ end
41
+ @last_orphaned_check = Time.now
42
+ log "Current orphaned pids: #{@orphaned_pids}" if @orphaned_pids.size > 0
43
+ end
44
+ @orphaned_pids.size
45
+ end
46
+
47
+ end
48
+ end
49
+ end
@@ -1,5 +1,5 @@
1
1
  module Resque
2
2
  class Pool
3
- VERSION = "0.2.0"
3
+ VERSION = "0.3.0.beta.1"
4
4
  end
5
5
  end
@@ -0,0 +1,102 @@
1
+ require 'resque/pool/logging'
2
+ require 'resque/pool/pooled_worker'
3
+
4
+ module Resque
5
+ class Pool
6
+ class WorkerTypeManager
7
+ include Logging
8
+
9
+ attr_reader :pool_manager, :queues, :queue_array
10
+
11
+ def initialize(pool_manager, queues)
12
+ @pool_manager = pool_manager
13
+ @queues = queues
14
+ @queue_array = queues.to_s.split(',')
15
+ end
16
+
17
+ # TODO: Pool manager will hold onto WorkerTypeManagers, and will push the
18
+ # configuration directly into the WorkerTypeManagers
19
+ def configuration
20
+ { :count => pool_manager.config.fetch(queues, 0), }
21
+ end
22
+
23
+ def maintain_worker_count(offset)
24
+ delta = worker_delta - offset
25
+ spawn_missing_workers_for(delta) if delta > 0
26
+ quit_excess_workers_for(delta) if delta < 0
27
+ end
28
+
29
+ def pids
30
+ running_workers.keys
31
+ end
32
+
33
+ # TODO: Pool manager will hold onto WorkerTypeManagers,
34
+ # and WorkerTypeManager will store the workers itself
35
+ def running_workers
36
+ pool_manager.workers.fetch(queues, {})
37
+ end
38
+
39
+ def worker_delta
40
+ configuration[:count] - running_workers.size
41
+ end
42
+
43
+ private
44
+
45
+ def create_worker
46
+ worker = PooledWorker.new(*queue_array)
47
+ worker.verbose = ENV['LOGGING'] || ENV['VERBOSE']
48
+ worker.very_verbose = ENV['VVERBOSE']
49
+ worker
50
+ end
51
+
52
+ def quit_excess_workers_for(delta)
53
+ if delta < 0
54
+ queue_pids = pids.clone
55
+ if queue_pids.size >= delta.abs
56
+ queue_pids[0...delta.abs].each {|pid| Process.kill("QUIT", pid)}
57
+ else
58
+ queue_pids.each {|pid| Process.kill("QUIT", pid)}
59
+ end
60
+ end
61
+ end
62
+
63
+ # TODO: Pool manager will hold onto WorkerTypeManagers,
64
+ # and WorkerTypeManager will store the pids itself
65
+ def register_new_worker_with_manager(worker, pid)
66
+ pool_manager.workers[queues] ||= {}
67
+ pool_manager.workers[queues][pid] = worker
68
+ end
69
+
70
+ def reset_sig_handlers!
71
+ QUEUE_SIGS.each {|sig| trap(sig, "DEFAULT") }
72
+ end
73
+
74
+ def spawn_missing_workers_for(delta)
75
+ delta.times { spawn_worker! } if delta > 0
76
+ end
77
+
78
+ def spawn_worker!
79
+ worker = create_worker
80
+ pid = fork do
81
+ start_forked_worker worker
82
+ end
83
+ register_new_worker_with_manager worker, pid
84
+ end
85
+
86
+ def start_forked_worker(worker)
87
+ reset_sig_handlers!
88
+ log_worker "Starting worker #{worker}"
89
+ pool_manager.call_after_prefork!
90
+ begin
91
+ worker.work(ENV['INTERVAL'] || DEFAULT_WORKER_INTERVAL) # interval, will block
92
+ rescue Errno::EINTR
93
+ log_worker "Caught interrupted system call Errno::EINTR. Retrying."
94
+ retry
95
+ end
96
+ end
97
+
98
+
99
+ end
100
+ end
101
+ end
102
+
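The pid selection in `quit_excess_workers_for` can be read as "take at most `delta.abs` pids from the front of the list". A minimal sketch of that selection as a pure function follows; `pids_to_quit` is a hypothetical name used only here, and no signals are sent.

```ruby
# Pick which worker pids would receive QUIT for a given delta.
# A non-negative delta means there is nothing to quit.
def pids_to_quit(pids, delta)
  return [] unless delta < 0
  # Array#first(n) returns at most n elements, so asking for more pids
  # than are running simply returns all of them.
  pids.first(delta.abs)
end

pids_to_quit([101, 102, 103], -2) # => [101, 102]
pids_to_quit([101], -3)           # => [101]
```

Since slicing past the end of an array already returns every element, the two branches in the original method select the same pids; the sketch folds them into one call.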
metadata CHANGED
@@ -1,13 +1,15 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: resque-pool
3
3
  version: !ruby/object:Gem::Version
4
- hash: 23
5
- prerelease: false
4
+ hash: 62196369
5
+ prerelease: true
6
6
  segments:
7
7
  - 0
8
- - 2
8
+ - 3
9
9
  - 0
10
- version: 0.2.0
10
+ - beta
11
+ - 1
12
+ version: 0.3.0.beta.1
11
13
  platform: ruby
12
14
  authors:
13
15
  - nicholas a. evans
@@ -179,11 +181,16 @@ files:
179
181
  - lib/resque/pool/tasks.rb
180
182
  - lib/resque/pool/cli.rb
181
183
  - lib/resque/pool/version.rb
184
+ - lib/resque/pool/manager.rb
182
185
  - lib/resque/pool/pooled_worker.rb
186
+ - lib/resque/pool/memory_manager.rb
187
+ - lib/resque/pool/orphan_watcher.rb
188
+ - lib/resque/pool/worker_type_manager.rb
183
189
  - lib/resque/pool/logging.rb
184
190
  - config/cucumber.yml
185
191
  - config/alternate.yml
186
192
  - Gemfile
193
+ - ExperimentalFeatures.md
187
194
  has_rdoc: true
188
195
  homepage: http://github.com/nevans/resque-pool
189
196
  licenses: []
@@ -205,12 +212,14 @@ required_ruby_version: !ruby/object:Gem::Requirement
205
212
  required_rubygems_version: !ruby/object:Gem::Requirement
206
213
  none: false
207
214
  requirements:
208
- - - ">="
215
+ - - ">"
209
216
  - !ruby/object:Gem::Version
210
- hash: 3
217
+ hash: 25
211
218
  segments:
212
- - 0
213
- version: "0"
219
+ - 1
220
+ - 3
221
+ - 1
222
+ version: 1.3.1
214
223
  requirements: []
215
224
 
216
225
  rubyforge_project: