resque_stuck_queue 0.1.1 → 0.2.0
- checksums.yaml +4 -4
- data/README.md +27 -6
- data/lib/resque_stuck_queue.rb +22 -24
- data/lib/resque_stuck_queue/config.rb +3 -3
- data/lib/resque_stuck_queue/heartbeat_job.rb +14 -0
- data/lib/resque_stuck_queue/version.rb +1 -1
- data/test/test_helper.rb +0 -1
- data/test/test_integration.rb +1 -2
- data/test/test_lagtime.rb +0 -1
- data/test/test_set_custom_refresh_job.rb +3 -3
- metadata +3 -4
- data/test/resque/refresh_latest_timestamp.rb +0 -8
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8441e82a5f962f49c9de740fd9a843c50b3125458c69f08598b210774b2920f6bac676f87a748c095aafdcc86799934229a03ba5c2cc0c8a906f3834f484cc7c
+  data.tar.gz: 824e77bbf4ab0c7fa6b150e69ed0a4845d4e6a18022715b51880e60af7ba189be704277d0776aa91bbe39003b09f5b8f22895421484b10bb6c454b73f77391d7
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ec5948be0403dbafd9958e4dbb650f88ebd15de2
+  data.tar.gz: a97a2968698d72fa37b14d1ba3eabe7e9345d9cf
data/README.md
CHANGED
@@ -8,6 +8,8 @@ This is to be used to satisfy an ops problem. There have been cases resque proce

 If resque doesn't run jobs in specific queues (defaults to `@queue = :app`) within a certain timeframe, it will trigger a pre-defined handler of your choice. You can use this to send an email, pager duty, add more resque workers, restart resque, send you a txt...whatever suits you.

+It will also fire a proc to notify you when it's recovered.
+
 ## How it works

 When you call `start` you are essentially starting two threads that will continuously run until `stop` is called or until the process shuts down.
@@ -16,18 +18,33 @@ One thread is responsible for pushing a 'heartbeat' job to resque which will ess

 The other thread is a continuous loop that will check redis (bypassing resque) for that key and check the latest time the heartbeat job successfully updated that key.

-
+StuckQueue will trigger a pre-defined proc if the queue is lagging according to the times you've configured (see below).
+
+After firing the proc, it will continue to monitor the queue, but won't call the proc again until the queue is found to be good again (it will then call a different "recovered" handler).
+
+By calling the recovered proc, it will then complain again the next time the lag is found.

-##
+## Configuration Options

-Configure it first
+Configure it first via something like:

 <pre>
-
+  Resque::StuckQueue.config[:triggered_handler] = proc { send_email }
+</pre>
+
+Configuration settings are below. You'll most likely at the least want to tune `:triggered_handler`,`:heartbeat` and `:trigger_timeout` settings.
+
+<pre>
+triggered_handler:
   set to what gets triggered when resque-stuck-queue will detect the latest heartbeat is older than the trigger_timeout time setting.
   Example:
   Resque::StuckQueue.config[:triggered_handler] = proc { |queue_name, lagtime| send_email('queue #{queue_name} isnt working, aaah the daemons') }

+recovered_handler:
+  set to what gets triggered when resque-stuck-queue has triggered a problem, but then detects the queue went back down to functioning well again (it won't trigger again until it has recovered).
+  Example:
+  Resque::StuckQueue.config[:recovered_handler] = proc { |queue_name, lagtime| send_email('phew, queue #{queue_name} is ok') }
+
 heartbeat:
   set to how often to push that 'heartbeat' job to refresh the latest time it worked.
   Example:
@@ -44,6 +61,9 @@ redis:
 heartbeat_key:
   optional, name of keys to keep track of the last good resque heartbeat time

+triggered_key:
+  optional, name of keys to keep track of the last trigger time
+
 logger:
   optional, pass a Logger. Default a ruby logger will be instantiated. Needs to respond to that interface.

@@ -55,9 +75,10 @@ abort_on_exception:

 refresh_job:
   optional, your own custom refreshing job. if you are using something other than resque
+
 </pre>

-
+To start it:

 <pre>
 Resque::StuckQueue.start # blocking
@@ -121,7 +142,7 @@ class CustomJob
   end
 end

-Resque::StuckQueue.config[:
+Resque::StuckQueue.config[:heartbeat_job] = proc {
   # or however else you enque your custom job, Sidekiq::Client.enqueue(CustomJob), whatever, etc.
   CustomJob.perform_async
 }
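The README text in this diff describes a "trigger once, then wait for recovery" behaviour: the triggered handler fires when the lag exceeds the timeout, stays quiet while the queue remains stuck, and the recovered handler fires once the queue is healthy again. The sketch below illustrates that state machine in isolation. It is not the gem's implementation: `LagMonitor`, `check`, and the in-memory `@triggered` flag are hypothetical names chosen for this example, and no Redis or Resque is involved.

```ruby
# Hypothetical, standalone sketch of "trigger once, then wait for recovery".
class LagMonitor
  def initialize(trigger_timeout:, triggered_handler:, recovered_handler:)
    @trigger_timeout   = trigger_timeout
    @triggered_handler = triggered_handler
    @recovered_handler = recovered_handler
    @triggered         = false # assumption: state kept in memory for the sketch
  end

  # lag_seconds: seconds since the last successful heartbeat for queue_name.
  def check(queue_name, lag_seconds)
    if lag_seconds >= @trigger_timeout && !@triggered
      @triggered = true
      @triggered_handler.call(queue_name, lag_seconds)
    elsif lag_seconds < @trigger_timeout && @triggered
      @triggered = false
      @recovered_handler.call(queue_name, lag_seconds)
    end
  end
end

events = []
monitor = LagMonitor.new(
  trigger_timeout: 60,
  triggered_handler: proc { |queue, lag| events << [:triggered, queue, lag] },
  recovered_handler: proc { |queue, lag| events << [:recovered, queue, lag] }
)

# Healthy, stuck, still stuck (no second alert), recovered, stuck again.
[10, 75, 90, 5, 80].each { |lag| monitor.check(:app, lag) }
```

Both handlers receive the queue name and the lag, mirroring the `proc { |queue_name, lagtime| ... }` shape used in the README examples.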
data/lib/resque_stuck_queue.rb
CHANGED
@@ -1,5 +1,6 @@
 require "resque_stuck_queue/version"
 require "resque_stuck_queue/config"
+require "resque_stuck_queue/heartbeat_job"

 # TODO move this require into a configurable?
 require 'resque'
@@ -75,7 +76,7 @@ module Resque

 Redis::Classy.db = redis if Redis::Classy.db.nil?

-
+enqueue_repeating_heartbeat_job
 setup_checker_thread

 # fo-eva.
@@ -121,14 +122,14 @@ module Resque

 private

-def
+def enqueue_repeating_heartbeat_job
   @threads << Thread.new do
     Thread.current.abort_on_exception = config[:abort_on_exception]
     logger.info("Starting heartbeat thread")
     while @running
       # we want to go through resque jobs, because that's what we're trying to test here:
       # ensure that jobs get executed and the time is updated!
-      logger.info("Sending
+      logger.info("Sending heartbeat jobs")
       enqueue_jobs
       wait_for_it
     end
@@ -136,12 +137,14 @@ module Resque
 end

 def enqueue_jobs
-  if config[:
-    # FIXME config[:
-    config[:
+  if config[:heartbeat_job]
+    # FIXME config[:heartbeat_job] with multiple queues is bad semantics
+    config[:heartbeat_job].call
   else
     queues.each do |queue_name|
-      Resque.enqueue_to(queue_name,
+      Resque.enqueue_to(queue_name, HeartbeatJob, [heartbeat_key_for(queue_name), redis.client.host, redis.client.port])
+      queue_name = :snapshot_progress
+      Resque.enqueue_to(queue_name, HeartbeatJob, [Resque::StuckQueue.heartbeat_key_for(queue_name), Resque.redis.client.host, Resque.redis.client.port])
     end
   end
 end
@@ -155,17 +158,10 @@ module Resque
 if mutex.lock
   begin
     queues.each do |queue_name|
-
-      if triggered_ago = last_triggered(queue_name)
-        logger.info("Last triggered for #{queue_name} is #{triggered_ago.inspect} seconds.")
-      else
-        logger.info("No last trigger found for #{queue_name}.")
-      end
+      log_checker_info(queue_name)
       if should_trigger?(queue_name)
-        logger.info("Triggering :triggered handler for #{queue_name} at #{Time.now}.")
         trigger_handler(queue_name, :triggered)
       elsif should_recover?(queue_name)
-        logger.info("Triggering :recovered handler for #{queue_name} at #{Time.now}.")
         trigger_handler(queue_name, :recovered)
       end
     end
@@ -235,6 +231,7 @@ module Resque
 def trigger_handler(queue_name, type)
   raise 'Must trigger either the recovered or triggered handler!' unless (type == :recovered || type == :triggered)
   handler_name = :"#{type}_handler"
+  logger.info("Triggering #{type} handler for #{queue_name} at #{Time.now}.")
   (config[handler_name] || const_get(handler_name.upcase)).call(queue_name, lag_time(queue_name))
   manual_refresh(queue_name, type)
 rescue => e
@@ -243,6 +240,16 @@ module Resque
   force_stop!
 end

+def log_checker_info(queue_name)
+  logger.info("Lag time for #{queue_name} is #{lag_time(queue_name).inspect} seconds.")
+  if triggered_ago = last_triggered(queue_name)
+    logger.info("Last triggered for #{queue_name} is #{triggered_ago.inspect} seconds.")
+  else
+    logger.info("No last trigger found for #{queue_name}.")
+  end
+
+end
+
 def read_from_redis(keyname)
   redis.get(keyname)
 end
@@ -258,12 +265,3 @@ module Resque
   end
 end

-class RefreshLatestTimestamp
-  def self.perform(args)
-    timestamp_key = args[0]
-    host = args[1]
-    port = args[2]
-    r = Redis.new(:host => host, :port => port)
-    r.set(timestamp_key, Time.now.to_i)
-  end
-end
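The `enqueue_jobs` change above branches between a user-supplied `:heartbeat_job` proc and a default per-queue enqueue. The following sketch isolates that branch to show why the FIXME notes that a custom proc sits awkwardly with multiple queues: the proc runs once per heartbeat, not once per queue. `enqueue_heartbeats` and `default_enqueuer` are hypothetical names for this illustration; nothing here talks to Resque.

```ruby
# Hypothetical stand-in for the branch in enqueue_jobs; no Resque involved.
def enqueue_heartbeats(config, queues, default_enqueuer)
  if config[:heartbeat_job]
    # Custom proc: called once, regardless of how many queues are monitored.
    config[:heartbeat_job].call
  else
    # Default path: one heartbeat per monitored queue.
    queues.each { |queue_name| default_enqueuer.call(queue_name) }
  end
end

sent = []
default_enqueuer = proc { |queue_name| sent << [:default, queue_name] }

# Without a custom job, every queue gets its own heartbeat.
enqueue_heartbeats({}, [:app, :mail], default_enqueuer)

# With a custom job, the proc decides what gets enqueued.
custom = { heartbeat_job: proc { sent << [:custom] } }
enqueue_heartbeats(custom, [:app, :mail], default_enqueuer)
```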
data/lib/resque_stuck_queue/config.rb
CHANGED
@@ -15,8 +15,8 @@ module Resque
 class Config < Hash

   OPTIONS_DESCRIPTIONS = {
-    :triggered_handler
-    :recovered_handler => "set to what gets triggered when resque-stuck-queue has triggered a problem, but then detects the queue went back down to functioning well again(
+    :triggered_handler => "set to what gets triggered when resque-stuck-queue will detect the latest heartbeat is older than the trigger_timeout time setting.\n\tExample:\n\tResque::StuckQueue.config[:triggered_handler] = proc { |queue_name, lagtime| send_email('queue \#{queue_name} isnt working, aaah the daemons') }",
+    :recovered_handler => "set to what gets triggered when resque-stuck-queue has triggered a problem, but then detects the queue went back down to functioning well again(it wont trigger again until it has recovered).\n\tExample:\n\tResque::StuckQueue.config[:recovered_handler] = proc { |queue_name, lagtime| send_email('phew, queue \#{queue_name} is ok') }",
     :heartbeat => "set to how often to push that 'heartbeat' job to refresh the latest time it worked.\n\tExample:\n\tResque::StuckQueue.config[:heartbeat] = 5.minutes",
     :trigger_timeout => "set to how much of a resque work lag you are willing to accept before being notified. note: take the :heartbeat setting into account when setting this timeout.\n\tExample:\n\tResque::StuckQueue.config[:trigger_timeout] = 55.minutes",
     :redis => "set the Redis instance StuckQueue will use",
@@ -25,7 +25,7 @@ module Resque
     :logger => "optional, pass a Logger. Default a ruby logger will be instantiated. Needs to respond to that interface.",
     :queues => "optional, monitor specific queues you want to send a heartbeat/monitor to. default is :app",
     :abort_on_exception => "optional, if you want the resque-stuck-queue threads to explicitly raise, default is false",
-    :
+    :heartbeat_job => "optional, your own custom refreshing job. if you are using something other than resque",
   }

   OPTIONS = OPTIONS_DESCRIPTIONS.keys
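The hunk above shows a `Hash` subclass with a table of option descriptions and an `OPTIONS` list derived from its keys. One thing such a table makes possible is rejecting unknown option names at assignment time, which matters when an option is renamed. The sketch below shows that idea only; whether the gem itself validates keys this way is not visible in this diff, and `SketchConfig`, `description_for`, and the shortened descriptions are hypothetical.

```ruby
# Hypothetical sketch: a Hash subclass that only accepts documented options.
class SketchConfig < Hash
  OPTIONS_DESCRIPTIONS = {
    :heartbeat       => "how often to push the heartbeat job",
    :trigger_timeout => "how much lag to accept before notifying",
    :heartbeat_job   => "optional custom heartbeat proc"
  }

  OPTIONS = OPTIONS_DESCRIPTIONS.keys

  # Reject keys that are not in the documented option list.
  def []=(key, value)
    raise ArgumentError, "unknown option: #{key.inspect}" unless OPTIONS.include?(key)
    super
  end

  def description_for(key)
    OPTIONS_DESCRIPTIONS.fetch(key)
  end
end

config = SketchConfig.new
config[:heartbeat] = 300

# An undocumented (or stale) option name fails loudly instead of being ignored.
rejected = begin
  config[:not_an_option] = proc {}
  false
rescue ArgumentError
  true
end
```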
data/lib/resque_stuck_queue/heartbeat_job.rb
ADDED
@@ -0,0 +1,14 @@
+module Resque
+  module StuckQueue
+    class HeartbeatJob
+      def self.perform(args)
+        timestamp_key = args[0]
+        host = args[1]
+        port = args[2]
+        new_time = Time.now.to_i
+        r = Redis.new(:host => host, :port => port)
+        r.set(timestamp_key, new_time)
+      end
+    end
+  end
+end
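`HeartbeatJob.perform` above writes the current Unix time under the heartbeat key, and the checker thread later compares that value against the clock. The sketch below shows that round trip with an in-memory store so it runs without Redis. `FakeStore`, `write_heartbeat`, and `lag_seconds` are hypothetical names, and the lag calculation is an assumption about how a checker could use the stored value, not the gem's `lag_time` method.

```ruby
# In-memory stand-in for Redis (hypothetical; only get/set are modelled).
class FakeStore
  def initialize
    @data = {}
  end

  # Values are stored as strings, as a Redis GET would return them.
  def set(key, value)
    @data[key] = value.to_s
  end

  def get(key)
    @data[key]
  end
end

# Same shape as HeartbeatJob.perform, but writing to an injected store.
def write_heartbeat(store, timestamp_key, now = Time.now)
  store.set(timestamp_key, now.to_i)
end

# Seconds since the last heartbeat, or nil if none was ever recorded.
def lag_seconds(store, timestamp_key, now = Time.now)
  last = store.get(timestamp_key)
  last && now.to_i - last.to_i
end

store = FakeStore.new
start = Time.at(1_000)

write_heartbeat(store, "app:heartbeat", start)
lag     = lag_seconds(store, "app:heartbeat", start + 90)
missing = lag_seconds(store, "mail:heartbeat", start + 90)
```

Passing the clock in as an argument keeps the example deterministic; a real job would simply use `Time.now`.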
data/test/test_helper.rb
CHANGED
@@ -7,7 +7,6 @@ require "mocha/mini_test"
 $:.unshift(".")
 require 'resque_stuck_queue'
 require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "set_redis_key")
-require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "refresh_latest_timestamp")

 module TestHelper

data/test/test_integration.rb
CHANGED
@@ -6,7 +6,6 @@ require 'pry'
 $:.unshift(".")
 require 'resque_stuck_queue'
 require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "set_redis_key")
-require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "refresh_latest_timestamp")
 require File.join(File.expand_path(File.dirname(__FILE__)), "test_helper")

 class TestIntegration < Minitest::Test
@@ -88,7 +87,7 @@ class TestIntegration < Minitest::Test
 Resque::StuckQueue.config[:heartbeat] = 1

 begin
-  Resque::StuckQueue.config[:
+  Resque::StuckQueue.config[:heartbeat_job] = proc { Resque.enqueue_to(:app, Resque::StuckQueue::HeartbeatJob, Resque::StuckQueue.heartbeat_key_for(:app)) }
   @triggered = false
   Resque::StuckQueue.config[:triggered_handler] = proc { @triggered = true }
   start_and_stop_loops_after(4)
data/test/test_lagtime.rb
CHANGED
@@ -5,7 +5,6 @@ require 'pry'
 $:.unshift(".")
 require 'resque_stuck_queue'
 require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "set_redis_key")
-require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "refresh_latest_timestamp")
 require File.join(File.expand_path(File.dirname(__FILE__)), "test_helper")

 class TestLagTime < Minitest::Test
data/test/test_set_custom_refresh_job.rb
CHANGED
@@ -9,7 +9,7 @@ class TestYourOwnRefreshJob < Minitest::Test
   Resque::StuckQueue.config[:trigger_timeout] = 1
   Resque::StuckQueue.config[:heartbeat] = 1
   Resque::StuckQueue.config[:abort_on_exception] = true
-  Resque::StuckQueue.config[:
+  Resque::StuckQueue.config[:heartbeat_job] = nil
   Resque::StuckQueue.redis = Redis.new
   Resque::StuckQueue.redis.flushall
 end
@@ -17,7 +17,7 @@ class TestYourOwnRefreshJob < Minitest::Test
 def test_will_trigger_with_unrefreshing_custom_heartbeat_job
   # it will trigger because the key will be unrefreshed, hence 'old' and will always trigger.
   puts "#{__method__}"
-  Resque::StuckQueue.config[:
+  Resque::StuckQueue.config[:heartbeat_job] = proc { nil } # does not refresh global key
   @triggered = false
   Resque::StuckQueue.config[:triggered_handler] = proc { @triggered = true }
   start_and_stop_loops_after(3)
@@ -27,7 +27,7 @@ class TestYourOwnRefreshJob < Minitest::Test
 def test_will_fail_with_bad_custom_heartbeat_job
   puts "#{__method__}"
   begin
-    Resque::StuckQueue.config[:
+    Resque::StuckQueue.config[:heartbeat_job] = proc { raise 'bad proc doc' } # does not refresh global key
     @triggered = false
     Resque::StuckQueue.config[:triggered_handler] = proc { @triggered = true }
     start_and_stop_loops_after(3)
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: resque_stuck_queue
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.2.0
 platform: ruby
 authors:
 - Shai Rosenfeld
@@ -9,7 +9,7 @@ autorequire:
 bindir: bin
 cert_chain: []

-date: 2014-01-
+date: 2014-01-29 00:00:00 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: redis-mutex
@@ -59,9 +59,9 @@ files:
 - lib/resque/stuck_queue.rb
 - lib/resque_stuck_queue.rb
 - lib/resque_stuck_queue/config.rb
+- lib/resque_stuck_queue/heartbeat_job.rb
 - lib/resque_stuck_queue/version.rb
 - resque_stuck_queue.gemspec
-- test/resque/refresh_latest_timestamp.rb
 - test/resque/set_redis_key.rb
 - test/test_collision.rb
 - test/test_config.rb
@@ -96,7 +96,6 @@ signing_key:
 specification_version: 4
 summary: fire a handler when your queues are wonky
 test_files:
-- test/resque/refresh_latest_timestamp.rb
 - test/resque/set_redis_key.rb
 - test/test_collision.rb
 - test/test_config.rb