resque_stuck_queue 0.1.1 → 0.2.0

Files changed (12)

checksums.yaml +4 -4
data/README.md +27 -6
data/lib/resque_stuck_queue.rb +22 -24
data/lib/resque_stuck_queue/config.rb +3 -3
data/lib/resque_stuck_queue/heartbeat_job.rb +14 -0
data/lib/resque_stuck_queue/version.rb +1 -1
data/test/test_helper.rb +0 -1
data/test/test_integration.rb +1 -2
data/test/test_lagtime.rb +0 -1
data/test/test_set_custom_refresh_job.rb +3 -3
metadata +3 -4
data/test/resque/refresh_latest_timestamp.rb +0 -8

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA512:
-  metadata.gz: 18578e956daa34b9788d57db6fc5539e0e425c692d672c89302c82051f81a639eff25fbf76570a1b0ac74099051503835fca1912e051086c1a7f6df2ea485e0f
-  data.tar.gz: 2a721eec9c3ba4555a71425111b9ab322913da8df68612e6a08a6f4577c9f3a77793014617742661a4db1bfdede87e65567a8578c5f3511201d7678005e16bc0
+  metadata.gz: 8441e82a5f962f49c9de740fd9a843c50b3125458c69f08598b210774b2920f6bac676f87a748c095aafdcc86799934229a03ba5c2cc0c8a906f3834f484cc7c
+  data.tar.gz: 824e77bbf4ab0c7fa6b150e69ed0a4845d4e6a18022715b51880e60af7ba189be704277d0776aa91bbe39003b09f5b8f22895421484b10bb6c454b73f77391d7
 SHA1:
-  metadata.gz: d8a0f8ef0e451bba4b99236eb653aae4078c498e
-  data.tar.gz: 4cb3c0daad447f88893d8f2ebbd1bf991bea0d10
+  metadata.gz: ec5948be0403dbafd9958e4dbb650f88ebd15de2
+  data.tar.gz: a97a2968698d72fa37b14d1ba3eabe7e9345d9cf

data/README.md CHANGED

@@ -8,6 +8,8 @@ This is to be used to satisfy an ops problem. There have been cases resque proce
 If resque doesn't run jobs in specific queues (defaults to `@queue = :app`) within a certain timeframe, it will trigger a pre-defined handler of your choice. You can use this to send an email, pager duty, add more resque workers, restart resque, send you a txt...whatever suits you.
+It will also fire a proc to notify you when it's recovered.
 ## How it works
 When you call `start` you are essentially starting two threads that will continiously run until `stop` is called or until the process shuts down.
@@ -16,18 +18,33 @@ One thread is responsible for pushing a 'heartbeat' job to resque which will ess
 The other thread is a continious loop that will check redis (bypassing resque) for that key and check what the latest time the hearbeat job successfully updated that key.
-It will trigger a pre-defined proc (see below) if the last time the hearbeat job updated that key is older than the trigger_timeout setting (see below).
+StuckQueue will trigger a pre-defined proc if the queue is lagging according to the times you've configured (see below).
+After firing the proc, it will continue to monitor the queue, but won't call the proc again until the queue is found to be good again (it will then call a different "recovered" handler).
+By calling the recovered proc, it will then complain again the next time the lag is found.
-## Usage
+## Configuration Options
-Configure it first. Optional settings are below. You'll most likely at the least want to tune `:triggered_handler`,`:heartbeat` and `:trigger_timeout` settings.
+Configure it first via something like:
 <pre>
-handler:
+  Resque::StuckQueue.config[:triggered_handler] = proc { send_email }
+</pre>
+Configuration settings are below. You'll most likely at the least want to tune `:triggered_handler`,`:heartbeat` and `:trigger_timeout` settings.
+<pre>
+triggered_handler:
 	set to what gets triggered when resque-stuck-queue will detect the latest heartbeat is older than the trigger_timeout time setting.
 	Example:
 	Resque::StuckQueue.config[:triggered_handler] = proc { |queue_name, lagtime| send_email('queue #{queue_name} isnt working, aaah the daemons') }
+recovered_handler:
+	set to what gets triggered when resque-stuck-queue has triggered a problem, but then detects the queue went back down to functioning well again (it wont trigger again until it has recovered).
+	Example:
+	Resque::StuckQueue.config[:recovered_handler] = proc { |queue_name, lagtime| send_email('phew, queue #{queue_name} is ok') }
 heartbeat:
 	set to how often to push that 'heartbeat' job to refresh the latest time it worked.
 	Example:
@@ -44,6 +61,9 @@ redis:
 heartbeat_key:
 	optional, name of keys to keep track of the last good resque heartbeat time
+triggered_key:
+	optional, name of keys to keep track of the last trigger time
 logger:
 	optional, pass a Logger. Default a ruby logger will be instantiated. Needs to respond to that interface.
@@ -55,9 +75,10 @@ abort_on_exception:
 refresh_job:
 	optional, your own custom refreshing job. if you are using something other than resque
 </pre>
-Then start it:
+To start it:
 <pre>
 Resque::StuckQueue.start                # blocking
@@ -121,7 +142,7 @@ class CustomJob
   end
 end
-Resque::StuckQueue.config[:refresh_job] = proc {
+Resque::StuckQueue.config[:heartbeat_job] = proc {
   # or however else you enque your custom job, Sidekiq::Client.enqueue(CustomJob), whatever, etc.
   CustomJob.perform_async
 }
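
The README changes above describe an alert cycle: the triggered handler fires once when a queue starts lagging, stays silent while the queue remains stuck, and the recovered handler fires once the lag drops again, which re-arms the trigger. A minimal standalone sketch of that cycle follows. `AlertCycle`, `#check`, and the plain "lag greater than timeout" comparison are assumptions made for illustration; this is not the gem's implementation and it does not touch Resque or Redis.

```ruby
# Hypothetical stand-alone model of the trigger/recover cycle described in the README.
# AlertCycle and #check are illustrative names, not part of resque_stuck_queue.
class AlertCycle
  def initialize(trigger_timeout:, triggered_handler:, recovered_handler:)
    @trigger_timeout   = trigger_timeout
    @triggered_handler = triggered_handler
    @recovered_handler = recovered_handler
    @triggered         = false
  end

  # Feed one lag measurement (in seconds) for a queue.
  def check(queue_name, lag)
    if lag > @trigger_timeout && !@triggered
      # First time the lag crosses the threshold: alert once.
      @triggered = true
      @triggered_handler.call(queue_name, lag)
    elsif lag <= @trigger_timeout && @triggered
      # Queue is healthy again: notify and re-arm the trigger.
      @triggered = false
      @recovered_handler.call(queue_name, lag)
    end
  end
end

events = []
cycle = AlertCycle.new(
  trigger_timeout:   60,
  triggered_handler: proc { |queue, lag| events << [:triggered, queue, lag] },
  recovered_handler: proc { |queue, lag| events << [:recovered, queue, lag] }
)

# Lag readings over five consecutive checks of the :app queue.
[10, 90, 120, 5, 200].each { |lag| cycle.check(:app, lag) }
p events
```

The reading of 120 produces no event because the trigger has already fired; only the recovery at 5 allows the reading of 200 to alert again.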

data/lib/resque_stuck_queue.rb CHANGED

@@ -1,5 +1,6 @@
 require "resque_stuck_queue/version"
 require "resque_stuck_queue/config"
+require "resque_stuck_queue/heartbeat_job"
 # TODO move this require into a configurable?
 require 'resque'
@@ -75,7 +76,7 @@ module Resque
         Redis::Classy.db = redis if Redis::Classy.db.nil?
-        enqueue_repeating_refresh_job
+        enqueue_repeating_heartbeat_job
         setup_checker_thread
         # fo-eva.
@@ -121,14 +122,14 @@ module Resque
       private
-      def enqueue_repeating_refresh_job
+      def enqueue_repeating_heartbeat_job
         @threads << Thread.new do
           Thread.current.abort_on_exception = config[:abort_on_exception]
           logger.info("Starting heartbeat thread")
           while @running
             # we want to go through resque jobs, because that's what we're trying to test here:
             # ensure that jobs get executed and the time is updated!
-            logger.info("Sending refresh jobs")
+            logger.info("Sending heartbeat jobs")
             enqueue_jobs
             wait_for_it
           end
@@ -136,12 +137,14 @@ module Resque
       end
       def enqueue_jobs
-        if config[:refresh_job]
-          # FIXME config[:refresh_job] with mutliple queues is bad semantics
-          config[:refresh_job].call
+        if config[:heartbeat_job]
+          # FIXME config[:heartbeat_job] with mutliple queues is bad semantics
+          config[:heartbeat_job].call
         else
           queues.each do |queue_name|
-            Resque.enqueue_to(queue_name, RefreshLatestTimestamp, [heartbeat_key_for(queue_name), redis.client.host, redis.client.port])
+            Resque.enqueue_to(queue_name, HeartbeatJob, [heartbeat_key_for(queue_name), redis.client.host, redis.client.port])
+             queue_name = :snapshot_progress
+             Resque.enqueue_to(queue_name, HeartbeatJob, [Resque::StuckQueue.heartbeat_key_for(queue_name), Resque.redis.client.host, Resque.redis.client.port])
           end
         end
       end
@@ -155,17 +158,10 @@ module Resque
             if mutex.lock
               begin
                 queues.each do |queue_name|
-                  logger.info("Lag time for #{queue_name} is #{lag_time(queue_name).inspect} seconds.")
-                  if triggered_ago = last_triggered(queue_name)
-                    logger.info("Last triggered for #{queue_name} is #{triggered_ago.inspect} seconds.")
-                  else
-                    logger.info("No last trigger found for #{queue_name}.")
-                  end
+                  log_checker_info(queue_name)
                   if should_trigger?(queue_name)
-                    logger.info("Triggering :triggered handler for #{queue_name} at #{Time.now}.")
                     trigger_handler(queue_name, :triggered)
                   elsif should_recover?(queue_name)
-                    logger.info("Triggering :recovered handler for #{queue_name} at #{Time.now}.")
                     trigger_handler(queue_name, :recovered)
                   end
                 end
@@ -235,6 +231,7 @@ module Resque
       def trigger_handler(queue_name, type)
         raise 'Must trigger either the recovered or triggered handler!' unless (type == :recovered || type == :triggered)
         handler_name = :"#{type}_handler"
+        logger.info("Triggering #{type} handler for #{queue_name} at #{Time.now}.")
         (config[handler_name] || const_get(handler_name.upcase)).call(queue_name, lag_time(queue_name))
         manual_refresh(queue_name, type)
       rescue => e
@@ -243,6 +240,16 @@ module Resque
         force_stop!
       end
+      def log_checker_info(queue_name)
+        logger.info("Lag time for #{queue_name} is #{lag_time(queue_name).inspect} seconds.")
+        if triggered_ago = last_triggered(queue_name)
+          logger.info("Last triggered for #{queue_name} is #{triggered_ago.inspect} seconds.")
+        else
+          logger.info("No last trigger found for #{queue_name}.")
+        end
+      end
       def read_from_redis(keyname)
         redis.get(keyname)
       end
@@ -258,12 +265,3 @@ module Resque
   end
 end
-class RefreshLatestTimestamp
-  def self.perform(args)
-    timestamp_key = args[0]
-    host = args[1]
-    port = args[2]
-    r = Redis.new(:host => host, :port => port)
-    r.set(timestamp_key, Time.now.to_i)
-  end
-end
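
The `trigger_handler` method in this file builds the handler name from the type and falls back to a constant of the same name when nothing is configured. The sketch below isolates that lookup pattern so it can be run on its own. `HandlerDispatch` and its default procs are hypothetical stand-ins; the real method also logs, refreshes a key, and rescues errors, none of which is modelled here.

```ruby
# Hypothetical module mirroring the lookup pattern used by trigger_handler:
# a configured proc wins, otherwise a default constant named after the handler is used.
module HandlerDispatch
  TRIGGERED_HANDLER = proc { |queue_name, lag| "default triggered: #{queue_name} lagging #{lag}s" }
  RECOVERED_HANDLER = proc { |queue_name, lag| "default recovered: #{queue_name} at #{lag}s" }

  def self.config
    @config ||= {}
  end

  def self.trigger_handler(queue_name, type, lag)
    unless [:triggered, :recovered].include?(type)
      raise ArgumentError, "Must trigger either the recovered or triggered handler!"
    end

    handler_name = :"#{type}_handler"
    # Symbol#upcase turns :triggered_handler into :TRIGGERED_HANDLER for const_get.
    (config[handler_name] || const_get(handler_name.upcase)).call(queue_name, lag)
  end
end

# No handler configured: the default constant is used.
default_msg = HandlerDispatch.trigger_handler(:app, :triggered, 120)

# A configured proc takes precedence over the default.
HandlerDispatch.config[:recovered_handler] = proc { |queue_name, _lag| "custom: #{queue_name} ok" }
custom_msg = HandlerDispatch.trigger_handler(:app, :recovered, 3)

puts default_msg
puts custom_msg
```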

data/lib/resque_stuck_queue/config.rb CHANGED

@@ -15,8 +15,8 @@ module Resque
     class Config < Hash
       OPTIONS_DESCRIPTIONS = {
-        :triggered_handler            => "set to what gets triggered when resque-stuck-queue will detect the latest heartbeat is older than the trigger_timeout time setting.\n\tExample:\n\tResque::StuckQueue.config[:triggered_handler] = proc { |queue_name, lagtime| send_email('queue \#{queue_name} isnt working, aaah the daemons') }",
-        :recovered_handler  => "set to what gets triggered when resque-stuck-queue has triggered a problem, but then detects the queue went back down to functioning well again(before the next trigger).\n\tExample:\n\tResque::StuckQueue.config[:recovered_handler] = proc { |queue_name, lagtime| send_email('phew, queue \#{queue_name} is ok') }",
+        :triggered_handler  => "set to what gets triggered when resque-stuck-queue will detect the latest heartbeat is older than the trigger_timeout time setting.\n\tExample:\n\tResque::StuckQueue.config[:triggered_handler] = proc { |queue_name, lagtime| send_email('queue \#{queue_name} isnt working, aaah the daemons') }",
+        :recovered_handler  => "set to what gets triggered when resque-stuck-queue has triggered a problem, but then detects the queue went back down to functioning well again(it wont trigger again until it has recovered).\n\tExample:\n\tResque::StuckQueue.config[:recovered_handler] = proc { |queue_name, lagtime| send_email('phew, queue \#{queue_name} is ok') }",
         :heartbeat          => "set to how often to push that 'heartbeat' job to refresh the latest time it worked.\n\tExample:\n\tResque::StuckQueue.config[:heartbeat] = 5.minutes",
         :trigger_timeout    => "set to how much of a resque work lag you are willing to accept before being notified. note: take the :heartbeat setting into account when setting this timeout.\n\tExample:\n\tResque::StuckQueue.config[:trigger_timeout] = 55.minutes",
         :redis              => "set the Redis instance StuckQueue will use",
@@ -25,7 +25,7 @@ module Resque
         :logger             => "optional, pass a Logger. Default a ruby logger will be instantiated. Needs to respond to that interface.",
         :queues             => "optional, monitor specific queues you want to send a heartbeat/monitor to. default is :app",
         :abort_on_exception => "optional, if you want the resque-stuck-queue threads to explicitly raise, default is false",
-        :refresh_job        => "optional, your own custom refreshing job. if you are using something other than resque",
+        :heartbeat_job        => "optional, your own custom refreshing job. if you are using something other than resque",
       }
       OPTIONS = OPTIONS_DESCRIPTIONS.keys
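
This hunk renames the `:refresh_job` option to `:heartbeat_job`, and `OPTIONS` is derived from the description keys. The diff does not show how `Config` uses `OPTIONS`. One plausible use, sketched here purely as an assumption, is to reject keys that are not listed, in which case the old option name would stop being accepted. `SketchConfig` and `UnknownOption` are hypothetical names.

```ruby
# Hypothetical Hash subclass; SketchConfig and UnknownOption are illustrative names.
# Whether the gem's Config validates keys like this is an assumption, not shown in the diff.
class SketchConfig < Hash
  class UnknownOption < StandardError; end

  OPTIONS_DESCRIPTIONS = {
    :heartbeat     => "how often to push the heartbeat job",
    :heartbeat_job => "optional, your own custom heartbeat job",
  }
  OPTIONS = OPTIONS_DESCRIPTIONS.keys

  # Only accept keys that have a description.
  def []=(key, value)
    raise UnknownOption, "no such option: #{key.inspect}" unless OPTIONS.include?(key)
    super
  end
end

config = SketchConfig.new
config[:heartbeat_job] = proc { :enqueued }     # name used from 0.2.0 on

rejected = begin
  config[:refresh_job] = proc { :enqueued }     # name used before 0.2.0
  false
rescue SketchConfig::UnknownOption
  true
end

puts "old key rejected: #{rejected}"
```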

data/lib/resque_stuck_queue/heartbeat_job.rb ADDED

@@ -0,0 +1,14 @@
+module Resque
+  module StuckQueue
+    class HeartbeatJob
+      def self.perform(args)
+        timestamp_key = args[0]
+        host = args[1]
+        port = args[2]
+        new_time = Time.now.to_i
+        r = Redis.new(:host => host, :port => port)
+        r.set(timestamp_key, new_time)
+      end
+    end
+  end
+end
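
`HeartbeatJob.perform` writes the current Unix time to a key, and the checker thread later compares that value with the clock to get the lag. The sketch below reproduces that round trip without a Redis server. `FakeStore`, `SketchHeartbeatJob`, the free-standing `lag_time` helper, and the key name are all hypothetical; the real job builds its client with `Redis.new(:host => host, :port => port)` rather than receiving a store.

```ruby
# Hypothetical in-memory stand-in for a Redis client; only #set and #get are modelled.
class FakeStore
  def initialize
    @data = {}
  end

  def set(key, value)
    @data[key] = value.to_s # Redis stores values as strings
    "OK"
  end

  def get(key)
    @data[key]
  end
end

# Same shape as HeartbeatJob.perform, but the store and clock are injected for testing.
class SketchHeartbeatJob
  def self.perform(store, timestamp_key, now = Time.now.to_i)
    store.set(timestamp_key, now)
  end
end

# Seconds since the last heartbeat, or nil when no heartbeat has been recorded yet.
def lag_time(store, timestamp_key, now = Time.now.to_i)
  last = store.get(timestamp_key)
  last && now - last.to_i
end

store = FakeStore.new
key   = "app:heartbeat" # hypothetical key name

before = lag_time(store, key, 1_000)          # no heartbeat yet
SketchHeartbeatJob.perform(store, key, 1_000) # heartbeat written at t = 1000
after  = lag_time(store, key, 1_045)          # checked 45 seconds later

p before
p after
```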

data/lib/resque_stuck_queue/version.rb CHANGED

@@ -1,5 +1,5 @@
 module Resque
   module StuckQueue
-    VERSION = "0.1.1"
+    VERSION = "0.2.0"
   end
 end

data/test/test_helper.rb CHANGED

@@ -7,7 +7,6 @@ require "mocha/mini_test"
 $:.unshift(".")
 require 'resque_stuck_queue'
 require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "set_redis_key")
-require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "refresh_latest_timestamp")
 module TestHelper

data/test/test_integration.rb CHANGED

@@ -6,7 +6,6 @@ require 'pry'
 $:.unshift(".")
 require 'resque_stuck_queue'
 require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "set_redis_key")
-require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "refresh_latest_timestamp")
 require File.join(File.expand_path(File.dirname(__FILE__)), "test_helper")
 class TestIntegration < Minitest::Test
@@ -88,7 +87,7 @@ class TestIntegration < Minitest::Test
     Resque::StuckQueue.config[:heartbeat] = 1
     begin
-      Resque::StuckQueue.config[:refresh_job] = proc { Resque.enqueue(RefreshLatestTimestamp, Resque::StuckQueue.heartbeat_key_for(:app)) }
+      Resque::StuckQueue.config[:heartbeat_job] = proc { Resque.enqueue_to(:app, Resque::StuckQueue::HeartbeatJob, Resque::StuckQueue.heartbeat_key_for(:app)) }
       @triggered = false
       Resque::StuckQueue.config[:triggered_handler] = proc { @triggered = true }
       start_and_stop_loops_after(4)

data/test/test_lagtime.rb CHANGED

@@ -5,7 +5,6 @@ require 'pry'
 $:.unshift(".")
 require 'resque_stuck_queue'
 require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "set_redis_key")
-require File.join(File.expand_path(File.dirname(__FILE__)), "resque", "refresh_latest_timestamp")
 require File.join(File.expand_path(File.dirname(__FILE__)), "test_helper")
 class TestLagTime < Minitest::Test

data/test/test_set_custom_refresh_job.rb CHANGED

@@ -9,7 +9,7 @@ class TestYourOwnRefreshJob < Minitest::Test
     Resque::StuckQueue.config[:trigger_timeout] = 1
     Resque::StuckQueue.config[:heartbeat] = 1
     Resque::StuckQueue.config[:abort_on_exception] = true
-    Resque::StuckQueue.config[:refresh_job] = nil
+    Resque::StuckQueue.config[:heartbeat_job] = nil
     Resque::StuckQueue.redis = Redis.new
     Resque::StuckQueue.redis.flushall
   end
@@ -17,7 +17,7 @@ class TestYourOwnRefreshJob < Minitest::Test
   def test_will_trigger_with_unrefreshing_custom_heartbeat_job
     # it will trigger because the key will be unrefreshed, hence 'old' and will always trigger.
     puts "#{__method__}"
-    Resque::StuckQueue.config[:refresh_job] = proc { nil } # does not refresh global key
+    Resque::StuckQueue.config[:heartbeat_job] = proc { nil } # does not refresh global key
     @triggered = false
     Resque::StuckQueue.config[:triggered_handler] = proc { @triggered = true }
     start_and_stop_loops_after(3)
@@ -27,7 +27,7 @@ class TestYourOwnRefreshJob < Minitest::Test
   def test_will_fail_with_bad_custom_heartbeat_job
     puts "#{__method__}"
     begin
-      Resque::StuckQueue.config[:refresh_job] = proc { raise 'bad proc doc' } # does not refresh global key
+      Resque::StuckQueue.config[:heartbeat_job] = proc { raise 'bad proc doc' } # does not refresh global key
       @triggered = false
       Resque::StuckQueue.config[:triggered_handler] = proc { @triggered = true }
       start_and_stop_loops_after(3)

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: resque_stuck_queue
 version: !ruby/object:Gem::Version
-  version: 0.1.1
+  version: 0.2.0
 platform: ruby
 authors:
 - Shai Rosenfeld
@@ -9,7 +9,7 @@ autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-01-27 00:00:00 Z
+date: 2014-01-29 00:00:00 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: redis-mutex
@@ -59,9 +59,9 @@ files:
 - lib/resque/stuck_queue.rb
 - lib/resque_stuck_queue.rb
 - lib/resque_stuck_queue/config.rb
+- lib/resque_stuck_queue/heartbeat_job.rb
 - lib/resque_stuck_queue/version.rb
 - resque_stuck_queue.gemspec
-- test/resque/refresh_latest_timestamp.rb
 - test/resque/set_redis_key.rb
 - test/test_collision.rb
 - test/test_config.rb
@@ -96,7 +96,6 @@ signing_key:
 specification_version: 4
 summary: fire a handler when your queues are wonky
 test_files:
-- test/resque/refresh_latest_timestamp.rb
 - test/resque/set_redis_key.rb
 - test/test_collision.rb
 - test/test_config.rb

data/test/resque/refresh_latest_timestamp.rb DELETED

@@ -1,8 +0,0 @@
-class RefreshLatestTimestamp
-  @queue = :app
-  def self.perform(args)
-    timestamp_key, host, port = args[0], args[1], args[2]
-    r = Redis.new(:host => host, :port => port)
-    r.set(timestamp_key, Time.now.to_i)
-  end
-end