resque-clues 0.1.0

@@ -0,0 +1,20 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ doc/
7
+ Gemfile.lock
8
+ InstalledFiles
9
+ _yardoc
10
+ coverage
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
18
+ *.swp
19
+ *.swo
20
+ .rvmrc
@@ -0,0 +1,15 @@
1
+ ---
2
+ language: ruby
3
+ rvm:
4
+ - 1.9.3
5
+ services:
6
+ - redis-server
7
+ notifications:
8
+ email: false
9
+ campfire:
10
+ rooms:
11
+ - secure: ! 'j8CJevgIDpE/cQwKm9z5uVY8nXmvQFOoQP/OIieEBMPd7IaSv6YVNMmjHju3
12
+
13
+ rKVL0471g0/+hfK1etfxeaygHjLVg1GmOnYH/CK97l3I3BnP5pEu7zz0wSs3
14
+
15
+ 5Legz1OnsK11U1FAjfSkv088V3/IgyTVegU/SaSiSi03uUsRSVs='
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in resque-clues.gemspec
4
+ gemspec
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2013 PeopleAdmin
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,163 @@
1
+ # Resque::Clues
2
+
3
+ Resque-Clues allows Resque to publish job lifecycle events to some external
4
+ store for analysis. It also allows for decorating jobs stored in Redis with
5
+ metadata that will be included with the published events. When coupled with
6
+ tools like Splunk, Logstash/Graphite or Cube, this can be used to:
7
+
8
+ * Quantify results of balancing efforts, hardware changes, etc...
9
+ * See how your background processes perform over time, before and after
10
+ releases, etc...
11
+ * Break down performance metrics on metadata specific to your business or
12
+ domain.
13
+ * Provide searchability for specific jobs entering the queue to aid in
14
+ debugging or support efforts.
15
+
16
+ Coupled with those tools, it will enable you to create views into your
17
+ background processes like the following:
18
+
19
+ ![splunk dashboard](http://i.imgur.com/0sZEw1L.png?1)
20
+
21
+ ## Lifecycle events
22
+
23
+ Four lifecycle events will be published for each job entering a queue:
24
+
25
+ 1. enqueued
26
+ 2. dequeued
27
+ 3. perform_started
28
+ 4. perform_finished -or- failed
29
+
30
+ Each will contain the following information (plus anything added to the
31
+ metadata via an item preprocessor):
32
+
33
+ * event_type: Either enqueued, dequeued, perform_started, perform_finished or
34
+ failed.
35
+ * event_hash: Unique hash grouping all events associated with a single
36
+ background job.
37
+ * worker_class: The job class that contains the perform logic.
38
+ * queue: The queue the job is placed into.
39
+ * timestamp: The time the event occurs.
40
+ * hostname: The hostname of the machine where the event originates.
41
+ * process: The process on the host machine where the event originates.
42
+ * args: The arguments passed to the perform method.
43
+
44
+ dequeued events will also include time_in_queue, which is the amount of time
45
+ the job spent in the queue. perform_finished and failed events will include
46
+ time_to_perform, which is the time it took to perform the job after it was
47
+ dequeued. Failed events will include the exception class, the exception
48
+ message and a backtrace.
49
+
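Because every event for a job carries the same event_hash, a consumer can reassemble a job's lifecycle from the published stream. The following is a minimal sketch, assuming events arrive as newline-delimited JSON with event_hash nested under metadata; `group_events` and the sample data are hypothetical and not part of resque-clues:

```ruby
require 'json'

# Hypothetical helper (not part of resque-clues): groups newline-delimited
# JSON events by the event_hash stored in their metadata.
def group_events(lines)
  lines.map { |line| JSON.parse(line) }
       .group_by { |event| event['metadata']['event_hash'] }
end

# Made-up sample events for a single job.
sample = [
  '{"event_type":"enqueued","metadata":{"event_hash":"abc"}}',
  '{"event_type":"dequeued","metadata":{"event_hash":"abc","time_in_queue":1.5}}'
]

lifecycles = group_events(sample)
lifecycles.each do |hash, events|
  puts "#{hash}: #{events.map { |e| e['event_type'] }.join(' -> ')}"
end
```

Grouping on event_hash rather than on worker_class and args keeps two identical jobs enqueued at different times apart.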
50
+ ## Event Publishers
51
+
52
+ Event publishers are used to receive event data and publish it in some way.
53
+ The following event publishers are currently provided:
54
+
55
+ ```ruby
56
+ Resque::Plugins::Clues::StandardOutPublisher
57
+ Resque::Plugins::Clues::LogPublisher
58
+ Resque::Plugins::Clues::CompositePublisher
59
+ ```
60
+
61
+ You can implement your own publishers as long as they implement event handling
62
+ methods as follows:
63
+
64
+ ```ruby
65
+ def publish(event_type, timestamp, queue, metadata, klass, *args)
66
+ ...
67
+ end
68
+ ```
69
+
70
+ Where event_type is one of enqueued, dequeued, perform_started, perform_finished or
71
+ failed.
72
+
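As an illustration of the publish signature above, here is a minimal sketch of a custom publisher that collects events in memory, which might be useful in tests. `InMemoryPublisher` is a hypothetical class and is not shipped with resque-clues:

```ruby
# Hypothetical publisher (not shipped with resque-clues) that keeps events
# in memory instead of writing them anywhere.
class InMemoryPublisher
  attr_reader :events

  def initialize
    @events = []
  end

  # Same signature as the publish method described above.
  def publish(event_type, timestamp, queue, metadata, klass, *args)
    @events << {
      event_type: event_type,
      timestamp: timestamp,
      queue: queue,
      metadata: metadata,
      worker_class: klass,
      args: args
    }
  end
end

publisher = InMemoryPublisher.new
publisher.publish(:enqueued, '2013-06-04T20:59:58Z', 'test_queue', {}, 'TestWorker', 1, 2)
```

An instance like this could be assigned to `Resque::Plugins::Clues.event_publisher` directly or appended to a CompositePublisher alongside the provided publishers.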
73
+ ## Event Marshallers
74
+
75
+ An event marshaller is used to coerce event data into a format suitable for
76
+ sending to an event publisher's destination. This is a proc or lambda with the
77
+ following call signature:
78
+
79
+ ```ruby
80
+ lambda do |event_type, timestamp, queue, metadata, worker_class, args|
81
+ # something that returns a string
82
+ end
83
+ ```
84
+
85
+ By default, clues will use an event_marshaller that will simply marshall this
86
+ data to a JSON object in the following format:
87
+
88
+ ```
89
+ {
90
+ "event_type":"dequeued",
91
+ "timestamp":"2013-06-04T20:59:58Z",
92
+ "queue":"test_queue",
93
+ "metadata": {
94
+ "event_hash":"0695f49c5e70fc18da91961113e1769a",
95
+ "hostname":"Lances-MacBook-Air.local",
96
+ "process":30731
97
+ },
98
+ "worker_class":"TestWorker",
99
+ "args":[1,2]
100
+ }
101
+ ```
102
+
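If JSON is not the format your destination expects, the default can be swapped for any lambda with the same call signature. The following is a sketch only: `key_value_marshaller` and its output format are hypothetical, chosen to show the shape of a replacement:

```ruby
# Hypothetical marshaller: emits one "key=value" line per event instead of
# JSON. Metadata entries are flattened into the same line.
key_value_marshaller = lambda do |event_type, timestamp, queue, metadata, worker_class, args|
  fields = {
    'event_type' => event_type,
    'timestamp' => timestamp,
    'queue' => queue,
    'worker_class' => worker_class,
    'args' => args.inspect
  }.merge(metadata)
  fields.map { |key, value| "#{key}=#{value}" }.join(' ') + "\n"
end

line = key_value_marshaller.call(:dequeued, '2013-06-04T20:59:58Z', 'test_queue',
                                 { 'hostname' => 'example.local' }, 'TestWorker', [1, 2])
puts line
```

Such a lambda would be installed by assigning it to `Resque::Plugins::Clues.event_marshaller`. Note that this sketch does not escape spaces inside values, so it only suits simple metadata.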
103
+ ## Item Preprocessors
104
+
105
+ Immediately before Resque puts job data into a queue, the queue and the payload
106
+ hash will be sent to a configurable item_preprocessor proc. The payload hash
107
+ will contain:
108
+
109
+ * class: The class of the job to perform.
110
+ * args: The args to pass to its perform method.
111
+ * metadata: The metadata hash that contains the event_hash identifier, the
112
+ hostname and the process doing the enqueuing.
113
+
114
+ You can then inject whatever you need into the metadata hash and it will be
115
+ included in the published events. At PeopleAdmin, we are using this to inject
116
+ our customer identifiers so we can look at Resque analytics broken down on a
117
+ per-customer basis.
118
+
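A preprocessor along these lines might look like the sketch below. Two things here are assumptions: `current_customer_id` is a hypothetical stand-in for however your application resolves the customer, and the key holding the metadata hash is checked under both 'metadata' (as described above) and 'clues_metadata' (the key the job extension reads from the payload), since the exact key on the enqueuing side is not confirmed here:

```ruby
# Hypothetical lookup standing in for however your application resolves the
# current customer.
current_customer_id = lambda { 'customer-42' }

# Sketch of an item preprocessor. Assumption: the metadata hash is stored
# under 'clues_metadata' or 'metadata'; both keys are checked.
preprocessor = proc do |queue, item|
  metadata = item['clues_metadata'] || item['metadata']
  metadata['customer_id'] = current_customer_id.call if metadata
end

# Made-up payload used to exercise the proc outside of Resque.
item = { 'class' => 'TestWorker', 'args' => [1, 2], 'metadata' => { 'event_hash' => 'abc' } }
preprocessor.call('test_queue', item)
puts item['metadata'].inspect
```

The proc mutates the metadata hash in place, so anything it adds travels with the job and appears in each later event for that job.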
119
+ ## Installation
120
+
121
+ Add this line to your application's Gemfile:
122
+
123
+ gem 'resque-clues'
124
+
125
+ And then execute:
126
+
127
+ $ bundle
128
+
129
+ Or install it yourself as:
130
+
131
+ $ gem install resque-clues
132
+
133
+ ## Usage
134
+
135
+ Resque-clues requires configuration of the event publishers and an item
136
+ preprocessor to be used, and this should occur before any use of Resque. Here
137
+ is an example configuration:
138
+
139
+ ```ruby
140
+ require 'resque'
141
+ require 'resque-clues'
142
+
143
+ publisher = Resque::Plugins::Clues::CompositePublisher.new
144
+ publisher << Resque::Plugins::Clues::StandardOutPublisher.new
145
+ publisher << Resque::Plugins::Clues::LogPublisher.new("/var/log/resque-clues.log")
146
+ Resque::Plugins::Clues.event_publisher = publisher
147
+
148
+ Resque::Plugins::Clues.item_preprocessor = proc do |queue, item|
149
+ ...
150
+ end
151
+ ```
152
+
153
+ If used in a Rails application, this will need to be executed in an initializer.
154
+ After this, you should see events published as appropriate for your configured
155
+ event publisher.
156
+
157
+ ## Contributing
158
+
159
+ 1. Fork it
160
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
161
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
162
+ 4. Push to the branch (`git push origin my-new-feature`)
163
+ 5. Create new Pull Request
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
data/TODO ADDED
@@ -0,0 +1,4 @@
1
+ * Modify the publish signature to accept a hash instead of all the params.
2
+ * Add file-like streams that wrap TCP/IP and UDP sockets? (not needed for our
3
+ case). Would allow use of StreamPublisher to communicate over those
4
+ protocols
@@ -0,0 +1,78 @@
1
+ require 'resque'
2
+ require 'resque/plugins/clues/util'
3
+ require 'resque/plugins/clues/queue_extension'
4
+ require 'resque/plugins/clues/job_extension'
5
+ require 'resque/plugins/clues/event_publisher'
6
+ require 'resque/plugins/clues/version'
7
+
8
+ module Resque
9
+ module Plugins
10
+ module Clues
11
+ class << self
12
+ attr_accessor :item_preprocessor
13
+ attr_accessor :event_marshaller
14
+
15
+ def configured?
16
+ !event_publisher.nil?
17
+ end
18
+
19
+ def enable!
20
+ # Patch resque to support event broadcasting.
21
+ Resque.send(:extend, Resque::Plugins::Clues::QueueExtension)
22
+ Resque::Job.send(:include, Resque::Plugins::Clues::JobExtension)
23
+ Resque.instance_exec do
24
+ alias :_base_push :push
25
+ alias :_base_pop :pop
26
+
27
+ def push(queue, item)
28
+ _clues_push(queue, item)
29
+ end
30
+
31
+ def pop(queue)
32
+ _clues_pop(queue)
33
+ end
34
+ end
35
+
36
+ Resque::Job.class_exec do
37
+ alias :_base_perform :perform
38
+ alias :_base_fail :fail
39
+
40
+ def perform
41
+ _clues_perform
42
+ end
43
+
44
+ def fail(exception)
45
+ _clues_fail(exception)
46
+ end
47
+ end
48
+ end
49
+ end
50
+ end
51
+ end
52
+ end
53
+
54
+ # Constructs a string event for the passed args. Delegates to the
55
+ # Resque::Plugins::Clues.event_marshaller proc/lambda to do this. The
56
+ # default version will simply marshall the args to a JSON object.
57
+ #
58
+ # event_type:: enqueued, dequeued, perform_started, perform_finished or
59
+ # failed.
60
+ # timestamp:: the time the event occurred.
61
+ # queue:: the queue the job was in
62
+ # metadata:: metadata for the job, such as host, process, etc...
63
+ # worker_class:: the worker job class
64
+ # args:: arguments passed to the perform method.
65
+ Resque::Plugins::Clues.event_marshaller =
66
+ lambda do |event_type, timestamp, queue, metadata, worker_class, args|
67
+ event = MultiJson.encode({
68
+ event_type: event_type,
69
+ timestamp: timestamp,
70
+ queue: queue,
71
+ metadata: metadata,
72
+ worker_class: worker_class,
73
+ args: args
74
+ })
75
+ "#{event}\n"
76
+ end
77
+
78
+ Resque::Plugins::Clues.enable!
@@ -0,0 +1,91 @@
1
+ require 'pp'
2
+ require 'delegate'
3
+
4
+ module Resque
5
+ module Plugins
6
+ module Clues
7
+ class << self
8
+ attr_accessor :event_publisher
9
+ end
10
+ EVENT_TYPES = %w[enqueued dequeued destroyed perform_started perform_finished failed]
11
+
12
+ # Event publisher that publishes events to a file-like stream in a JSON
13
+ # format. Each message is punctuated with a terminus character, which
14
+ # defaults to newline ("\n")
15
+ class StreamPublisher
16
+ attr_reader :stream
17
+
18
+ # Creates a new StreamPublisher that writes to the passed stream,
19
+ # terminating each event with the terminus.
20
+ #
21
+ # stream:: The file-like stream to write events to.
22
+ def initialize(stream)
23
+ @stream = stream
24
+ end
25
+
26
+ # Publishes an event to the stream.
27
+ def publish(event_type, timestamp, queue, metadata, klass, *args)
28
+ event = Clues.event_marshaller.call(event_type, timestamp, queue, metadata, klass, args)
29
+ stream.write(event)
30
+ end
31
+ end
32
+
33
+ # Event publisher that publishes events to standard output in a json
34
+ # format.
35
+ class StandardOutPublisher < StreamPublisher
36
+ def initialize
37
+ super(STDOUT)
38
+ end
39
+ end
40
+
41
+ # Event publisher that publishes events to a log file using ruby's
42
+ # stdlib logger and an optional formatter.
43
+ class LogPublisher
44
+ attr_reader :logger
45
+
46
+ # Creates a new LogPublisher that writes events to a log file at the
47
+ # specified log_path, using an optional formatter. The default format
48
+ # will simply be the event in a json format, one per line.
49
+ #
50
+ # log_path:: The path to the log file.
51
+ # formatter:: A lambda formatter for log messages. Defaults to writing
52
+ # one event per line. See
53
+ # http://www.ruby-doc.org/stdlib-1.9.3/libdoc/logger/rdoc/Logger/Formatter.html
54
+ def initialize(log_path, formatter=nil)
55
+ @logger = Logger.new(log_path)
56
+ @logger.formatter = formatter || lambda {|severity, time, program, msg| msg}
57
+ end
58
+
59
+ # Publishes an event to the log.
60
+ def publish(event_type, timestamp, queue, metadata, klass, *args)
61
+ logger.info(Clues.event_marshaller.call(
62
+ event_type, timestamp, queue, metadata, klass, args))
63
+ end
64
+ end
65
+
66
+ # A composite event publisher that groups several child publishers so
67
+ # that events received are delegated to each of the children for
68
+ # further processing.
69
+ class CompositePublisher < SimpleDelegator
70
+ def initialize
71
+ super([])
72
+ end
73
+
74
+ # Invokes publish on each child publisher for them to publish the event
75
+ # in their own way.
76
+ def publish(event_type, timestamp, queue, metadata, klass, *args)
77
+ each do |child|
78
+ child.publish(
79
+ event_type, timestamp, queue, metadata, klass, *args
80
+ ) rescue error(event_type, child)
81
+ end
82
+ end
83
+
84
+ private
85
+ def error(event_type, child)
86
+ p "Error processing #{event_type} in #{child}"
87
+ end
88
+ end
89
+ end
90
+ end
91
+ end
@@ -0,0 +1,65 @@
1
+ require 'digest/md5'
2
+ require 'time'
3
+
4
+ module Resque
5
+ module Plugins
6
+ module Clues
7
+ # Module capable of redefining the Job#perform and Job#fail methods so
8
+ # that they publish perform_started, perform_finished and failed events.
9
+ module JobExtension
10
+ # Invoked when this module is included by a class. Will redefine the
11
+ # perform and fail methods on that class.
12
+ #
13
+ # klass:: The klass including this module.
14
+ def self.included(klass)
15
+ define_perform(klass)
16
+ define_failed(klass)
17
+ end
18
+
19
+ private
20
+ # (Re)defines the perform method so that it will broadcast a
21
+ # perform_started event, invoke the original perform method, and
22
+ # then broadcast a perform_finished event if no exceptions are
23
+ # encountered. The time to perform is calculated and included in
24
+ # the metadata of the perform_finished event.
25
+ #
26
+ # klass:: The class to define the perform method on.
27
+ def self.define_perform(klass) # :doc:
28
+ klass.send(:define_method, :_clues_perform) do
29
+ if Clues.configured? and payload['clues_metadata']
30
+ Clues.event_publisher.publish(:perform_started, Clues.now, queue, payload['clues_metadata'], payload['class'], *payload['args'])
31
+ @perform_started = Time.now
32
+ _base_perform.tap do
33
+ payload['clues_metadata']['time_to_perform'] = Clues.time_delta_since(@perform_started)
34
+ Clues.event_publisher.publish(:perform_finished, Clues.now, queue, payload['clues_metadata'], payload['class'], *payload['args'])
35
+ end
36
+ else
37
+ _base_perform
38
+ end
39
+ end
40
+ end
41
+
42
+ # (Re)defines the fail method so that it will add time to perform,
43
+ # exception, error message and backtrace data to the job's payload
44
+ # metadata, then broadcast a failed event including that information.
45
+ #
46
+ # klass:: The class to define the fail method on.
47
+ #
48
+ def self.define_failed(klass) # :doc:
49
+ klass.send(:define_method, :_clues_fail) do |exception|
50
+ _base_fail(exception).tap do
51
+ metadata = payload['clues_metadata']
52
+ if Clues.configured? and metadata
53
+ metadata['time_to_perform'] = Clues.time_delta_since(@perform_started)
54
+ metadata['exception'] = exception.class
55
+ metadata['message'] = exception.message
56
+ metadata['backtrace'] = exception.backtrace
57
+ Clues.event_publisher.publish(:failed, Clues.now, queue, metadata, payload['class'], *payload['args'])
58
+ end
59
+ end
60
+ end
61
+ end
62
+ end
63
+ end
64
+ end
65
+ end