qless 0.9.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (43)
  1. data/Gemfile +8 -0
  2. data/HISTORY.md +168 -0
  3. data/README.md +571 -0
  4. data/Rakefile +28 -0
  5. data/bin/qless-campfire +106 -0
  6. data/bin/qless-growl +99 -0
  7. data/bin/qless-web +23 -0
  8. data/lib/qless.rb +185 -0
  9. data/lib/qless/config.rb +31 -0
  10. data/lib/qless/job.rb +259 -0
  11. data/lib/qless/job_reservers/ordered.rb +23 -0
  12. data/lib/qless/job_reservers/round_robin.rb +34 -0
  13. data/lib/qless/lua.rb +25 -0
  14. data/lib/qless/qless-core/cancel.lua +71 -0
  15. data/lib/qless/qless-core/complete.lua +218 -0
  16. data/lib/qless/qless-core/config.lua +44 -0
  17. data/lib/qless/qless-core/depends.lua +65 -0
  18. data/lib/qless/qless-core/fail.lua +107 -0
  19. data/lib/qless/qless-core/failed.lua +83 -0
  20. data/lib/qless/qless-core/get.lua +37 -0
  21. data/lib/qless/qless-core/heartbeat.lua +50 -0
  22. data/lib/qless/qless-core/jobs.lua +41 -0
  23. data/lib/qless/qless-core/peek.lua +155 -0
  24. data/lib/qless/qless-core/pop.lua +278 -0
  25. data/lib/qless/qless-core/priority.lua +32 -0
  26. data/lib/qless/qless-core/put.lua +156 -0
  27. data/lib/qless/qless-core/queues.lua +58 -0
  28. data/lib/qless/qless-core/recur.lua +181 -0
  29. data/lib/qless/qless-core/retry.lua +73 -0
  30. data/lib/qless/qless-core/ruby/lib/qless-core.rb +1 -0
  31. data/lib/qless/qless-core/ruby/lib/qless/core.rb +13 -0
  32. data/lib/qless/qless-core/ruby/lib/qless/core/version.rb +5 -0
  33. data/lib/qless/qless-core/ruby/spec/qless_core_spec.rb +13 -0
  34. data/lib/qless/qless-core/stats.lua +92 -0
  35. data/lib/qless/qless-core/tag.lua +100 -0
  36. data/lib/qless/qless-core/track.lua +79 -0
  37. data/lib/qless/qless-core/workers.lua +69 -0
  38. data/lib/qless/queue.rb +141 -0
  39. data/lib/qless/server.rb +411 -0
  40. data/lib/qless/tasks.rb +10 -0
  41. data/lib/qless/version.rb +3 -0
  42. data/lib/qless/worker.rb +195 -0
  43. metadata +239 -0
data/Gemfile ADDED
@@ -0,0 +1,8 @@
source "http://rubygems.org"

# Specify your gem's dependencies in qless.gemspec
gemspec

group :development do
  gem 'debugger', :platform => :mri
end
data/HISTORY.md ADDED
@@ -0,0 +1,168 @@
qless
=====

My hope for qless is that it will make certain aspects of pipeline management easier. For
the moment, this is a stream-of-consciousness document meant to capture the features that
have been occurring to me lately. After these, I have some initial thoughts on the
implementation, concluding with the outstanding __questions__ I have.

I welcome input on any of this.

Context
-------

This is a subject that has been on my mind in particular in three contexts:

1. `custom crawl` -- queue management has always been an annoyance, and it's reaching the
   breaking point for me
1. `freshscape` -- I'm going to be encountering very similar problems in freshscape, and
   I'd like to be able to avoid some of the difficulties I've encountered.
1. `general` -- There are a lot of contexts in which such a system would be useful.
   __Update__ Myron pointed out that in fact `resque` is built on a simple protocol,
   where each job is a JSON blob with two keys: `id` and `args`. That makes me feel
   like this is on the right track!

Feature Requests
----------------

Some of the features that I'm really after include:

1. __Jobs should not get dropped on the floor__ -- This has been a problem for certain
   projects, including our custom crawler. In this case, jobs had a propensity for
   getting lost in the shuffle.
1. __Job stats should be available__ -- It would be nice to be able to track summary statistics
   in one place. Perhaps about the number currently in each stage, waiting for each stage,
   time spent in each stage, number of retries, etc.
1. __Job movement should be atomic__ -- One of the problems we've encountered with using
   Redis is that it's been hard to keep items moving from one queue to another in an atomic
   way. This has the unfortunate effect of making it difficult to trust the queues to hold
   any real meaning. For example, the queues use both a list and a hash to track items, and
   the lengths of the two often get out of sync.
1. __Retry logic__ -- For this, I believe we need the ability to support some automatic
   retry logic. This should be configurable, and based on the stage.
1. __Data lookups should be easy__ -- It's been difficult to quickly identify a work item and
   get information on its state. We've usually had to rely on external storage for this.
1. __Manual requeuing__ -- We should be able to safely and atomically move items from one
   queue into another. We've had problems with race conditions in the past.
1. __Priority__ -- Jobs should be describable with priority as well. On occasion we've had
   to push items through more quickly than others, and it would be great if the underlying
   system supported that.
1. __Tagging / Tracking__ -- It would be nice to be able to mark certain jobs with tags, and
   as work items that we'd like to track. It should then be possible to get a summary along
   the lines of "This is the state of the jobs you said you were interested in." I have a
   system for this set up for myself personally, and it has been _extremely_ useful.
1. __The system should be reliable and highly available__ -- We're trusting this system to
   be a single point of failure, and as such it needs to be robust and dependable.
1. __High Performance__ -- We should be able to expect this system to support a large number
   of jobs in a short amount of time. For some context, we need the custom crawler to support
   about 50k state transitions in a day, but my goal is to support millions of transitions
   in a day, and my bonus goal is 10 million or so transitions in a day.
1. __Scheduled Work__ -- We should be able to schedule work items to be enqueued at some
   specified time.
1. __UI__ -- It would be nice to have a useful interface providing insight into the state of
   the pipeline(s).
1. __Namespaces__ -- It might be nice to be able to segment the jobs into namespaces based on
   project, stage, type, etc. It shouldn't have any explicit meaning outside of partitioning
   the work space.
1. __Language Agnosticism__ -- The lingua franca for this should be something supported by a
   large number of languages, and the interface should likewise be supported by a large number
   of languages. In this way, I'd like it to be possible that a job is handled by one language
   in one stage, and conceivably another in a different stage.
1. __Clients Should be Easy to Write__ -- I don't want to put too much burden on the authors
   of various clients, because I think this helps a project to gain adoption. But the
   out-of-the-box feature set should be compelling.

Thoughts / Recommendations
--------------------------

1. `Redis` as the storage engine. It's been heavily battle-tested, it's highly available,
   and it supports many of the data structures we'd need for these features (atomicity,
   priority, robustness, performance, replication, good at saving state). To boot, it's
   widely available.
1. `JSON` as the lingua franca for communication of work units. Every language I've encountered
   has strong support for it, it's expressive, and it's human readable.

Until recently, I had been imagining an HTTP server sitting in front of Redis, mostly because
I figured that would be one way to make clients easy to write -- if all the logic lives
in the server. That said, it's just a second moving part upon which to rely. And Myron
made the very compelling case for having the clients maintain the state and rely solely on
Redis. However, Redis 2.6 provides us with a way to get the best of both worlds -- clients
that are easy to write and yet smart enough to do everything themselves. This mechanism is
stored Lua scripts.

Redis has very tight integration with the scripting language `Lua`, and the two major selling
points for us on this point are:

1. Atomicity _and_ performance -- Lua scripts are guaranteed to be the only thing running
   on the Redis instance, and so it makes certain locking mechanisms significantly easier.
   And since it's running on the actual redis instance, we can enjoy great performance.
1. All clients can use the same scripts -- Lua scripts for redis are loaded into the instance,
   and then can be identified by a hash. But the language invoking them is irrelevant. As such,
   the burden can still be placed on the original implementation; clients can be easy to
   write, yet still be smart enough to manage the queues themselves.

One other added benefit is that when using Lua, redis imports a C implementation of a JSON
parser and makes it available from Lua scripts. This is just icing on the cake.
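
To make that mechanism concrete, here is a rough sketch (not part of qless itself; the
script, key name, and redis-rb calls are only for illustration) of loading a script once
and then invoking it by its SHA:

``` ruby
require 'redis'
require 'json'

redis = Redis.new

# A toy script: atomically push a JSON-encoded job onto a list and return its id.
script = <<-LUA
  local queue, job = KEYS[1], ARGV[1]
  redis.call('rpush', queue, job)
  return cjson.decode(job)['id']
LUA

# Load it once; any client, in any language, can now invoke it by this SHA.
sha = redis.script(:load, script)

redis.evalsha(sha, :keys => ['example:queue'],
                   :argv => [JSON.generate('id' => 'abc', 'args' => {})])
# => "abc"
```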

Planned Features / Organization
===============================

All the smarts are essentially going to go into a collection of Lua scripts to be stored and
run on Redis. In addition to these Lua scripts, I'd like to provide a simple web interface
to provide pretty access to some of this functionality.

Round 1
-------

1. __Workers must heartbeat jobs__ -- When a worker is given a job, it is given an exclusive
   lock, and no other worker will get that job so long as it continues to heartbeat. The
   service keeps track of which locks are going to expire, and will give the work to another
   worker if the original worker fails to check in. The expiry time is provided every time
   work is given to a worker, and an updated time is provided when heartbeating. If the
   lock has been given to another worker, the heartbeat will return `false`.
1. __Stats A-Plenty__ -- Stats will be kept of when a job was enqueued for a stage, when it
   was popped off to be worked on, and when it was completed. In addition, summary statistics
   will be kept for all the stages.
1. __Job Data Stored Temporarily__ -- The data for each job will be stored temporarily. It's
   yet to be determined exactly what the expiration policy will be (either the last _k_
   jobs or the last _x_ amount of time), but still, all the data about a job will be available.
1. __Atomic Requeueing__ -- If a work item is moved from one queue to another, it is moved.
   If a worker is in the middle of processing it, its heartbeat will not be renewed, and it
   will not be allowed to complete.
1. __Scheduling / Priority__ -- Jobs can be scheduled to become active at a certain time.
   This does not mean that the job will be worked on at that time, though. It simply means
   that after a given scheduled time, it will be considered a candidate, and it will still
   be subject to the normal priority rules. Priority is always given to an active job with
   the lowest priority score.
1. __Tracking__ -- Jobs can be marked as being of particular interest, and their progress
   can be tracked accordingly.
1. __Web App__ -- A simple web app would be nice.

Round 2
-------

1. __Retry logic__ -- For this, I believe we need the ability to support some automatic
   retry logic. This should be configurable, and based on the stage.
1. __Tagging__ -- Jobs should be able to be tagged with certain meaningful flags, like a
   version number for the software used to process it, or the workers used to process it.

Questions
=========

1. __Implicit Queue Creation__ -- Each queue needs some configuration, like the heartbeat rate,
   the time to live for a job, etc. And not only that, but there might be additional, more
   complicated configuration (flow of control). So, which of these should be supported and
   which not?

   1. Static queue definition -- at a very minimum, we should be able to configure some ahead of time
   1. Dynamic queue creation -- should there just be another endpoint that allows queues to be added?
      If so, should these queues then be saved to persist?
   1. Implicit queue creation -- if we push to a non-existent queue, should we get a warning?
      An error? Should the queue just be created with some sort of default values?

   On the one hand, I would like to make the system very flexible and amenable to ad-hoc
   queues, but on the other hand, there may be non-default-able configuration values
   for queues.

1. __Job Data Storage__ -- How long should we keep the data about jobs around? We'd like to be
   able to get information about a job, but that data should probably expire eventually. Should
   the expiration policy be to hold jobs for a certain amount of time? Or should the window
   simply be the last _k_ jobs?
data/README.md ADDED
@@ -0,0 +1,571 @@
qless
=====

Qless is a powerful `Redis`-based job queueing system inspired by
[resque](https://github.com/defunkt/resque#readme),
but built on a collection of Lua scripts, maintained in the
[qless-core](https://github.com/seomoz/qless-core) repo.

Philosophy and Nomenclature
===========================
A `job` is a unit of work identified by a job id or `jid`. A `queue` can contain
several jobs that are scheduled to be run at a certain time, several jobs that are
waiting to run, and jobs that are currently running. A `worker` is a process on a
host, identified uniquely, that asks for jobs from the queue, performs some process
associated with that job, and then marks it as complete. When it's completed, it
can be put into another queue.

Jobs can only be in one queue at a time. That queue is whatever queue they were last
put in. So if a worker is working on a job, and you move it, the worker's request to
complete the job will be ignored.

A job can be `canceled`, which means it disappears into the ether, and we'll never
pay it any mind ever again. A job can be `dropped`, which is when a worker fails
to heartbeat or complete the job in a timely fashion, or a job can be `failed`,
which is when a host recognizes some systematically problematic state about the
job. A worker should only fail a job if the error is likely not a transient one;
otherwise, that worker should just drop it and let the system reclaim it.

Features
========

1. __Jobs don't get dropped on the floor__ -- Sometimes workers drop jobs. Qless
   automatically picks them back up and gives them to another worker.
1. __Tagging / Tracking__ -- Some jobs are more interesting than others. Track those
   jobs to get updates on their progress. Tag jobs with meaningful identifiers to
   find them quickly in the UI.
1. __Job Dependencies__ -- One job might need to wait for another job to complete.
1. __Stats__ -- `qless` automatically keeps statistics about how long jobs wait
   to be processed and how long they take to be processed. Currently, we keep
   track of the count, mean, standard deviation, and a histogram of these times.
1. __Job data is stored temporarily__ -- Job info sticks around for a configurable
   amount of time so you can still look back on a job's history, data, etc.
1. __Priority__ -- Jobs with the same priority get popped in the order they were
   inserted; a higher priority means that a job gets popped sooner.
1. __Retry logic__ -- Every job has a number of retries associated with it, which are
   renewed when it is put into a new queue or completed. If a job is repeatedly
   dropped, then it is presumed to be problematic, and is automatically failed.
1. __Web App__ -- With the advent of a Ruby client, there is a Sinatra-based web
   app that gives you control over certain operational issues.
1. __Scheduled Work__ -- Jobs can be put with a specified delay (default 0); until
   that delay has elapsed, they cannot be popped by workers.
1. __Recurring Jobs__ -- Scheduling's all well and good, but we also support
   jobs that need to recur periodically.
1. __Notifications__ -- Tracked jobs emit events on pubsub channels as they get
   completed, failed, put, popped, etc. Use these events to get notified of
   progress on jobs you're interested in.

Enqueueing Jobs
===============
First things first, require `qless` and create a client. The client accepts all the
same arguments that you'd use when constructing a redis client.

``` ruby
require 'qless'

# Connect to localhost
client = Qless::Client.new
# Connect to somewhere else
client = Qless::Client.new(:host => 'foo.bar.com', :port => 1234)
```

Jobs should be classes or modules that define a `perform` method, which
must accept a single `job` argument:

``` ruby
class MyJobClass
  def self.perform(job)
    # job is an instance of `Qless::Job` and provides access to
    # job.data, a means to cancel the job (job.cancel), and more.
  end
end
```

Now you can access a queue, and add a job to that queue.

``` ruby
# This references a new or existing queue 'testing'
queue = client.queues['testing']
# Let's add a job, with some data. Returns the job id.
queue.put(MyJobClass, :hello => 'howdy')
# => "0c53b0404c56012f69fa482a1427ab7d"
# Now we can ask for a job
job = queue.pop
# => <Qless::Job 0c53b0404c56012f69fa482a1427ab7d (MyJobClass / testing)>
# And we can do the work associated with it!
job.perform
```

The job data must be serializable to JSON, and it is recommended
that you use a hash for it. See below for a list of the supported job options.

The argument returned by `queue.put` is the job ID, or jid. Every Qless
job has a unique jid, and it provides a means to interact with an
existing job:

``` ruby
# find an existing job by its jid
job = client.jobs[jid]

# Query it to find out details about it:
job.klass            # => the class of the job
job.queue            # => the queue the job is in
job.data             # => the data for the job
job.history          # => the history of what has happened to the job so far
job.dependencies     # => the jids of other jobs that must complete before this one
job.dependents       # => the jids of other jobs that depend on this one
job.priority         # => the priority of this job
job.tags             # => array of tags for this job
job.original_retries # => the number of times the job is allowed to be retried
job.retries_left     # => the number of retries left

# You can also change the job in various ways:
job.move("some_other_queue") # move it to a new queue
job.cancel                   # cancel the job
job.tag("foo")               # add a tag
job.untag("foo")             # remove a tag
```

Running A Worker
================

The Qless ruby worker was heavily inspired by Resque's worker,
but thanks to the power of the qless-core lua scripts, it is
*much* simpler and you are welcome to write your own (e.g. if
you'd rather save memory by not forking the worker for each job).

As with resque...

* The worker forks a child process for each job in order to provide
  resilience against memory leaks. Pass the `RUN_AS_SINGLE_PROCESS`
  environment variable to force Qless to not fork the child process.
  Single process mode should only be used in some test/dev
  environments.
* The worker updates its procline with its status so you can see
  what workers are doing using `ps`.
* The worker registers signal handlers so that you can control it
  by sending it signals.
* The worker is given a list of queues to pop jobs off of.
* The worker logs output based on the `VERBOSE` or `VVERBOSE` (very
  verbose) environment variables.
* Qless ships with a rake task (`qless:work`) for running workers.
  It runs `qless:setup` before starting the main work loop so that
  users can load their environment in that task.
* The sleep interval (for when there are no jobs available) can be
  configured with the `INTERVAL` environment variable.

Resque uses queues for its notion of priority. In contrast, qless
has priority support built in. Thus, the worker supports two strategies
for what order to pop jobs off the queues: ordered and round-robin.
The ordered reserver will keep popping jobs off the first queue until
it is empty, before trying to pop jobs off the second queue. The
round-robin reserver will pop a job off the first queue, then the second
queue, and so on. You could also easily implement your own; a rough
sketch follows.
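
To illustrate, here is a minimal sketch of a custom reserver. It assumes the same duck
type that the bundled `Ordered` and `RoundRobin` reservers appear to use -- a `#reserve`
method that returns the next job (or `nil`) and a `#description` used for logging -- so
check `lib/qless/job_reservers/` for the actual interface before relying on it:

``` ruby
# A hypothetical "shuffled" reserver: it tries the queues in a random order
# on each reservation, so no single queue can starve the others.
module Qless
  module JobReservers
    class Shuffled
      attr_reader :queues

      def initialize(queues)
        @queues = queues
      end

      # Return the next job to work on, or nil if every queue is empty.
      def reserve
        @queues.shuffle.each do |queue|
          job = queue.pop
          return job if job
        end
        nil
      end

      def description
        @description ||= @queues.map(&:name).join(', ') + " (shuffled)"
      end
    end
  end
end
```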

To start a worker, load the qless rake tasks in your Rakefile, and
define a `qless:setup` task:

``` ruby
require 'qless/tasks'
namespace :qless do
  task :setup do
    require 'my_app/environment' # to ensure all job classes are loaded

    # Set options via environment variables
    # The only required option is QUEUES; the
    # rest have reasonable defaults.
    ENV['REDIS_URL'] ||= 'redis://some-host:7000/3'
    ENV['QUEUES'] ||= 'fizz,buzz'
    ENV['JOB_RESERVER'] ||= 'Ordered'
    ENV['INTERVAL'] ||= '10' # 10 seconds
    ENV['VERBOSE'] ||= 'true'
  end
end
```

Then run the `qless:work` rake task:

```
rake qless:work
```

The following signals are supported:

* TERM: Shutdown immediately, stop processing jobs.
* INT: Shutdown immediately, stop processing jobs.
* QUIT: Shutdown after the current job has finished processing.
* USR1: Kill the forked child immediately, continue processing jobs.
* USR2: Don't process any new jobs.
* CONT: Start processing jobs again after a USR2.

You should send these to the master process, not the child.
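For example, assuming you've looked up the master's pid (say, with `ps`, where the
procline makes it easy to spot), you could pause and resume job processing like this:

``` ruby
master_pid = 12345 # hypothetical pid, taken from `ps`

Process.kill('USR2', master_pid) # stop picking up new jobs
# ... run a deploy, do maintenance, etc. ...
Process.kill('CONT', master_pid) # resume processing jobs
```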

Workers also support middleware modules that can be used to inject
logic before, after, or around the processing of a single job in
the child process. This can be useful, for example, when you need to
re-establish a connection to your database in each job.

Define a module with an `around_perform` method that calls `super` where you
want the job to be processed:

``` ruby
module ReEstablishDBConnection
  def around_perform(job)
    MyORM.establish_connection
    super
  end
end
```

Then mix it into the worker class. You can mix in as many
middleware modules as you like:

``` ruby
require 'qless/worker'
Qless::Worker.class_eval do
  include ReEstablishDBConnection
  include SomeOtherAwesomeMiddleware
end
```

Web Interface
=============

Qless ships with a resque-inspired web app that lets you easily
deal with failures and see what it is processing. If your project
has a rack-based ruby web app, we recommend you mount Qless's web app
in it. Here's how you can do that with `Rack::Builder` in your `config.ru`:

``` ruby
Qless::Server.client = Qless::Client.new(:host => "some-host", :port => 7000)

Rack::Builder.new do
  use SomeMiddleware

  map('/some-other-app') { run Apps::Something.new }
  map('/qless')          { run Qless::Server.new }
end
```

For an app using Rails 3+, check the router documentation for how to mount
rack apps.

Job Dependencies
================
Let's say you have one job that depends on another, but the task definitions are
fundamentally different. You need to bake a turkey, and you need to make stuffing,
but you can't make the turkey until the stuffing is made:

``` ruby
queue = client.queues['cook']
stuffing_jid = queue.put(MakeStuffing, {:lots => 'of butter'})
turkey_jid   = queue.put(MakeTurkey, {:with => 'stuffing'}, :depends => [stuffing_jid])
```

When the stuffing job completes, the turkey job is unlocked and free to be processed.

Priority
========
Some jobs need to get popped sooner than others. Whether it's a trouble ticket or
a debugging task, you can do this pretty easily when you put a job in a queue:

``` ruby
queue.put(MyJobClass, {:foo => 'bar'}, :priority => 10)
```

What happens when you want to adjust a job's priority while it's still waiting in
a queue?

``` ruby
job = client.jobs['0c53b0404c56012f69fa482a1427ab7d']
job.priority = 10
# Now this will get popped before any job of lower priority
```

Scheduled Jobs
==============
If you don't want a job to be run right away but some time in the future, you can
specify a delay:

``` ruby
# Run at least 10 minutes from now
queue.put(MyJobClass, {:foo => 'bar'}, :delay => 600)
```

This doesn't guarantee that the job will be run exactly 10 minutes from now. You can
approximate that by also raising the job's priority, so that once the 10 minutes have
elapsed, it's put before lower-priority jobs:

``` ruby
# Run in 10 minutes
queue.put(MyJobClass, {:foo => 'bar'}, :delay => 600, :priority => 100)
```

Recurring Jobs
==============
Sometimes it's not enough simply to schedule one job; you want to run jobs regularly.
In particular, maybe you have some batch operation that needs to get run once an hour and
you don't care what worker runs it. Recurring jobs are specified much like other jobs:

``` ruby
# Run every hour
queue.recur(MyJobClass, {:widget => 'warble'}, 3600)
# => 22ac75008a8011e182b24cf9ab3a8f3b
```

You can even access them in much the same way as you would normal jobs:

``` ruby
job = client.jobs['22ac75008a8011e182b24cf9ab3a8f3b']
# => < Qless::RecurringJob 22ac75008a8011e182b24cf9ab3a8f3b >
```

Changing the interval at which it runs after the fact is trivial:

``` ruby
# I think I only need it to run once every two hours
job.interval = 7200
```

If you want it to run every hour on the hour, but it's 2:37 right now, you can specify
an offset, which is how long it should wait before popping the first job:

``` ruby
# 23 minutes of waiting until it should go
queue.recur(MyJobClass, {:howdy => 'hello'}, 3600, :offset => 23 * 60)
```

Recurring jobs also have priority, a configurable number of retries, and tags. These
settings don't apply to the recurring jobs themselves, but rather to the jobs that they
create. In the case where more than one interval passes before a worker tries to pop the
job, __more than one job is created__. The thinking is that while it's completely
client-managed, the state should not be dependent on how often workers are trying to pop
jobs.

``` ruby
# Recur every minute
queue.recur(MyJobClass, {:lots => 'of jobs'}, 60)
# Wait 5 minutes
queue.pop(10).length
# => 5 jobs got popped
```

Configuration Options
=====================
You can get and set global (read: in the context of the same Redis instance) configuration
to change the behavior for heartbeating, and so forth. There aren't a tremendous number
of configuration options, but an important one is how long job data is kept around. Job
data is expired after it has been completed for `jobs-history` seconds, but is limited to
the last `jobs-history-count` completed jobs. These default to 50k jobs and 30 days, but
depending on volume, your needs may change. To only keep the last 500 jobs for up to 7 days:

``` ruby
client.config['jobs-history'] = 7 * 86400
client.config['jobs-history-count'] = 500
```

Tagging / Tracking
==================
In qless, 'tracking' means flagging a job as important. Tracked jobs have a tab reserved
for them in the web interface, and they also emit subscribable events as they make progress
(more on that below). You can flag a job from the web interface, or with the corresponding
code:

``` ruby
client.jobs['b1882e009a3d11e192d0b174d751779d'].track
```

Jobs can be tagged with strings, which are indexed for quick searches. For example, jobs
might be associated with customer accounts, or some other key that makes sense for your
project.

``` ruby
queue.put(MyJobClass, {:tags => 'aplenty'}, :tags => ['12345', 'foo', 'bar'])
```

This makes them searchable in the web interface, or from code:

``` ruby
jids = client.jobs.tagged('foo')
```

You can add or remove tags at will, too:

``` ruby
job = client.jobs['b1882e009a3d11e192d0b174d751779d']
job.tag('howdy', 'hello')
job.untag('foo', 'bar')
```

Notifications
=============
Tracked jobs emit events on specific pubsub channels as things happen to them, whether
it's getting popped off of a queue, completed by a worker, etc. A good example of how
to make use of this is in the `qless-campfire` or `qless-growl` scripts. The gist of it
goes like this, though:

``` ruby
client.events do |on|
  on.canceled  { |jid| puts "#{jid} canceled"   }
  on.stalled   { |jid| puts "#{jid} stalled"    }
  on.track     { |jid| puts "tracking #{jid}"   }
  on.untrack   { |jid| puts "untracking #{jid}" }
  on.completed { |jid| puts "#{jid} completed"  }
  on.failed    { |jid| puts "#{jid} failed"     }
  on.popped    { |jid| puts "#{jid} popped"     }
  on.put       { |jid| puts "#{jid} put"        }
end
```

Those familiar with redis pubsub will note that a redis connection can only be used
for pubsub-y commands once it's listening. For this reason, invoking `client.events`
actually creates a second connection so that `client` can still be used as it normally
would be:

``` ruby
client.events do |on|
  on.failed do |jid|
    puts "#{jid} failed in #{client.jobs[jid].queue_name}"
  end
end
```

Heartbeating
============
When a worker is given a job, it is given an exclusive lock to that job. That means
that the job won't be given to any other worker, so long as the worker checks in with
progress on the job. By default, a worker has to either report back progress every 60
seconds or complete the job, but that's a configurable option. For longer jobs, this
may not make sense.

``` ruby
# Hooray! We've got a piece of work!
job = queue.pop
# How long until I have to check in?
job.ttl
# => 59
# Hey! I'm still working on it!
job.heartbeat
# => 1331326141.0
# Ok, I've got some more time. Oh! Now I'm done!
job.complete
```

If you want to increase the heartbeat in all queues,

``` ruby
# Now jobs get 10 minutes to check in
client.config['heartbeat'] = 600
# But the testing queue doesn't get as long.
client.queues['testing'].heartbeat = 300
```

When choosing a heartbeat interval, realize that this is the amount of time that
can pass before qless realizes that a job has been dropped. At the same time, you don't
want to burden qless with heartbeating every 10 seconds if your job is expected to
take several hours.

An idiom you're encouraged to use for long-running jobs that want to check in their
progress periodically:

``` ruby
# Wait until we have 5 minutes left on the heartbeat, and if we find that
# we've lost our lock on the job, then honorably fall on our sword
if (job.ttl < 300) && !job.heartbeat
  exit # or return / raise -- we no longer hold the lock, so stop working
end
```

Stats
=====
One nice feature of `qless` is that you can get statistics about usage. Stats are
aggregated by day, so when you want stats about a queue, you need to say what queue
and what day you're talking about. By default, you just get the stats for today.
These stats include information about the mean job wait time, standard deviation,
and histogram. This same data is also provided for job completion:

``` ruby
# So, how're we doing today?
stats = client.stats.get('testing')
# => { 'run' => {'mean' => ..., }, 'wait' => {'mean' => ..., }}
```

Time
====
It's important to note that Redis doesn't allow access to the system time if you're
going to be making any manipulations to data (which our scripts do). And yet, we
have heartbeating. This means that the clients actually send the current time when
making most requests, and, for consistency's sake, it means that your workers must be
relatively synchronized. This doesn't mean down to the tens of milliseconds, but if
you're experiencing appreciable clock drift, you should investigate NTP. For what it's
worth, this hasn't been a problem for us, but most of our jobs have heartbeat intervals
of 30 minutes or more.

Ensuring Job Uniqueness
=======================

As mentioned above, jobs are uniquely identified by an id -- their jid.
Qless will generate a UUID for each enqueued job, or you can specify
one manually:

``` ruby
queue.put(MyJobClass, { :hello => 'howdy' }, :jid => 'my-job-jid')
```

This can be useful when you want to ensure a job's uniqueness: simply
create a jid that is a function of the job's class and data, and you're
guaranteed that Qless won't have multiple jobs with the same class
and data.
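
For example, here is one way you might derive such a jid. The helper below is just an
illustration (it's not part of Qless, and it assumes a flat data hash):

``` ruby
require 'digest/sha1'
require 'json'

# Hypothetical helper: a deterministic jid built from the job class and its data.
# Sorting the keys keeps the digest stable regardless of hash ordering.
def unique_jid(klass, data)
  Digest::SHA1.hexdigest("#{klass.name}:#{JSON.generate(Hash[data.sort])}")
end

data = { :hello => 'howdy' }
queue.put(MyJobClass, data, :jid => unique_jid(MyJobClass, data))
```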

Setting Default Job Options
===========================

`Qless::Queue#put` accepts a number of job options (see above for their
semantics):

* jid
* delay
* priority
* tags
* retries
* depends

When enqueueing the same kind of job with the same args in multiple
places, it's a pain to have to declare the job options every time.
Instead, you can define default job options directly on the job class:

``` ruby
class MyJobClass
  def self.default_job_options(data)
    { :priority => 10, :delay => 100 }
  end
end

queue.put(MyJobClass, { :some => "data" }, :delay => 10)
```

Individual jobs can still specify options, so in this example,
the job would be enqueued with a priority of 10 and a delay of 10.

Testing Jobs
============
When unit testing your jobs, you will probably want to avoid the
overhead of round-tripping them through redis. You can of course
use a mock job object and pass it to your job class's `perform`
method. Alternatively, if you want a real full-fledged `Qless::Job`
instance without round-tripping it through Redis, use `Qless::Job.build`:

``` ruby
describe MyJobClass do
  let(:client) { Qless::Client.new }
  let(:job)    { Qless::Job.build(client, MyJobClass, :data => { "some" => "data" }) }

  it 'does something' do
    MyJobClass.perform(job)
    # make an assertion about what happened
  end
end
```

The options hash passed to `Qless::Job.build` supports all the same
options a normal job supports. See
[the source](https://github.com/seomoz/qless/blob/master/lib/qless/job.rb)
for a full list.