qless 0.9.1

Files changed (43)
  1. data/Gemfile +8 -0
  2. data/HISTORY.md +168 -0
  3. data/README.md +571 -0
  4. data/Rakefile +28 -0
  5. data/bin/qless-campfire +106 -0
  6. data/bin/qless-growl +99 -0
  7. data/bin/qless-web +23 -0
  8. data/lib/qless.rb +185 -0
  9. data/lib/qless/config.rb +31 -0
  10. data/lib/qless/job.rb +259 -0
  11. data/lib/qless/job_reservers/ordered.rb +23 -0
  12. data/lib/qless/job_reservers/round_robin.rb +34 -0
  13. data/lib/qless/lua.rb +25 -0
  14. data/lib/qless/qless-core/cancel.lua +71 -0
  15. data/lib/qless/qless-core/complete.lua +218 -0
  16. data/lib/qless/qless-core/config.lua +44 -0
  17. data/lib/qless/qless-core/depends.lua +65 -0
  18. data/lib/qless/qless-core/fail.lua +107 -0
  19. data/lib/qless/qless-core/failed.lua +83 -0
  20. data/lib/qless/qless-core/get.lua +37 -0
  21. data/lib/qless/qless-core/heartbeat.lua +50 -0
  22. data/lib/qless/qless-core/jobs.lua +41 -0
  23. data/lib/qless/qless-core/peek.lua +155 -0
  24. data/lib/qless/qless-core/pop.lua +278 -0
  25. data/lib/qless/qless-core/priority.lua +32 -0
  26. data/lib/qless/qless-core/put.lua +156 -0
  27. data/lib/qless/qless-core/queues.lua +58 -0
  28. data/lib/qless/qless-core/recur.lua +181 -0
  29. data/lib/qless/qless-core/retry.lua +73 -0
  30. data/lib/qless/qless-core/ruby/lib/qless-core.rb +1 -0
  31. data/lib/qless/qless-core/ruby/lib/qless/core.rb +13 -0
  32. data/lib/qless/qless-core/ruby/lib/qless/core/version.rb +5 -0
  33. data/lib/qless/qless-core/ruby/spec/qless_core_spec.rb +13 -0
  34. data/lib/qless/qless-core/stats.lua +92 -0
  35. data/lib/qless/qless-core/tag.lua +100 -0
  36. data/lib/qless/qless-core/track.lua +79 -0
  37. data/lib/qless/qless-core/workers.lua +69 -0
  38. data/lib/qless/queue.rb +141 -0
  39. data/lib/qless/server.rb +411 -0
  40. data/lib/qless/tasks.rb +10 -0
  41. data/lib/qless/version.rb +3 -0
  42. data/lib/qless/worker.rb +195 -0
  43. metadata +239 -0
data/Gemfile ADDED
@@ -0,0 +1,8 @@
source "http://rubygems.org"

# Specify your gem's dependencies in qless.gemspec
gemspec

group :development do
  gem 'debugger', :platform => :mri
end
data/HISTORY.md ADDED
@@ -0,0 +1,168 @@
qless
=====

My hope for qless is that it will make certain aspects of pipeline management
easier. For the moment, this is a stream of consciousness document meant to capture the
features that have been occurring to me lately. After these, I have some initial thoughts
on the implementation, concluding with the outstanding __questions__ I have.

I welcome input on any of this.

Context
-------

This is a subject that has been on my mind in particular in three contexts:

1. `custom crawl` -- queue management has always been an annoyance, and it's reaching the
   breaking point for me
1. `freshscape` -- I'm going to be encountering problems very similar to these in freshscape,
   and I'd like to be able to avoid some of the difficulties I've encountered.
1. `general` -- There are a lot of contexts in which such a system would be useful.
   __Update__ Myron pointed out that in fact `resque` is built on a simple protocol,
   where each job is a JSON blob with two keys: `id` and `args`. That makes me feel
   like this is on the right track!

Feature Requests
----------------

Some of the features that I'm really after include:

1. __Jobs should not get dropped on the floor__ -- This has been a problem for certain
   projects, including our custom crawler. In this case, jobs had a propensity for
   getting lost in the shuffle.
1. __Job stats should be available__ -- It would be nice to be able to track summary statistics
   in one place. Perhaps about the number currently in each stage, waiting for each stage,
   time spent in each stage, number of retries, etc.
1. __Job movement should be atomic__ -- One of the problems we've encountered with using
   Redis is that it's been hard to keep items moving from one queue to another in an atomic
   way. This has the unfortunate effect of making it difficult to trust the queues to hold
   any real meaning. For example, the queues use both a list and a hash to track items, and
   the lengths of the two often get out of sync.
1. __Retry logic__ -- For this, I believe we need the ability to support some automatic
   retry logic. This should be configurable, and based on the stage.
1. __Data lookups should be easy__ -- It's been difficult to quickly identify a work item and
   get information on its state. We've usually had to rely on external storage for this.
1. __Manual requeuing__ -- We should be able to safely and atomically move items from one
   queue into another. We've had problems with race conditions in the past.
1. __Priority__ -- Jobs should be describable with priority as well. On occasion we've had
   to push items through more quickly than others, and it would be great if the underlying
   system supported that.
1. __Tagging / Tracking__ -- It would be nice to be able to mark certain jobs with tags, and
   as work items that we'd like to track. It should then be possible to get a summary along
   the lines of "This is the state of the jobs you said you were interested in." I have a
   system like this set up for myself, and it has been _extremely_ useful.
1. __The system should be reliable and highly available__ -- We're trusting this system to
   be a single point of failure, and as such it needs to be robust and dependable.
1. __High Performance__ -- We should be able to expect this system to support a large number
   of jobs in a short amount of time. For some context, we need the custom crawler to support
   about 50k state transitions in a day, but my goal is to support millions of transitions
   in a day, and my bonus goal is 10 million or so transitions in a day.
1. __Scheduled Work__ -- We should be able to schedule work items to be enqueued at some
   specified time.
1. __UI__ -- It would be nice to have a useful interface providing insight into the state of
   the pipeline(s).
1. __Namespaces__ -- It might be nice to be able to segment the jobs into namespaces based on
   project, stage, type, etc. It shouldn't have any explicit meaning outside of partitioning
   the work space.
1. __Language Agnosticism__ -- The lingua franca for this should be something supported by a
   large number of languages, and the interface should likewise be supported by a large number
   of languages. In this way, I'd like it to be possible that a job is handled by one language
   in one stage, and conceivably another in a different stage.
1. __Clients Should be Easy to Write__ -- I don't want to put too much burden on the authors
   of various clients, because I think this helps a project to gain adoption. But the
   out-of-the-box feature set should be compelling.

Thoughts / Recommendations
--------------------------

1. `Redis` as the storage engine. It's been heavily battle-tested, it's highly available,
   and it supports many of the data structures and properties we'd need for these features
   (atomicity, priority, robustness, performance, replication, good at saving state). To boot,
   it's widely available.
1. `JSON` as the lingua franca for communication of work units. Every language I've encountered
   has strong support for it, it's expressive, and it's human readable.

Until recently, I had been imagining an HTTP server sitting in front of Redis, mostly because
I figured that would be one way to make clients easy to write -- if all the logic is pinned
up in the server. That said, it's just a second moving part upon which to rely. And Myron
made the very compelling case for having the clients maintain the state and rely solely on
Redis. However, Redis 2.6 provides us with a way to get the best of both worlds -- clients
that are easy to write and yet smart enough to do everything themselves. This mechanism is
stored Lua scripts.

Redis has very tight integration with the scripting language `Lua`, and the two major selling
points for us on this point are:

1. Atomicity _and_ performance -- Lua scripts are guaranteed to be the only thing running
   on the Redis instance, and so certain locking mechanisms become significantly easier.
   And since they run on the actual Redis instance, we can enjoy great performance.
1. All clients can use the same scripts -- Lua scripts for Redis are loaded into the instance
   and can then be identified by a hash; the language invoking them is irrelevant. As such,
   the burden can still be placed on the original implementation: clients can be easy to
   write, but still be smart enough to manage the queues themselves (see the sketch below).

One other added benefit is that when using Lua, Redis imports a C implementation of a JSON
parser and makes it available from Lua scripts. This is just icing on the cake.
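
As a concrete (if simplified) illustration of that load-once, call-by-hash pattern, here is
a sketch from a Ruby client's point of view. It assumes a redis-rb version that exposes
`script` and `evalsha`; the tiny Lua body and the key name are made up for illustration and
are not qless's actual scripts or keys.

``` ruby
require 'redis'

redis = Redis.new

# An illustrative script: atomically pop one jid off a queue's list.
pop_script = <<-LUA
  return redis.call('lpop', KEYS[1])
LUA

# Load it once; Redis stores it under its SHA1 digest, which any client
# (in any language) can then reuse.
sha = redis.script(:load, pop_script)

# Invoke the stored script by hash rather than re-sending its source.
jid = redis.evalsha(sha, :keys => ['example:work-queue'])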

Planned Features / Organization
===============================

All the smarts are essentially going to go into a collection of Lua scripts to be stored and
run on Redis. In addition to these Lua scripts, I'd like to provide a simple web interface
to provide pretty access to some of this functionality.

Round 1
-------

1. __Workers must heartbeat jobs__ -- When a worker is given a job, it is given an exclusive
   lock, and no other worker will get that job so long as it continues to heartbeat. The
   service keeps track of which locks are going to expire, and will give the work to another
   worker if the original worker fails to check in. The expiry time is provided every time
   work is given to a worker, and an updated time is provided when heartbeating. If the
   lock has been given to another worker, the heartbeat will return `false`.
1. __Stats A-Plenty__ -- Stats will be kept of when a job was enqueued for a stage, when it
   was popped off to be worked on, and when it was completed. In addition, summary statistics
   will be kept for all the stages.
1. __Job Data Stored Temporarily__ -- The data for each job will be stored temporarily. It's
   yet to be determined exactly what the expiration policy will be (either the last _k_
   jobs or the last _x_ amount of time). But until then, all the data about a job will be
   available.
1. __Atomic Requeueing__ -- If a work item is moved from one queue to another, the move is
   atomic. If a worker is in the middle of processing it, its heartbeat will not be renewed,
   and it will not be allowed to complete.
1. __Scheduling / Priority__ -- Jobs can be scheduled to become active at a certain time.
   This does not mean that the job will be worked on at that time, though. It simply means
   that after a given scheduled time, it will be considered a candidate, and it will still
   be subject to the normal priority rules. Priority is always given to an active job with
   the lowest priority score.
1. __Tracking__ -- Jobs can be marked as being of particular interest, and their progress
   can be tracked accordingly.
1. __Web App__ -- A simple web app would be nice.

Round 2
-------

1. __Retry logic__ -- For this, I believe we need the ability to support some automatic
   retry logic. This should be configurable, and based on the stage.
1. __Tagging__ -- Jobs should be able to be tagged with certain meaningful flags, like a
   version number for the software used to process it, or the workers used to process it.

Questions
=========

1. __Implicit Queue Creation__ -- Each queue needs some configuration, like the heartbeat rate,
   the time to live for a job, etc. And not only that, but there might be additional, more
   complicated configuration (flow of control). So, which of these should be supported and
   which not?

   1. Static queue definition -- at a very minimum, we should be able to configure some ahead of time
   1. Dynamic queue creation -- should there just be another endpoint that allows queues to be added?
      If so, should these queues then be saved to persist?
   1. Implicit queue creation -- if we push to a non-existent queue, should we get a warning?
      An error? Should the queue just be created with some sort of default values?

   On the one hand, I would like to make the system very flexible and amenable to sort of
   ad-hoc queues, but on the other hand, there may be non-default-able configuration values
   for queues.

1. __Job Data Storage__ -- How long should we keep the data about jobs around? We'd like to be
   able to get information about a job, but that data should probably be expired eventually.
   Should the expiration policy hold jobs for a certain amount of time, or should the window
   simply be the last _k_ jobs?
data/README.md ADDED
@@ -0,0 +1,571 @@
qless
=====

Qless is a powerful `Redis`-based job queueing system inspired by
[resque](https://github.com/defunkt/resque#readme),
but built on a collection of Lua scripts, maintained in the
[qless-core](https://github.com/seomoz/qless-core) repo.

Philosophy and Nomenclature
===========================
A `job` is a unit of work identified by a job id or `jid`. A `queue` can contain
several jobs that are scheduled to be run at a certain time, several jobs that are
waiting to run, and jobs that are currently running. A `worker` is a process on a
host, identified uniquely, that asks for jobs from the queue, performs some process
associated with that job, and then marks it as complete. When it's completed, it
can be put into another queue.

Jobs can only be in one queue at a time. That queue is whatever queue they were last
put in. So if a worker is working on a job, and you move it, the worker's request to
complete the job will be ignored.

A job can be `canceled`, which means it disappears into the ether, and we'll never
pay it any mind ever again. A job can be `dropped`, which is when a worker fails
to heartbeat or complete the job in a timely fashion, or a job can be `failed`,
which is when a host recognizes some systematically problematic state about the
job. A worker should only fail a job if the error is likely not a transient one;
otherwise, that worker should just drop it and let the system reclaim it.
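
As a sketch of that policy, a `perform` method might fail the job explicitly only for
systematic problems and let transient errors leave the job to be reclaimed. The account
lookup below is hypothetical application code; `job.fail` takes a failure group and a
message:

``` ruby
class ProcessAccount
  def self.perform(job)
    account = Account.find_by_id(job.data['account_id']) # hypothetical app code
    if account.nil?
      # A systematic problem: retrying won't help, so fail the job explicitly
      # with a failure group and a human-readable message.
      job.fail('process-account:missing-account',
               "no account with id #{job.data['account_id']}")
      return
    end

    # Transient trouble (a network blip, say)? Don't fail the job; just let it
    # go unfinished, and qless will hand it to another worker once the
    # heartbeat lock expires.
    account.refresh!
  end
end
```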

Features
========

1. __Jobs don't get dropped on the floor__ -- Sometimes workers drop jobs. Qless
   automatically picks them back up and gives them to another worker.
1. __Tagging / Tracking__ -- Some jobs are more interesting than others. Track those
   jobs to get updates on their progress. Tag jobs with meaningful identifiers to
   find them quickly in the UI.
1. __Job Dependencies__ -- One job might need to wait for another job to complete.
1. __Stats__ -- `qless` automatically keeps statistics about how long jobs wait
   to be processed and how long they take to be processed. Currently, we keep
   track of the count, mean, standard deviation, and a histogram of these times.
1. __Job data is stored temporarily__ -- Job info sticks around for a configurable
   amount of time so you can still look back on a job's history, data, etc.
1. __Priority__ -- Jobs with the same priority get popped in the order they were
   inserted; a higher priority means that it gets popped faster.
1. __Retry logic__ -- Every job has a number of retries associated with it, which are
   renewed when it is put into a new queue or completed. If a job is repeatedly
   dropped, then it is presumed to be problematic, and is automatically failed.
1. __Web App__ -- With the advent of a Ruby client, there is a Sinatra-based web
   app that gives you control over certain operational issues.
1. __Scheduled Work__ -- Until a job's specified delay (which defaults to 0) has
   elapsed, it cannot be popped by workers.
1. __Recurring Jobs__ -- Scheduling's all well and good, but we also support
   jobs that need to recur periodically.
1. __Notifications__ -- Tracked jobs emit events on pubsub channels as they get
   completed, failed, put, popped, etc. Use these events to get notified of
   progress on jobs you're interested in.

Enqueuing Jobs
==============
First things first, require `qless` and create a client. The client accepts all the
same arguments that you'd use when constructing a redis client.

``` ruby
require 'qless'

# Connect to localhost
client = Qless::Client.new
# Connect to somewhere else
client = Qless::Client.new(:host => 'foo.bar.com', :port => 1234)
```

Jobs should be classes or modules that define a `perform` method, which
must accept a single `job` argument:

``` ruby
class MyJobClass
  def self.perform(job)
    # job is an instance of `Qless::Job` and provides access to
    # job.data, a means to cancel the job (job.cancel), and more.
  end
end
```

Now you can access a queue, and add a job to that queue.

``` ruby
# This references a new or existing queue 'testing'
queue = client.queues['testing']
# Let's add a job, with some data. Returns the job id
queue.put(MyJobClass, :hello => 'howdy')
# => "0c53b0404c56012f69fa482a1427ab7d"
# Now we can ask for a job
job = queue.pop
# => <Qless::Job 0c53b0404c56012f69fa482a1427ab7d (MyJobClass / testing)>
# And we can do the work associated with it!
job.perform
```

The job data must be serializable to JSON, and it is recommended
that you use a hash for it. See below for a list of the supported job options.

The argument returned by `queue.put` is the job ID, or jid. Every Qless
job has a unique jid, and it provides a means to interact with an
existing job:

``` ruby
# find an existing job by its jid
job = client.jobs[jid]

# Query it to find out details about it:
job.klass            # => the class of the job
job.queue            # => the queue the job is in
job.data             # => the data for the job
job.history          # => the history of what has happened to the job so far
job.dependencies     # => the jids of other jobs that must complete before this one
job.dependents       # => the jids of other jobs that depend on this one
job.priority         # => the priority of this job
job.tags             # => array of tags for this job
job.original_retries # => the number of times the job is allowed to be retried
job.retries_left     # => the number of retries left

# You can also change the job in various ways:
job.move("some_other_queue") # move it to a new queue
job.cancel                   # cancel the job
job.tag("foo")               # add a tag
job.untag("foo")             # remove a tag
```

Running A Worker
================

The Qless ruby worker was heavily inspired by Resque's worker,
but thanks to the power of the qless-core lua scripts, it is
*much* simpler and you are welcome to write your own (e.g. if
you'd rather save memory by not forking the worker for each job).

As with resque...

* The worker forks a child process for each job in order to provide
  resilience against memory leaks. Pass the `RUN_AS_SINGLE_PROCESS`
  environment variable to force Qless to not fork the child process.
  Single process mode should only be used in some test/dev
  environments.
* The worker updates its procline with its status so you can see
  what workers are doing using `ps`.
* The worker registers signal handlers so that you can control it
  by sending it signals.
* The worker is given a list of queues to pop jobs off of.
* The worker logs output based on the `VERBOSE` or `VVERBOSE` (very
  verbose) environment variables.
* Qless ships with a rake task (`qless:work`) for running workers.
  It runs `qless:setup` before starting the main work loop so that
  users can load their environment in that task.
* The sleep interval (for when no jobs are available) can be
  configured with the `INTERVAL` environment variable.

Resque uses queues for its notion of priority. In contrast, qless
has priority support built in. Thus, the worker supports two strategies
for deciding what order to pop jobs off the queues: ordered and round-robin.
The ordered reserver will keep popping jobs off the first queue until
it is empty, before trying to pop jobs off the second queue. The
round-robin reserver will pop a job off the first queue, then the second
queue, and so on. You could also easily implement your own; see the
sketch below.
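
Here is a minimal sketch of what a custom reserver might look like. It assumes the same
small interface the bundled reservers expose (compare `lib/qless/job_reservers/ordered.rb`):
a `queues` reader, a `reserve` method that returns a job or `nil`, and a `description`
string. Double-check that contract against your qless version.

``` ruby
module Qless
  module JobReservers
    # A hypothetical reserver that tries the queues in a random order on
    # each pass, so no single queue can starve the others.
    class Shuffled
      attr_reader :queues

      def initialize(queues)
        @queues = queues
      end

      def reserve
        @queues.shuffle.each do |queue|
          job = queue.pop
          return job if job
        end
        nil
      end

      def description
        @description ||= @queues.map(&:name).join(', ') + ' (shuffled)'
      end
    end
  end
end
```

Defining it under `Qless::JobReservers` should let it be selected the same way as the
built-ins (e.g. `JOB_RESERVER=Shuffled`), assuming the worker resolves that environment
variable to a constant in this namespace.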

To start a worker, load the qless rake tasks in your Rakefile, and
define a `qless:setup` task:

``` ruby
require 'qless/tasks'
namespace :qless do
  task :setup do
    require 'my_app/environment' # to ensure all job classes are loaded

    # Set options via environment variables
    # The only required option is QUEUES; the
    # rest have reasonable defaults.
    ENV['REDIS_URL'] ||= 'redis://some-host:7000/3'
    ENV['QUEUES'] ||= 'fizz,buzz'
    ENV['JOB_RESERVER'] ||= 'Ordered'
    ENV['INTERVAL'] ||= '10' # 10 seconds
    ENV['VERBOSE'] ||= 'true'
  end
end
```

Then run the `qless:work` rake task:

```
rake qless:work
```

The following signals are supported:

* TERM: Shutdown immediately, stop processing jobs.
* INT: Shutdown immediately, stop processing jobs.
* QUIT: Shutdown after the current job has finished processing.
* USR1: Kill the forked child immediately, continue processing jobs.
* USR2: Don't process any new jobs.
* CONT: Start processing jobs again after a USR2.

You should send these to the master process, not the child.

Workers also support middleware modules that can be used to inject
logic before, after or around the processing of a single job in
the child process. This can be useful, for example, when you need to
re-establish a connection to your database in each job.

Define a module with an `around_perform` method that calls `super` where you
want the job to be processed:

``` ruby
module ReEstablishDBConnection
  def around_perform(job)
    MyORM.establish_connection
    super
  end
end
```

Then, mix it into the worker class. You can mix in as many
middleware modules as you like:

``` ruby
require 'qless/worker'
Qless::Worker.class_eval do
  include ReEstablishDBConnection
  include SomeOtherAwesomeMiddleware
end
```

Web Interface
=============

Qless ships with a resque-inspired web app that lets you easily
deal with failures and see what it is processing. If your project
has a rack-based ruby web app, we recommend you mount Qless's web app
in it. Here's how you can do that with `Rack::Builder` in your `config.ru`:

``` ruby
Qless::Server.client = Qless::Client.new(:host => "some-host", :port => 7000)

Rack::Builder.new do
  use SomeMiddleware

  map('/some-other-app') { run Apps::Something.new }
  map('/qless')          { run Qless::Server.new }
end
```

For an app using Rails 3+, check the router documentation for how to mount
rack apps.
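
For example, assuming `Qless::Server.client` has been configured as above, a sketch of
mounting it from `config/routes.rb` might look like this (`MyApp` is a placeholder for
your application's name):

``` ruby
# config/routes.rb
MyApp::Application.routes.draw do
  # Qless::Server is a rack (Sinatra) app, so it can be mounted directly.
  mount Qless::Server.new => '/qless'
end
```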

Job Dependencies
================
Let's say you have one job that depends on another, but the task definitions are
fundamentally different. You need to bake a turkey, and you need to make stuffing,
but you can't make the turkey until the stuffing is made:

``` ruby
queue = client.queues['cook']
stuffing_jid = queue.put(MakeStuffing, {:lots => 'of butter'})
turkey_jid   = queue.put(MakeTurkey  , {:with => 'stuffing'}, :depends => [stuffing_jid])
```

When the stuffing job completes, the turkey job is unlocked and free to be processed.

Priority
========
Some jobs need to get popped sooner than others. Whether it's a trouble ticket, or
debugging, you can do this pretty easily when you put a job in a queue:

``` ruby
queue.put(MyJobClass, {:foo => 'bar'}, :priority => 10)
```

What happens when you want to adjust a job's priority while it's still waiting in
a queue?

``` ruby
job = client.jobs['0c53b0404c56012f69fa482a1427ab7d']
job.priority = 10
# Now this will get popped before any job of lower priority
```

Scheduled Jobs
==============
If you don't want a job to be run right away but some time in the future, you can
specify a delay:

``` ruby
# Run at least 10 minutes from now
queue.put(MyJobClass, {:foo => 'bar'}, :delay => 600)
```

This doesn't guarantee that the job will be run after exactly 10 minutes; the delay is
only the earliest the job can be popped. You can get close to that behavior by also
raising the job's priority, so that once the 10 minutes have elapsed it's put before
lower-priority jobs:

``` ruby
# Run in 10 minutes
queue.put(MyJobClass, {:foo => 'bar'}, :delay => 600, :priority => 100)
```

Recurring Jobs
==============
Sometimes it's not enough simply to schedule one job, but you want to run jobs regularly.
In particular, maybe you have some batch operation that needs to get run once an hour and
you don't care what worker runs it. Recurring jobs are specified much like other jobs:

``` ruby
# Run every hour
queue.recur(MyJobClass, {:widget => 'warble'}, 3600)
# => 22ac75008a8011e182b24cf9ab3a8f3b
```

You can even access them in much the same way as you would normal jobs:

``` ruby
job = client.jobs['22ac75008a8011e182b24cf9ab3a8f3b']
# => < Qless::RecurringJob 22ac75008a8011e182b24cf9ab3a8f3b >
```

Changing the interval at which it runs after the fact is trivial:

``` ruby
# I think I only need it to run once every two hours
job.interval = 7200
```

If you want it to run every hour on the hour, but it's 2:37 right now, you can specify
an offset, which is how long it should wait before popping the first job:

``` ruby
# 23 minutes of waiting until it should go
queue.recur(MyJobClass, {:howdy => 'hello'}, 3600, :offset => 23 * 60)
```

Recurring jobs also have priority, a configurable number of retries, and tags. These
settings don't apply to the recurring job itself, but rather to the jobs that it creates.
In the case where more than one interval passes before a worker tries to pop the job,
__more than one job is created__. The thinking is that while the system is completely
client-managed, the state should not depend on how often workers are trying to pop jobs.

``` ruby
# Recur every minute
queue.recur(MyJobClass, {:lots => 'of jobs'}, 60)
# Wait 5 minutes
queue.pop(10).length
# => 5 jobs got popped
```

Configuration Options
=====================
You can get and set global (read: in the context of the same Redis instance) configuration
to change the behavior for heartbeating, and so forth. There aren't a tremendous number
of configuration options, but an important one is how long job data is kept around. Job
data is expired after it has been completed for `jobs-history` seconds, but is limited to
the last `jobs-history-count` completed jobs. These default to 30 days and 50k jobs, but
depending on volume, your needs may change. To only keep the last 500 jobs for up to 7 days:

``` ruby
client.config['jobs-history'] = 7 * 86400
client.config['jobs-history-count'] = 500
```

Tagging / Tracking
==================
In qless, 'tracking' means flagging a job as important. Tracked jobs have a tab reserved
for them in the web interface, and they also emit subscribable events as they make progress
(more on that below). You can flag a job from the web interface, or the corresponding code:

``` ruby
client.jobs['b1882e009a3d11e192d0b174d751779d'].track
```

Jobs can be tagged with strings which are indexed for quick searches. For example, jobs
might be associated with customer accounts, or some other key that makes sense for your
project.

``` ruby
queue.put(MyJobClass, {:tags => 'aplenty'}, :tags => ['12345', 'foo', 'bar'])
```

This makes them searchable in the web interface, or from code:

``` ruby
jids = client.jobs.tagged('foo')
```

You can add or remove tags at will, too:

``` ruby
job = client.jobs['b1882e009a3d11e192d0b174d751779d']
job.tag('howdy', 'hello')
job.untag('foo', 'bar')
```

Notifications
=============
Tracked jobs emit events on specific pubsub channels as things happen to them, whether
it's getting popped off of a queue, completed by a worker, etc. A good example of how
to make use of this is in `qless-campfire` or `qless-growl`. The gist of it goes like
this, though:

``` ruby
client.events do |on|
  on.canceled  { |jid| puts "#{jid} canceled"   }
  on.stalled   { |jid| puts "#{jid} stalled"    }
  on.track     { |jid| puts "tracking #{jid}"   }
  on.untrack   { |jid| puts "untracking #{jid}" }
  on.completed { |jid| puts "#{jid} completed"  }
  on.failed    { |jid| puts "#{jid} failed"     }
  on.popped    { |jid| puts "#{jid} popped"     }
  on.put       { |jid| puts "#{jid} put"        }
end
```

Those familiar with redis pubsub will note that a redis connection can only be used
for pubsub-y commands once it starts listening. For this reason, invoking `client.events`
actually creates a second connection so that `client` can still be used as it normally
would be:

``` ruby
client.events do |on|
  on.failed do |jid|
    puts "#{jid} failed in #{client.jobs[jid].queue_name}"
  end
end
```

Heartbeating
============
When a worker is given a job, it is given an exclusive lock to that job. That means
that job won't be given to any other worker, so long as the worker checks in with
progress on the job. By default, jobs have to either report back progress every 60
seconds, or complete within that time, but that's a configurable option. For longer
jobs, this may not make sense.

``` ruby
# Hooray! We've got a piece of work!
job = queue.pop
# How long until I have to check in?
job.ttl
# => 59
# Hey! I'm still working on it!
job.heartbeat
# => 1331326141.0
# Ok, I've got some more time. Oh! Now I'm done!
job.complete
```

If you want to increase the heartbeat in all queues,

``` ruby
# Now jobs get 10 minutes to check in
client.config['heartbeat'] = 600
# But the testing queue doesn't get as long.
client.queues['testing'].heartbeat = 300
```

When choosing a heartbeat interval, realize that this is the amount of time that
can pass before qless realizes a job has been dropped. At the same time, you don't
want to burden qless with heartbeating every 10 seconds if your job is expected to
take several hours.

An idiom you're encouraged to use for long-running jobs that want to check in their
progress periodically:

``` ruby
# Wait until we have 5 minutes left on the heartbeat, and if we find that
# we've lost our lock on the job, then honorably fall on our sword
if (job.ttl < 300) && !job.heartbeat
  return / die / exit
end
```

Stats
=====
One nice feature of `qless` is that you can get statistics about usage. Stats are
aggregated by day, so when you want stats about a queue, you need to say what queue
and what day you're talking about. By default, you just get the stats for today.
These stats include information about the mean job wait time, standard deviation,
and histogram. This same data is also provided for job completion:

``` ruby
# So, how're we doing today?
stats = client.stats.get('testing')
# => { 'run' => {'mean' => ..., }, 'wait' => {'mean' => ..., }}
```

Time
====
It's important to note that Redis doesn't allow access to the system time if you're
going to be making any manipulations to data (which our scripts do). And yet, we
have heartbeating. This means that the clients actually send the current time when
making most requests, and for consistency's sake, it means that your workers must be
relatively synchronized. This doesn't mean down to the tens of milliseconds, but if
you're experiencing appreciable clock drift, you should investigate NTP. For what it's
worth, this hasn't been a problem for us, but most of our jobs have heartbeat intervals
of 30 minutes or more.

Ensuring Job Uniqueness
=======================

As mentioned above, jobs are uniquely identified by an id -- their jid.
Qless will generate a UUID for each enqueued job, or you can specify
one manually:

``` ruby
queue.put(MyJobClass, { :hello => 'howdy' }, :jid => 'my-job-jid')
```

This can be useful when you want to ensure a job's uniqueness: simply
create a jid that is a function of the job's class and data, and you're
guaranteed that Qless won't have multiple jobs with the same class
and data.
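
For example (a sketch, not an official helper), you could derive the jid from a digest of
the class name and the JSON-encoded data, assuming you build the data hash with its keys
in a consistent order:

``` ruby
require 'digest/sha1'
require 'json'

data = { :account_id => 1234, :action => 'refresh' }
# Any stable digest of class + data works; SHA1 of the JSON is one option.
jid  = Digest::SHA1.hexdigest("MyJobClass:#{JSON.generate(data)}")

queue.put(MyJobClass, data, :jid => jid)
# Enqueuing the same class and data again produces the same jid, so the two
# puts refer to a single job rather than two duplicates.
```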

Setting Default Job Options
===========================

`Qless::Queue#put` accepts a number of job options (see above for their
semantics):

* jid
* delay
* priority
* tags
* retries
* depends

When enqueueing the same kind of job with the same args in multiple
places it's a pain to have to declare the job options every time.
Instead, you can define default job options directly on the job class:

``` ruby
class MyJobClass
  def self.default_job_options(data)
    { :priority => 10, :delay => 100 }
  end
end

queue.put(MyJobClass, { :some => "data" }, :delay => 10)
```

Individual jobs can still specify options, so in this example,
the job would be enqueued with a priority of 10 and a delay of 10.

Testing Jobs
============
When unit testing your jobs, you will probably want to avoid the
overhead of round-tripping them through redis. You can of course
use a mock job object and pass it to your job class's `perform`
method. Alternately, if you want a real full-fledged `Qless::Job`
instance without round-tripping it through Redis, use `Qless::Job.build`:

``` ruby
describe MyJobClass do
  let(:client) { Qless::Client.new }
  let(:job)    { Qless::Job.build(client, MyJobClass, :data => { "some" => "data" }) }

  it 'does something' do
    MyJobClass.perform(job)
    # make an assertion about what happened
  end
end
```

The options hash passed to `Qless::Job.build` supports all the same
options a normal job supports. See
[the source](https://github.com/seomoz/qless/blob/master/lib/qless/job.rb)
for a full list.